Blazegraph (by SYSTAP) / BLZG-478

Reduce SPARQL parser heap churn using custom CharStream impl.

    Details

    • Type: New Feature
    • Status: In Progress
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Bigdata SAIL
    • Labels:
      None

      Description

      Ticket [1] documents that the SPARQL parser drives heap churn because the CharStream implementation bundled with JavaCC is inefficient. Per [1], Arjohn has recommended writing a new version of CharStream. Also per [1], Lucene bundles a simplified version (FastCharStream), but we can do better still by implementing the CharStream API directly over a String rather than over a Reader, which avoids the double buffering that a Reader entails for tokens. The new CharStream implementation must support the same escape sequence mechanisms as the JavaCharStream implementation bundled with JavaCC; the Lucene FastCharStream does not support those escape sequences.
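
      As a rough illustration of the String-backed approach, here is a minimal sketch. It is not the planned implementation. The class name StringCharStream is hypothetical, and the class is written standalone so that it compiles on its own: a real version would declare that it implements the CharStream interface generated by JavaCC for the grammar. The method names mirror that interface. Line and column tracking and the escape sequence handling required above are deliberately left out to keep the sketch short.

```java
import java.io.IOException;

/**
 * Sketch of a CharStream-style reader backed directly by a String.
 * Assumption: a real version would "implement CharStream" against the
 * interface JavaCC generates; here the class stands alone.
 * Omitted: line/column tracking and unicode escape decoding.
 */
final class StringCharStream {
    private final String input; // the whole query; no intermediate buffer
    private int tokenStart = 0; // offset of the first char of the current token
    private int pos = 0;        // offset of the next char to read

    StringCharStream(String input) {
        this.input = input;
    }

    /** Returns the next character, or throws IOException at end of input. */
    char readChar() throws IOException {
        if (pos >= input.length()) {
            throw new IOException("end of input");
        }
        return input.charAt(pos++);
    }

    /** Marks the start of a new token and returns its first character. */
    char BeginToken() throws IOException {
        tokenStart = pos;
        return readChar();
    }

    /** Pushes back the last 'amount' characters so they are read again. */
    void backup(int amount) {
        pos -= amount;
    }

    /** Text of the current token, taken straight from the source String. */
    String GetImage() {
        return input.substring(tokenStart, pos);
    }

    /** Last 'len' characters read. */
    char[] GetSuffix(int len) {
        return input.substring(pos - len, pos).toCharArray();
    }

    /** Nothing to release: the stream owns no buffers of its own. */
    void Done() {
    }
}
```

      Because the token boundaries are plain offsets into the original String, the same offsets could also be used to report a slice of the query text, subject to the escape handling that this sketch omits.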

      [1] https://sourceforge.net/apps/trac/bigdata/ticket/336

        Activity

        bryanthompson added a comment:

        We will resolve this for Sesame 2.5.2 so that the platform can report the slice of the underlying SPARQL query String that corresponds to a SERVICE's graph pattern.

        bryanthompson added a comment:

        We have decided to address this issue separately for bigdata rather than as a fix incorporated into openrdf, because it is trivial for us to use a JavaCharStream pool, whereas doing so would cause API issues for openrdf. Also, there appears to be a much easier solution to capturing the SERVICE's graph pattern, which was discussed on the openrdf list.
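
        For illustration only, the pooling idea might look like the sketch below. The StreamPool class is hypothetical and generic so that it compiles on its own; in practice the pooled type would be the JavaCC-generated JavaCharStream, and the caller would reset each borrowed instance (for example through its ReInit(Reader) method) before handing it to the parser.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

/**
 * Hypothetical sketch of a simple object pool. Idle instances are kept in
 * a thread-safe queue and reused, so their internal buffers are not
 * reallocated for every parsed query.
 */
final class StreamPool<T> {
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    StreamPool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns an idle instance if one exists, otherwise creates a new one. */
    T acquire() {
        T stream = idle.poll();
        return stream != null ? stream : factory.get();
    }

    /** Hands an instance back so that a later acquire() can reuse it. */
    void release(T stream) {
        idle.offer(stream);
    }
}
```

        The pool is unbounded in this sketch; a real one would probably cap the number of idle instances it retains.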

        bryanthompson added a comment:

        I am dropping the priority on this item again. We will wait until after the 1.1 release to do this optimization.


          People

          • Assignee: bryanthompson
          • Reporter: bryanthompson
          • Votes: 0
          • Watchers: 0
