Blazegraph (by SYSTAP) / BLZG-629

PartlyInlineURIIV support

    Details

    • Type: New Feature
    • Status: In Progress
    • Resolution: Unresolved
    • Affects Version/s: BIGDATA_RELEASE_1_1_0
    • Fix Version/s: None
    • Component/s: Bigdata RDF Database
    • Labels:
      None

      Description

      Support namespace-based URI IVs in which the namespace IV is a TermId and the localName is an xsd:string. The class has been implemented (and I have also implemented a vocabulary-based version which is fully inline). However, the PartlyInlineURIIV is not yet integrated.
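      As a rough illustration of the idea (not the actual Blazegraph class), a partly inline URI can be thought of as a namespace identifier plus an inline local name. Everything in this sketch is hypothetical: the class name PartlyInlineUriSketch, the use of a long for the namespace TermId, and the plain Map standing in for the reverse lexicon lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a partly inline URI IV; not the Blazegraph class. */
final class PartlyInlineUriSketch {
    final long namespaceTermId; // assumption: a long stands in for the namespace TermId
    final String localName;     // the local name, carried inline as a string

    PartlyInlineUriSketch(long namespaceTermId, String localName) {
        this.namespaceTermId = namespaceTermId;
        this.localName = localName;
    }

    /** Rebuild the full URI; only the namespace requires a lookup. */
    String materialize(Map<Long, String> idToNamespace) {
        String namespace = idToNamespace.get(namespaceTermId);
        if (namespace == null) {
            throw new IllegalStateException("Unresolved namespace id: " + namespaceTermId);
        }
        return namespace + localName;
    }

    public static void main(String[] args) {
        // The map is a stand-in for resolving a TermId back to its namespace URI.
        Map<Long, String> idToNamespace = new HashMap<>();
        idToNamespace.put(7L, "http://www.w3.org/2000/01/rdf-schema#");
        PartlyInlineUriSketch iv = new PartlyInlineUriSketch(7L, "label");
        System.out.println(iv.materialize(idToNamespace));
    }
}
```

      The point of the split is that the local name never touches the lexicon; only the (typically much smaller) set of namespaces does.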

      The StatementBuffer would have to identify and resolve or create namespace URIs for ALL namespaces (since this needs to be a repeatable process) for URIs whose localName.length() is less than maxInlineTextLength. That would be one lexicon pass on TERM2ID. With that done, we can generate the full PartlyInlineURI with just the IV for the namespace.
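      The pass described above might look roughly like the following sketch. It is not StatementBuffer code: the class NamespaceResolutionSketch, the in-memory map standing in for TERM2ID, the sequential id assignment, and the rule of splitting a URI after its last '#' or '/' are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch of one resolve-or-create pass over buffered URIs. */
final class NamespaceResolutionSketch {
    private final Map<String, Long> term2id = new HashMap<>(); // stand-in for TERM2ID
    private long nextId = 1;                                   // hypothetical id assignment

    /** Assumed split rule: the local name starts just past the last '#' or '/'. */
    static int splitIndex(String uri) {
        return Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/')) + 1;
    }

    /** Resolve or create an id for every namespace whose local name is short enough to inline. */
    Map<String, Long> resolveNamespaces(List<String> bufferedUris, int maxInlineTextLength) {
        TreeSet<String> namespaces = new TreeSet<>(); // sorted, so ids are assigned in a repeatable order
        for (String uri : bufferedUris) {
            int i = splitIndex(uri);
            if (i > 0 && uri.length() - i < maxInlineTextLength) {
                namespaces.add(uri.substring(0, i));
            }
        }
        Map<String, Long> resolved = new LinkedHashMap<>();
        for (String namespace : namespaces) {
            Long id = term2id.get(namespace);
            if (id == null) {          // not yet in the lexicon: create it
                id = nextId++;
                term2id.put(namespace, id);
            }
            resolved.put(namespace, id);
        }
        return resolved;
    }

    public static void main(String[] args) {
        NamespaceResolutionSketch lexicon = new NamespaceResolutionSketch();
        List<String> uris = Arrays.asList(
                "http://www.Department0.University0.edu/UndergraduateStudent488",
                "http://www.Department0.University0.edu/UndergraduateStudent12");
        System.out.println(lexicon.resolveNamespaces(uris, 32));
    }
}
```

      Running the same buffer through the pass a second time returns the same ids, which is the repeatability requirement mentioned above.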

      For the Journal we only need to look at a few methods. We can always do the namespaceIV resolution against the TERM2ID index since it is a local index. The core methods on LexiconRelation might be newAccessPath() and getValue(final IV iv, final boolean readFromIndex).

      Scale-out: Materialization needs to go to the right place, namely the shard that holds the TermId of the namespaceIV. We only need one lookup, but we have to be on the correct shard. Scale-out will also need a modified bulk loader. This is definitely trickier, but we can test against the Journal first and see if it is worth supporting.

      My original trials with namespace-based URI encoding of IVs indicated a tremendous speedup. We are not realizing that speedup with just the FullyInlineURIIV because it requires us to pre-declare the namespace IVs. This is OK for fixed vocabularies which are later extended, but it does not do much for us in the open web or even open application context. Also, some URI generation schemes completely defeat the FullyInlineURIIV. For example, LUBM uses the following kind of pattern. Clearly this sort of pattern is amenable to the dynamic discovery of useful namespaces, but not to their static declaration.

      http://www.Department0.University0.edu/UndergraduateStudent488
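
      To make the "dynamic discovery" point concrete, here is a small sketch that groups URIs by the prefix before the local name. The class NamespaceDiscoverySketch, the split-after-last-'#'-or-'/' rule, and the second and third sample URIs are assumptions for illustration; only the first URI comes from this issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: discover candidate namespaces from the data itself. */
final class NamespaceDiscoverySketch {

    /** Assumed rule: the namespace is everything up to and including the last '#' or '/'. */
    static String namespaceOf(String uri) {
        int i = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/')) + 1;
        return uri.substring(0, i);
    }

    /** Count how many URIs fall under each discovered namespace. */
    static Map<String, Integer> countByNamespace(List<String> uris) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String uri : uris) {
            counts.merge(namespaceOf(uri), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> uris = Arrays.asList(
                "http://www.Department0.University0.edu/UndergraduateStudent488",
                "http://www.Department0.University0.edu/UndergraduateStudent489", // made-up sample
                "http://www.Department1.University0.edu/UndergraduateStudent1");  // made-up sample
        // Each department/university pair yields its own namespace, which is why
        // such namespaces cannot practically be declared ahead of time.
        countByNamespace(uris).forEach((ns, n) -> System.out.println(n + "  " + ns));
    }
}
```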
      

      Also see [1].

      [1] BLZG-661 (Front-coding schema should eliminate a globally shared prefix before delta coding)

              People

              Assignee:
              bryanthompson
              Reporter:
              bryanthompson