Blazegraph (by SYSTAP) / BLZG-661

Front-coding scheme should eliminate a globally shared prefix before delta coding.


    Details

    • Type: New Feature
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BIGDATA_RELEASE_1_2_0
    • Fix Version/s: None
    • Component/s: B+Tree

      Description

      The front-coded scheme is not providing enough compression. It repeats the full leading key every 8 tuples. That is going to be pretty FAT on the TERM2ID index for data such as chembl, where the keys share a very long URI prefix:

      <http://chem2bio2rdf.org/chembl/resource/chembl_compound_records/59299> <http://chem2bio2rdf.org/chembl/resource/doc_id> <http://chem2bio2rdf.org/chembl/resource/chembl_docs/9806> .
      <http://chem2bio2rdf.org/chembl/resource/chembl_compound_records/59300> <http://chem2bio2rdf.org/chembl/resource/doc_id> 
      

      It would be more efficient to find the prefix shared across ALL tuples in a leaf, factor it out once for the entire leaf, and THEN apply the delta (front) coding to the remaining suffixes.
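
      A rough sketch of the idea, not the Blazegraph implementation: the function names are hypothetical, raw ASCII URI bytes stand in for the real TERM2ID keys, the run of 32 record ids is made up for illustration, and per-key length headers are ignored. Because the keys in a leaf are sorted, the prefix shared by all of them is the prefix shared by the first and last key.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def front_coded_size(keys, ratio=8):
    """Key bytes stored under plain front coding.

    Every `ratio`-th key is stored whole; each other key stores only the
    suffix that follows the prefix it shares with the previous key.
    """
    total = 0
    for i, key in enumerate(keys):
        if i % ratio == 0:
            total += len(key)  # restart point: full key repeated
        else:
            total += len(key) - common_prefix_len(keys[i - 1], key)
    return total


def leaf_prefix_front_coded_size(keys, ratio=8):
    """Key bytes stored when the leaf-wide prefix is factored out first."""
    if not keys:
        return 0
    # Keys are sorted, so first vs. last gives the prefix shared by all.
    p = common_prefix_len(keys[0], keys[-1])
    suffixes = [k[p:] for k in keys]
    return p + front_coded_size(suffixes, ratio)


# Illustrative leaf: 32 consecutive record URIs (assumed data).
base = b"http://chem2bio2rdf.org/chembl/resource/chembl_compound_records/%d"
keys = sorted(base % n for n in range(59299, 59331))

print("plain front coding:      ", front_coded_size(keys))
print("leaf prefix + front code:", leaf_prefix_front_coded_size(keys))
```

      With the leaf-wide prefix stored once, each restart point only repeats the short suffix rather than the whole URI, which is where the saving comes from.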

      Also see [1].

      [1] https://sourceforge.net/apps/trac/bigdata/ticket/514 (PartlyInlineURIIV support)

            People

            Assignee: bryanthompson
            Reporter: bryanthompson