BLZG-1529 (Blazegraph by SYSTAP; parent issue: BLZG-641 Improve load performance)

Improve read/replace singleton property value

    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Medium
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Singleton is a bit of a misnomer, because the value could be a collection, but the general pattern logically looks like this:

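      // Replace the single value of property p on element s.
      void setValue(URI s, URI p, Value v) {
          setValues(s, p, v);
      }

      // Replace all values of property p on element s with vals.
      void setValues(URI s, URI p, Value... vals) {
          // Drops every existing (s, p, *) statement, including values about to be re-added.
          db.removeStatements(s, p, null);
          for (Value v : vals) {
              db.addStatement(s, p, v);
          }
      }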
      

      Here s identifies an element in the property graph, p is a property, and v/vals are the new values for that property, which replace any existing values. Consider the use case of updating a large number of elements of the same type (elements of the same type share a schema and therefore a property set): the value space for S will be large with poor locality (typically UUIDs), the value space for P will be small, and the value space for O will also be large and widely distributed.

      The code above was the first attempt at this pattern. It performed poorly and did a lot of removing and re-adding of the same value.
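
      To make the "removing and adding of the same value" cost concrete, here is a minimal sketch of a diff-based replace that only touches values that actually change. The DiffReplace class, its String-keyed in-memory map, and the adds/removes counters are all hypothetical stand-ins for illustration; none of this is Blazegraph's API, and it is not one of the approaches tried in this issue.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for a triple store; not Blazegraph's API. */
class DiffReplace {
    // (subject, predicate) -> current object values
    private final Map<String, Set<String>> store = new HashMap<>();
    public int adds = 0;    // counts statements actually written
    public int removes = 0; // counts statements actually deleted

    private static String key(String s, String p) {
        return s + "\u0000" + p;
    }

    /** Replace the values of (s, p), touching only the values that change. */
    public void setValues(String s, String p, String... vals) {
        Set<String> wanted = new LinkedHashSet<>(Arrays.asList(vals));
        Set<String> current = store.computeIfAbsent(key(s, p), k -> new LinkedHashSet<>());

        // Remove only the stale values.
        for (Iterator<String> it = current.iterator(); it.hasNext();) {
            if (!wanted.contains(it.next())) {
                it.remove();
                removes++;
            }
        }
        // Add only the values that are not already present.
        for (String v : wanted) {
            if (current.add(v)) {
                adds++;
            }
        }
    }

    public Set<String> getValues(String s, String p) {
        return store.getOrDefault(key(s, p), Collections.emptySet());
    }

    public static void main(String[] args) {
        DiffReplace db = new DiffReplace();
        db.setValues("urn:e1", "urn:name", "a", "b");
        db.setValues("urn:e1", "urn:name", "b", "c"); // "b" is left untouched
        System.out.println(db.getValues("urn:e1", "urn:name")); // [b, c]
        System.out.println(db.adds + " adds, " + db.removes + " removes"); // 3 adds, 1 removes
    }
}
```

      Note that the diff still has to read the current values for each (s, p) before writing, so a sketch like this reduces write churn but does nothing for the read locality problem described in this issue.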

      The second attempt was a listener plus SPARQL Update approach, but it failed badly because of our latency for running a single query. Blaze does very poorly when presented with a large number of queries that each do a small amount of work, because parsing, optimizing, and translating each query into a physical plan takes a long time, sometimes on the order of 500ms. At that rate, a thousand queries would spend more than eight minutes on that overhead alone.

      The current approach builds up a list of (SPOPredicate + IElementFilter) pairs over the course of the add process and runs them once all the adds are complete. These predicates scan each S and remove any P/O values that should no longer be there based on what was added. This approach has the benefit of letting all the adds happen before the read/removes, but I still don't get great locality on the read/removes, especially for OSP.
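
      The shape of the current approach (add everything first, then run the deferred removes) can be sketched roughly as follows. This is an in-memory illustration only: the DeferredCleanup class and its nested maps are hypothetical, and it does not use SPOPredicate, IElementFilter, or any other Blazegraph API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of "add first, clean up afterwards"; not Blazegraph's API. */
class DeferredCleanup {
    // subject -> predicate -> object values
    private final Map<String, Map<String, Set<String>>> store = new TreeMap<>();
    // values written during the current batch, keyed the same way
    private final Map<String, Map<String, Set<String>>> pending = new TreeMap<>();

    /** Add phase: write the new value and remember it; nothing is read or removed yet. */
    void add(String s, String p, String o) {
        values(store, s, p).add(o);
        values(pending, s, p).add(o);
    }

    /** Cleanup phase: for each touched (s, p), drop values this batch did not add. */
    int cleanup() {
        int removed = 0;
        for (Map.Entry<String, Map<String, Set<String>>> bySubject : pending.entrySet()) {
            for (Map.Entry<String, Set<String>> byPredicate : bySubject.getValue().entrySet()) {
                Set<String> current = values(store, bySubject.getKey(), byPredicate.getKey());
                int before = current.size();
                current.retainAll(byPredicate.getValue()); // keep only what this batch added
                removed += before - current.size();
            }
        }
        pending.clear();
        return removed;
    }

    Set<String> get(String s, String p) {
        return values(store, s, p);
    }

    private static Set<String> values(Map<String, Map<String, Set<String>>> m, String s, String p) {
        return m.computeIfAbsent(s, k -> new TreeMap<>()).computeIfAbsent(p, k -> new TreeSet<>());
    }

    public static void main(String[] args) {
        DeferredCleanup db = new DeferredCleanup();
        db.add("urn:e1", "urn:name", "old");
        db.cleanup();                                      // first batch: nothing stale yet
        db.add("urn:e1", "urn:name", "new");
        System.out.println(db.cleanup());                  // 1
        System.out.println(db.get("urn:e1", "urn:name"));  // [new]
    }
}
```

      In this sketch the cleanup pass is trivially cheap because everything lives in one map. In a real store each deferred remove is an index read, which is where the locality problem described above comes from.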

            People

            Assignee:
            mikepersonick
            Reporter:
            mikepersonick