  1. Blazegraph (by SYSTAP)
  2. BLZG-1163

Optimized variable projection into subqueries/subgroups



      In patterns like the one from Ticket BLZG-913

      PREFIX  rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX  xsd: <http://www.w3.org/2001/XMLSchema#>
      SELECT  ?ps ?p ?o
      WHERE {
        GRAPH <http://example.com/graph1> {
          ?ps ?p ?o .
          {
            SELECT ?ps
            WHERE {
              ?ps a <http://example.com/data/Person> .
            }
          }
        }
      }

      a hash index is set up for the binding sets produced by the outer triple pattern (?ps, ?p, ?o), for later reuse, and subsequently the variable ?ps is projected and flooded into the subquery. In the projection step, we need to apply an additional JVMDistinctBindingSetsOp to avoid duplicates (cf. ticket BLZG-913). However, this DISTINCT comes for free: the key of the hash index is (always?) defined by exactly those variables that are projected into the subquery, so the set of distinct projected bindings is already computed when setting up the hash index.
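
      To illustrate why the DISTINCT could come for free, here is a minimal Python sketch. It is not Blazegraph code: the functions build_hash_index and distinct_projection and the sample bindings are hypothetical, and solutions are modelled as plain dictionaries. It assumes the index key consists of exactly the projected variables, which is the open question marked "(always?)" above.

```python
from collections import defaultdict


def build_hash_index(solutions, join_vars):
    """Group solutions (dicts of variable -> value) by their join-variable values."""
    index = defaultdict(list)
    for sol in solutions:
        key = tuple(sol[v] for v in join_vars)
        index[key].append(sol)
    return index


def distinct_projection(index, join_vars):
    """Read the distinct projected bindings directly off the index keys.

    No separate DISTINCT pass is needed: each key occurs exactly once.
    """
    return [dict(zip(join_vars, key)) for key in index]


# Hypothetical output of the outer pattern (?ps, ?p, ?o).
outer = [
    {"ps": "ex:alice", "p": "ex:name", "o": "Alice"},
    {"ps": "ex:alice", "p": "ex:age", "o": "42"},
    {"ps": "ex:bob", "p": "ex:name", "o": "Bob"},
]

index = build_hash_index(outer, ["ps"])
projected = distinct_projection(index, ["ps"])
print(projected)
```

      The three outer solutions collapse to two index keys, so only two bindings for ?ps would be passed into the subquery, while the full solutions stay in the index for the later join.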

      Note that we might even apply this pattern to queries where, for instance, the SELECT subquery is replaced by an OPTIONAL join group. Currently, no projection is applied at all in such cases, but it should be possible to use the same pattern (i.e., a distinct projection).
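
      The OPTIONAL variant might look like the following Python sketch. Again this is an illustration under assumptions, not Blazegraph's implementation: left_join_via_index, eval_optional and the sample data are hypothetical, and the sketch assumes the OPTIONAL group depends on the outer solution only through the join variables (it ignores, for example, filters that reference other outer variables).

```python
from collections import defaultdict


def left_join_via_index(outer, join_vars, eval_optional):
    """Left-join outer solutions with an optional group, one evaluation per distinct key."""
    index = defaultdict(list)
    for sol in outer:
        index[tuple(sol[v] for v in join_vars)].append(sol)

    results = []
    for key, group in index.items():
        seed = dict(zip(join_vars, key))
        extensions = eval_optional(seed)  # evaluated once per distinct key
        for sol in group:
            if extensions:
                results.extend({**sol, **ext} for ext in extensions)
            else:
                results.append(sol)  # OPTIONAL semantics: keep the unextended solution
    return results


# Hypothetical data: only ex:alice has a type.
types = {"ex:alice": ["ex:Person"]}
calls = []


def eval_optional(seed):
    """Stand-in for evaluating the OPTIONAL group with ?ps bound."""
    calls.append(seed["ps"])
    return [{"type": t} for t in types.get(seed["ps"], [])]


outer = [
    {"ps": "ex:alice", "p": "ex:name", "o": "Alice"},
    {"ps": "ex:alice", "p": "ex:age", "o": "42"},
    {"ps": "ex:bob", "p": "ex:name", "o": "Bob"},
]

results = left_join_via_index(outer, ["ps"], eval_optional)
print(len(calls), len(results))
```

      With three outer solutions but only two distinct values of ?ps, the optional group is evaluated twice instead of three times; the saving grows with the number of duplicates among the projected bindings.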

      Think about the benefits of such an optimization and a possible generalization. This might be considered in the scope of a general strategy/framework to drop variables that are no longer required.




          • Assignee:
            michaelschmidt


            • Created: