  1. Blazegraph (by SYSTAP)
  2. BLZG-1495

Property paths with trailing ? not treated properly

    Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: BLAZEGRAPH_RELEASE_1_5_3
    • Component/s: None
    • Labels:
      None

      Description

      Consider a database containing a single triple:

      <http://s1> <http://p1> <http://s2> .
      

      The query

      SELECT * 
      WHERE {
        ?s <http://p1> ?o
      }
      

      gives one result as expected, but

      SELECT * 
      WHERE {
        ?s <http://p1> / <http://unknown>? ?o
      }
      

      returns zero results.
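
For reference, under SPARQL 1.1 property path semantics the `?` operator matches a path of length zero or one, so the second query would be expected to return the solution ?s = <http://s1>, ?o = <http://s2> through the zero-length case. The following is a minimal Java sketch of that expectation over an in-memory triple list. The class `OptionalPathSketch` and its `evaluate` method are hypothetical names chosen for illustration and are not Blazegraph code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of "?s p1 / p2? ?o"; not Blazegraph code.
class OptionalPathSketch {

    // Each triple is a String[3]: subject, predicate, object.
    // Returns (s, o) pairs as String[2].
    static List<String[]> evaluate(List<String[]> triples, String p1, String p2) {
        List<String[]> results = new ArrayList<>();
        for (String[] t : triples) {
            if (!t[1].equals(p1)) {
                continue;
            }
            String s = t[0];
            String mid = t[2];
            // Zero-length case of "p2?": ?o is the intermediate node itself.
            results.add(new String[] { s, mid });
            // Length-one case of "p2?": follow a single p2 edge from the intermediate node.
            for (String[] u : triples) {
                // Skip a self-loop, since the zero-length case already produced (s, mid).
                if (u[0].equals(mid) && u[1].equals(p2) && !u[2].equals(mid)) {
                    results.add(new String[] { s, u[2] });
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] { "http://s1", "http://p1", "http://s2" });
        List<String[]> r = evaluate(triples, "http://p1", "http://unknown");
        System.out.println(r.size()); // prints 1
    }
}
```

With the single triple from the description, the sketch yields exactly one (s, o) pair, which is the result the reported query fails to return.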

          Activity

          michaelschmidt added a comment (edited):

          Here's the query plan:

          com.bigdata.bop.solutions.ProjectionOp[9](JVMSolutionSetHashJoinOp[8])[ BOp.bopId=9, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[s, o], QueryEngine.queryId=98a30249-4cf2-4321-a918-4de0b7675d64]
            com.bigdata.bop.join.JVMSolutionSetHashJoinOp[8](ArbitraryLengthPathOp[7])[ BOp.bopId=8, BOp.evaluationContext=CONTROLLER, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, JoinAnnotations.constraints=null, SolutionSetHashJoinOp.release=true, PipelineOp.lastPass=true, namedSetRef=NamedSolutionSetRef{localName=--set-3,queryId=98a30249-4cf2-4321-a918-4de0b7675d64,joinVars=[]}]
              com.bigdata.bop.paths.ArbitraryLengthPathOp[7](JVMHashIndexOp[4])[ ArbitraryLengthPathOp$Annotations.subquery=com.bigdata.bop.join.PipelineJoin[6]()[ BOp.bopId=6, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[5](-tVarLeft-898a4980-4f51-4723-9656-ece7d05d2d37=null, TermId(0U)[http://unknown], -tVarRight-7f5c8dd6-1123-4561-802c-28f96f52c196=null)[ IPredicate.relationName=[test2.spo], IPredicate.timestamp=1442395275163, BOp.bopId=5, AST2BOpBase.estimatedCardinality=0, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]], ArbitraryLengthPathOp$Annotations.leftTerm=--pp-anon-87930a18-1c40-4403-ab82-519d795d3e33, ArbitraryLengthPathOp$Annotations.rightTerm=o, ArbitraryLengthPathOp$Annotations.transitivityVarLeft=-tVarLeft-898a4980-4f51-4723-9656-ece7d05d2d37, ArbitraryLengthPathOp$Annotations.transitivityVarRight=-tVarRight-7f5c8dd6-1123-4561-802c-28f96f52c196, ArbitraryLengthPathOp$Annotations.edgeVar=null, ArbitraryLengthPathOp$Annotations.middleTerm=null, ArbitraryLengthPathOp$Annotations.lowerBound=0, ArbitraryLengthPathOp$Annotations.upperBound=1, ArbitraryLengthPathOp$Annotations.projectInVars=[], BOp.bopId=7, BOp.evaluationContext=CONTROLLER]
              @com.bigdata.bop.paths.ArbitraryLengthPathOp$Annotations.subquery:
                com.bigdata.bop.join.PipelineJoin[6]()[ BOp.bopId=6, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[5](-tVarLeft-898a4980-4f51-4723-9656-ece7d05d2d37=null, TermId(0U)[http://unknown], -tVarRight-7f5c8dd6-1123-4561-802c-28f96f52c196=null)[ IPredicate.relationName=[test2.spo], IPredicate.timestamp=1442395275163, BOp.bopId=5, AST2BOpBase.estimatedCardinality=0, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
                com.bigdata.bop.join.JVMHashIndexOp[4](PipelineJoin[2])[ BOp.bopId=4, BOp.evaluationContext=CONTROLLER, PipelineOp.maxParallel=1, PipelineOp.lastPass=true, PipelineOp.sharedState=true, JoinAnnotations.joinType=Normal, HashJoinAnnotations.joinVars=[], HashJoinAnnotations.outputDistinctJVs=true, JoinAnnotations.constraints=null, namedSetRef=NamedSolutionSetRef{localName=--set-3,queryId=98a30249-4cf2-4321-a918-4de0b7675d64,joinVars=[]}]
                  com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s=null, TermId(4U)[http://p1], --pp-anon-87930a18-1c40-4403-ab82-519d795d3e33=null)[ IPredicate.relationName=[test2.spo], IPredicate.timestamp=1442395275163, BOp.bopId=1, AST2BOpBase.estimatedCardinality=1, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
          

          The query above is rewritten as

          SELECT * 
          WHERE {
            ?s <http://p1> ?anonVar .
            ?anonVar <http://unknown>? ?o
          }
          

          and the ?-property path is executed by evaluating the first pattern, building a hash index while projecting in the join vars, then evaluating the inner expression, and finally joining with the index.
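
The four steps described above can be sketched in plain Java as follows. This is a toy model under stated assumptions, not the Blazegraph operators: the class `HashIndexJoinSketch`, the map-based solution representation, and the choice of `anonVar` as the single join variable are all hypothetical and only illustrate the general shape of the strategy when the join variable is computed as intended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the evaluation pattern; not Blazegraph operators.
class HashIndexJoinSketch {

    // Each triple is a String[3]: subject, predicate, object.
    static List<Map<String, String>> run(List<String[]> triples) {
        // Step 1: evaluate the first pattern "?s <http://p1> ?anonVar".
        List<Map<String, String>> left = new ArrayList<>();
        for (String[] t : triples) {
            if (t[1].equals("http://p1")) {
                Map<String, String> sol = new LinkedHashMap<>();
                sol.put("s", t[0]);
                sol.put("anonVar", t[2]);
                left.add(sol);
            }
        }

        // Step 2: build a hash index keyed on the join variable (assumed here: anonVar).
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> sol : left) {
            index.computeIfAbsent(sol.get("anonVar"), k -> new ArrayList<>()).add(sol);
        }

        // Step 3: evaluate "?anonVar <http://unknown>? ?o" once per distinct join key.
        List<Map<String, String>> inner = new ArrayList<>();
        for (String start : index.keySet()) {
            inner.add(binding(start, start)); // zero-length path
            for (String[] t : triples) {
                if (t[0].equals(start) && t[1].equals("http://unknown") && !t[2].equals(start)) {
                    inner.add(binding(start, t[2])); // path of length one
                }
            }
        }

        // Step 4: join the inner solutions back with the index on anonVar.
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> in : inner) {
            for (Map<String, String> l : index.get(in.get("anonVar"))) {
                Map<String, String> merged = new LinkedHashMap<>(l);
                merged.put("o", in.get("o"));
                out.add(merged);
            }
        }
        return out;
    }

    // Helper that builds a partial solution binding anonVar and o.
    static Map<String, String> binding(String anon, String o) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("anonVar", anon);
        m.put("o", o);
        return m;
    }
}
```

For the one-triple database, `run` produces a single solution that binds s, anonVar and o. The sketch keys the index on the intermediate variable, which is the variable the two patterns of the rewritten query share.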

          The problem lies in the incorrect calculation of the join variables (note joinVars=[] in the plan above): in this case, the join variables should be the set { ?anonVar }, but com.bigdata.rdf.sparql.ast.ArbitraryLengthPathNode.getDefinitelyProducedBindings() ignores anonymous variables. Replacing

                  addProducedBinding(left(), producedBindings);
                  addProducedBinding(right(), producedBindings);
          

          with

                  addVar(left(), producedBindings, true /* consider anonymous vars */);
                  addVar(right(), producedBindings, true /* consider anonymous vars */);
          

          does the job and the query succeeds. I am not sure why anonymous variables are ignored here, or whether this change could break other code, though.

          michaelschmidt added a comment:

          See pull request at https://github.com/SYSTAP/bigdata/pull/160

          beebs (Brad Bebee) added a comment:

          https://github.com/SYSTAP/bigdata/commit/5d54bf8e277fb40876925c6a8fd4a8efc85f657c

          beebs (Brad Bebee) added a comment:

          Maven master merge is https://github.com/SYSTAP/bigdata/pull/161

            People

            • Assignee: michaelschmidt
            • Reporter: michaelschmidt
            • Votes: 0
            • Watchers: 4
