Blazegraph (by SYSTAP) / BLZG-1297

Query optimizer slows down a query significantly

    Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Wikidata Query Service
    • Labels: None

      Description

      During beta testing on Wikidata, we discovered that the following query performs poorly:

      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
      PREFIX wikibase: <http://wikiba.se/ontology#>
      PREFIX hint: <http://www.bigdata.com/queryHints#>
      
      SELECT DISTINCT ?result WHERE {
      	{ 
      		{ ?subject0 rdfs:label "United States"@en . } UNION { ?subject0 skos:altLabel "United States"@en . }
      	}
      	{
      		{ ?predicate1 rdfs:label "president"@en . } UNION { ?predicate1 skos:altLabel "president"@en . }
      	}
      	?predicate1 a wikibase:Property .
      	?predicate1 wikibase:directClaim ?directPredicate2 .
      	?subject0 ?directPredicate2 ?result .
      }
      

      However, with the optimizer disabled via the hint:Query hint:optimizer "None" query hint, the same query runs in 300 ms:

      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
      PREFIX wikibase: <http://wikiba.se/ontology#>
      PREFIX hint: <http://www.bigdata.com/queryHints#>
      
      SELECT DISTINCT ?result WHERE {
        	hint:Query hint:optimizer "None" .
      
      	{ 
      		{ ?subject0 rdfs:label "United States"@en . } UNION { ?subject0 skos:altLabel "United States"@en . }
      	}
      	{
      		{ ?predicate1 rdfs:label "president"@en . } UNION { ?predicate1 skos:altLabel "president"@en . }
      	}
      	?predicate1 a wikibase:Property .
      	?predicate1 wikibase:directClaim ?directPredicate2 .
      	?subject0 ?directPredicate2 ?result .
      }
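
      The two queries differ only in the optimizer hint triple. For repeating the comparison on other queries, a minimal sketch of adding that hint programmatically might look like the following. The function name disable_optimizer is hypothetical, and the regex-based insertion assumes a single outermost WHERE { ... } group as in the query above; it is not a general SPARQL rewriter.

```python
import re

# Prefix IRI and hint triple copied from the report's second query.
HINT_PREFIX = "PREFIX hint: <http://www.bigdata.com/queryHints#>"
HINT_TRIPLE = 'hint:Query hint:optimizer "None" .'


def disable_optimizer(query: str) -> str:
    """Return `query` with the optimizer hint as the first pattern of the
    first WHERE group (hypothetical helper, text-based, not a SPARQL parser)."""
    if HINT_TRIPLE in query:
        # Already hinted: leave the query untouched.
        return query
    if not re.search(r"PREFIX\s+hint:", query, flags=re.IGNORECASE):
        # Declare the hint: prefix if the query does not do so itself.
        query = HINT_PREFIX + "\n" + query
    match = re.search(r"WHERE\s*\{", query, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no WHERE { ... } group found")
    head, tail = query[: match.end()], query[match.end():]
    return f"{head}\n\t{HINT_TRIPLE}\n{tail}"
```

      Applying it to the first query text should give a query equivalent to the second one, up to whitespace.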
      

      See also https://phabricator.wikimedia.org/T100235

      The query plan by default:

      com.bigdata.bop.solutions.ProjectionOp[34](HTreeDistinctBindingSetsOp[32])[ BOp.bopId=34, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[result], BOp.timeout=30000, QueryEngine.queryId=cc2246a3-a37f-4397-a731-38b5cf6d0bf9]
        com.bigdata.bop.solutions.HTreeDistinctBindingSetsOp[32](CopyOp[21])[ BOp.bopId=32, HashJoinAnnotations.joinVars=[result], BOp.evaluationContext=CONTROLLER, namedSetRef=NamedSolutionSetRef{localName=--distinct-33,queryId=cc2246a3-a37f-4397-a731-38b5cf6d0bf9,joinVars=[result]}, PipelineOp.sharedState=true, PipelineOp.maxParallel=1]
          com.bigdata.bop.bset.CopyOp[21](PipelineJoin[31])[ BOp.bopId=21, BOp.evaluationContext=CONTROLLER]
            com.bigdata.bop.join.PipelineJoin[31](CopyOp[23])[ BOp.bopId=31, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[29](predicate1=null, Vocab(-69)[http://www.w3.org/2004/02/skos/core#altLabel], TermId(21493L)[president], --anon-30=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=29, AST2BOpBase.estimatedCardinality=4, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=21]
              com.bigdata.bop.bset.CopyOp[23](PipelineJoin[28])[ BOp.bopId=23]
                com.bigdata.bop.join.PipelineJoin[28](CopyOp[22])[ BOp.bopId=28, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[26](predicate1=null, Vocab(68)[http://www.w3.org/2000/01/rdf-schema#label], TermId(21493L)[president], --anon-27=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=26, AST2BOpBase.estimatedCardinality=2, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=21]
                  com.bigdata.bop.bset.CopyOp[22](Tee[24])[ BOp.bopId=22]
                    com.bigdata.bop.bset.Tee[24](CopyOp[10])[ BOp.bopId=24, PipelineOp.sinkRef=22, PipelineOp.altSinkRef=23]
                      com.bigdata.bop.bset.CopyOp[10](PipelineJoin[20])[ BOp.bopId=10, BOp.evaluationContext=CONTROLLER]
                        com.bigdata.bop.join.PipelineJoin[20](CopyOp[12])[ BOp.bopId=20, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[18](subject0=null, Vocab(-69)[http://www.w3.org/2004/02/skos/core#altLabel], TermId(17806558L)[United States], --anon-19=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=18, AST2BOpBase.estimatedCardinality=1, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=10]
                          com.bigdata.bop.bset.CopyOp[12](PipelineJoin[17])[ BOp.bopId=12]
                            com.bigdata.bop.join.PipelineJoin[17](CopyOp[11])[ BOp.bopId=17, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[15](subject0=null, Vocab(68)[http://www.w3.org/2000/01/rdf-schema#label], TermId(17806558L)[United States], --anon-16=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=15, AST2BOpBase.estimatedCardinality=3, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=10]
                              com.bigdata.bop.bset.CopyOp[11](Tee[13])[ BOp.bopId=11]
                                com.bigdata.bop.bset.Tee[13](PipelineJoin[9])[ BOp.bopId=13, PipelineOp.sinkRef=11, PipelineOp.altSinkRef=12]
                                  com.bigdata.bop.join.PipelineJoin[9](PipelineJoin[6])[ BOp.bopId=9, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[7](subject0=null, directPredicate2=null, result=null, --anon-8=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=7, AST2BOpBase.estimatedCardinality=600539593, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
                                    com.bigdata.bop.join.PipelineJoin[6](PipelineJoin[3])[ BOp.bopId=6, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[4](predicate1=null, TermId(529U)[http://wikiba.se/ontology#directClaim], directPredicate2=null, --anon-5=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=4, AST2BOpBase.estimatedCardinality=1543, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
                                      com.bigdata.bop.join.PipelineJoin[3]()[ BOp.bopId=3, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](predicate1=null, Vocab(57)[http://www.w3.org/1999/02/22-rdf-syntax-ns#type], TermId(532U)[http://wikiba.se/ontology#Property], --anon-2=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575799791, BOp.bopId=1, AST2BOpBase.estimatedCardinality=1543, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
      

      The query plan with the optimizer hint set to "None":

        com.bigdata.bop.solutions.HTreeDistinctBindingSetsOp[32](PipelineJoin[31])[ BOp.bopId=32, HashJoinAnnotations.joinVars=[result], BOp.evaluationContext=CONTROLLER, namedSetRef=NamedSolutionSetRef{localName=--distinct-33,queryId=d0d15a14-c4a4-433f-9282-0be160314640,joinVars=[result]}, PipelineOp.sharedState=true, PipelineOp.maxParallel=1]
          com.bigdata.bop.join.PipelineJoin[31](PipelineJoin[28])[ BOp.bopId=31, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[29](subject0=null, directPredicate2=null, result=null, --anon-30=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=29, AST2BOpBase.estimatedCardinality=600539593, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
            com.bigdata.bop.join.PipelineJoin[28](PipelineJoin[25])[ BOp.bopId=28, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[26](predicate1=null, TermId(529U)[http://wikiba.se/ontology#directClaim], directPredicate2=null, --anon-27=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=26, AST2BOpBase.estimatedCardinality=1543, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
              com.bigdata.bop.join.PipelineJoin[25](CopyOp[12])[ BOp.bopId=25, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[23](predicate1=null, Vocab(57)[http://www.w3.org/1999/02/22-rdf-syntax-ns#type], TermId(532U)[http://wikiba.se/ontology#Property], --anon-24=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=23, AST2BOpBase.estimatedCardinality=1543, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
                com.bigdata.bop.bset.CopyOp[12](PipelineJoin[22])[ BOp.bopId=12, BOp.evaluationContext=CONTROLLER]
                  com.bigdata.bop.join.PipelineJoin[22](CopyOp[14])[ BOp.bopId=22, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[20](predicate1=null, Vocab(-69)[http://www.w3.org/2004/02/skos/core#altLabel], TermId(21493L)[president], --anon-21=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=20, AST2BOpBase.estimatedCardinality=4, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=12]
                    com.bigdata.bop.bset.CopyOp[14](PipelineJoin[19])[ BOp.bopId=14]
                      com.bigdata.bop.join.PipelineJoin[19](CopyOp[13])[ BOp.bopId=19, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[17](predicate1=null, Vocab(68)[http://www.w3.org/2000/01/rdf-schema#label], TermId(21493L)[president], --anon-18=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=17, AST2BOpBase.estimatedCardinality=2, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=12]
                        com.bigdata.bop.bset.CopyOp[13](Tee[15])[ BOp.bopId=13]
                          com.bigdata.bop.bset.Tee[15](CopyOp[1])[ BOp.bopId=15, PipelineOp.sinkRef=13, PipelineOp.altSinkRef=14]
                            com.bigdata.bop.bset.CopyOp[1](PipelineJoin[11])[ BOp.bopId=1, BOp.evaluationContext=CONTROLLER]
                              com.bigdata.bop.join.PipelineJoin[11](CopyOp[3])[ BOp.bopId=11, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[9](subject0=null, Vocab(-69)[http://www.w3.org/2004/02/skos/core#altLabel], TermId(17806558L)[United States], --anon-10=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=9, AST2BOpBase.estimatedCardinality=1, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=1]
                                com.bigdata.bop.bset.CopyOp[3](PipelineJoin[8])[ BOp.bopId=3]
                                  com.bigdata.bop.join.PipelineJoin[8](CopyOp[2])[ BOp.bopId=8, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[6](subject0=null, Vocab(68)[http://www.w3.org/2000/01/rdf-schema#label], TermId(17806558L)[United States], --anon-7=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1432575927085, BOp.bopId=6, AST2BOpBase.estimatedCardinality=3, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]], PipelineOp.sinkRef=1]
                                    com.bigdata.bop.bset.CopyOp[2](Tee[4])[ BOp.bopId=2]
                                      com.bigdata.bop.bset.Tee[4]()[ BOp.bopId=4, PipelineOp.sinkRef=2, PipelineOp.altSinkRef=3]
      

    People

    • Assignee: michaelschmidt
    • Reporter: stasmalyshev