Details

      Description

      Putting the hint hint:Query hint:optimizer "None" into a query with a FILTER EXISTS does not work, because the FILTER EXISTS gets split into a subselect that is put at the end and a FILTER that is left where it is.

        Issue Links

          Activity

          beebs Brad Bebee created issue -
          beebs Brad Bebee made changes -
          Field Original Value New Value
          Workflow Trac Import v2 [ 12857 ] Trac Import v3 [ 13235 ]
          beebs Brad Bebee made changes -
          Workflow Trac Import v3 [ 13235 ] Trac Import v4 [ 14564 ]
          jjc Jeremy Carroll added a comment -

          Here is an example:

          INSERT {
             <eg:a> <eg:p> 1 .
             <eg:a> <eg:ap> 2 .
          }
          WHERE
          {}
          

          and then

          select * { 
             ?s <eg:p>  ?o .
            FILTER EXISTS {
              ?s <eg:ap> ?oo
            }
           # hint:Query hint:optimizer "None" .
          }
          

          Without the hint (commented out, as above) we get the correct answer; with the hint enabled we get no results.

          beebs Brad Bebee made changes -
          Workflow Trac Import v4 [ 14564 ] Trac Import v5 [ 15915 ]
          beebs Brad Bebee made changes -
          Assignee mrpersonick [ mrpersonick ] michaelschmidt [ michaelschmidt ]
          beebs Brad Bebee added a comment -

          michaelschmidt Do you mind taking a look at this one?

          beebs Brad Bebee added a comment -

          See recent comment by michaelschmidt

          https://wiki.blazegraph.com/wiki/index.php/QueryHints

          Note that value "None" disables reordering and might, in some special cases, harm correctness of the queries. In particular, when choosing "None" you should make sure that constructs requiring variables to be bound (such as BIND node introducing new variables based on the value of other variables, e.g. BIND(2*?x AS ?y)) are placed at a position where the required variables have been introduced already.

          beebs Brad Bebee added a comment -

          From Jeremy Carroll: Is it possible to detect when the query hint will produce incorrect results and return an error?

          michaelschmidt michaelschmidt added a comment -

          Have to think about this and look at the code again. Assuming you write correct SPARQL 1.1, my guess is that this affects only FILTERs, possibly even only queries involving FILTER (NOT) EXISTS. If you observe other cases, please let me know. IMHO, this could be fixed in the code, i.e. even with reordering disabled we could still make sure that these constructs are translated properly. I put the ticket on my backlog.

          michaelschmidt michaelschmidt added a comment -

          Proposed fix in BLZG-1021, which should resolve the FILTER (NOT) EXISTS case. Though I'm not 100% sure about the proper treatment of other FILTER expressions.

          michaelschmidt michaelschmidt added a comment -

          Independent of whether the fix will make it into the release: I don't know why you're disabling the optimizer, but just as a side note, the 1.5.2 release will implement improved logic for the placement of FILTER (NOT) EXISTS. These patterns were not properly optimized before, but are now treated similarly to FILTERs, so you may want to give it a try without the optimizer disabled.

          jjc Jeremy Carroll added a comment - - edited

          In my opinion an adequate partial fix would be to raise an error in the combo of optimizer = none and filter (not) exists. Or maybe optimizer none and bind.
          The really bad bug is an optimizer hint that silently gives incorrect results.

          jjc Jeremy Carroll added a comment -

          I have also seen incorrect results with a hint:Prior hint:runLast true hint, but we failed to get a reliable repro.

          select *
              WHERE
              {
                   
                  <http://localhost:8000/graph/syapse#subClassOf> rdfs:domain ?domain .
                  <http://localhost:8000/graph/syapse#subClassOf> rdfs:range ?range .
                  { GRAPH <http://localhost:8000/graph/ontology/sys:sys> {
                      ?superNew rdf:type  ?range
                  } }
                  UNION
                  { ?range rdf:type owl:Restriction ;
                          owl:onProperty ?p ;
                          owl:hasValue ?v
                      GRAPH <http://localhost:8000/graph/ontology/sys:sys> {
                        ?superNew ?p ?v
                      }
                  }
                  ?subNew rdfs:subClassOf * ?superNew .
                #  hint:Prior hint:runLast true .
                  FILTER EXISTS {
                    # Force range check to be last
                      {  ?subNew rdf:type ?domain }
                      UNION
                      {  ?domain rdf:type owl:Restriction ;
                          owl:onProperty ?pp ;
                          owl:hasValue ?vv .
                          ?subNew ?pp ?vv
                      }
                  }
                  
              }
          

          Seems like there is hope that this may be fixed with 1.5.2.

          jjc Jeremy Carroll made changes -
          Link This issue relates to BLZG-1366 [ BLZG-1366 ]
          michaelschmidt michaelschmidt added a comment -

          The point is that, in cases where we can identify harmful situations (regarding the optimizer hint), it will be quite easy to fix them now that we have done the refactoring of join order. As I said, I'm pretty optimistic my recent fix will do the job. The problem was that FILTER (NOT) EXISTS was decomposed into a FILTER followed by a subquery instead of a subquery followed by a FILTER, and when reordering is disabled this gives the wrong result.

          Apart from FILTER (NOT) EXISTS, the only thing I can think of is complex filters that cannot be attached; BIND shouldn't be a problem (please let me know if you observed otherwise). I'll test that tomorrow.

          Regarding your second query: I agree, there's a high chance that this has been fixed. If you can provide a reproducible setting please let me know and I'll test it.
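
          The effect of the decomposition order can be illustrated with a toy Python model. This is only a sketch, not Blazegraph code: it assumes operators run strictly in the order listed, that the EXISTS subquery binds a hypothetical helper variable _exists, and that the residual FILTER keeps only solutions where that variable is true. All function and variable names are made up for illustration; the data mirrors the two triples from the first example.

```python
# Toy model only: not Blazegraph's actual operators or data structures.
DATA = [("eg:a", "eg:p", 1), ("eg:a", "eg:ap", 2)]


def scan_p(solutions):
    """Statement pattern ?s <eg:p> ?o: extend each solution with matches."""
    return [dict(sol, s=s, o=o)
            for sol in solutions
            for (s, p, o) in DATA if p == "eg:p"]


def exists_subquery(solutions):
    """Hypothetical ASK-style subquery: bind _exists for each solution."""
    out = []
    for sol in solutions:
        found = any(s == sol.get("s") and p == "eg:ap" for (s, p, _o) in DATA)
        out.append(dict(sol, _exists=found))
    return out


def residual_filter(solutions):
    """Keep solutions whose _exists is True; an unbound _exists drops the row."""
    return [sol for sol in solutions if sol.get("_exists") is True]


def run(pipeline):
    """Apply the operators in the given order, starting from one empty solution."""
    solutions = [{}]
    for op in pipeline:
        solutions = op(solutions)
    return solutions


# FILTER + subquery: the filter sees no _exists binding and drops everything.
broken = run([scan_p, residual_filter, exists_subquery])
# subquery + FILTER: _exists is bound before the filter looks at it.
fixed = run([scan_p, exists_subquery, residual_filter])
print(len(broken), len(fixed))
```

          In this model the first ordering returns nothing, which matches the empty result reported above, while the second ordering returns the expected solution.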

          michaelschmidt michaelschmidt added a comment - - edited

          Here's another one that's failing with the optimizer disabled via the hint:

          select * { 
             FILTER (?a = ?b)
             OPTIONAL { <http://s> <http://p> ?a }
             <http://s> <http://p> ?b .
             hint:Query hint:optimizer "None" .
          }
          

          The reason is that the FILTER is not attached to the last statement pattern, since the OPTIONAL does not guarantee that the variable ?a is bound. We need to selectively enable the filter placement mechanism of the ASTJoinGroupOrderOptimizer to ensure correctness here (while not reordering anything else).

          Provided a fix and also added a test case in branch BLZG-1021. If CI runs through I'd consider this issue solved.
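
          A toy Python model of the query above shows why the FILTER position matters when nothing is reordered. It is a sketch under stated assumptions, not Blazegraph's implementation: operators run in the order listed, a FILTER over an unbound variable drops the solution, and the dataset (two objects for <http://s> <http://p>) and all names are hypothetical.

```python
# Toy model only; operator names are hypothetical, not Blazegraph internals.
VALUES = ["x", "y"]  # objects of <http://s> <http://p> in a made-up dataset


def optional_a(solutions):
    """OPTIONAL { <http://s> <http://p> ?a }: left join, keeps unmatched rows."""
    out = []
    for sol in solutions:
        matches = [dict(sol, a=v) for v in VALUES]
        out.extend(matches or [sol])
    return out


def scan_b(solutions):
    """Statement pattern <http://s> <http://p> ?b."""
    return [dict(sol, b=v) for sol in solutions for v in VALUES]


def filter_a_eq_b(solutions):
    """FILTER (?a = ?b): an unbound operand is an error, which drops the row."""
    return [sol for sol in solutions
            if "a" in sol and "b" in sol and sol["a"] == sol["b"]]


def run(pipeline):
    """Apply the operators in the given order, starting from one empty solution."""
    solutions = [{}]
    for op in pipeline:
        solutions = op(solutions)
    return solutions


# Filter evaluated where it is written: nothing is bound yet, so nothing survives.
as_written = run([filter_a_eq_b, optional_a, scan_b])
# Filter placed after both variables can be bound: the matching pairs survive.
filter_last = run([optional_a, scan_b, filter_a_eq_b])
print(len(as_written), len(filter_last))
```

          Only the filter moves between the two runs; the OPTIONAL and the statement pattern keep their written order, which is the kind of selective filter placement described above.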

          michaelschmidt michaelschmidt added a comment -

          CI runs through, so this will be solved in the upcoming release. I don't see any other situations where correctness could be violated when optimizer:none is selected. Closing issue. I will also remove the remark in the wiki; it should now always be sound to use the optimizer hint.

          michaelschmidt michaelschmidt made changes -
          Status Open [ 1 ] Accepted [ 10101 ]
          michaelschmidt michaelschmidt made changes -
          Status Accepted [ 10101 ] In Progress [ 3 ]
          Priority Highest [ 1 ]
          michaelschmidt michaelschmidt made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          michaelschmidt michaelschmidt made changes -
          Status Resolved [ 5 ] In Review [ 10100 ]
          michaelschmidt michaelschmidt made changes -
          Resolution Done [ 10000 ]
          Status In Review [ 10100 ] Done [ 10000 ]
          beebs Brad Bebee made changes -
          Workflow Trac Import v5 [ 15915 ] Trac Import v6 [ 18246 ]
          beebs Brad Bebee made changes -
          Fix Version/s BLAZEGRAPH_RELEASE_1_5_2 [ 10164 ]
          beebs Brad Bebee made changes -
          Workflow Trac Import v6 [ 18246 ] Trac Import v7 [ 19643 ]
          beebs Brad Bebee made changes -
          Workflow Trac Import v7 [ 19643 ] Trac Import v8 [ 21265 ]

            People

            • Assignee:
              michaelschmidt michaelschmidt
              Reporter:
              jeremycarroll jeremycarroll
            • Votes:
              0
              Watchers:
              3
