Details

      Description

      Putting hint:Query hint:optimizer "None" into a query that contains a FILTER EXISTS does not work: the FILTER EXISTS is split into a subselect, which is moved to the end, and a FILTER, which is left where it was.
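
      A minimal sketch of the kind of query this describes. The example.org IRIs and variable names are placeholders that do not come from the report, and the PREFIX line is included only to keep the sketch self-contained (the queries in this thread use hint: without declaring it; the IRI shown is assumed to be the Blazegraph query hints namespace):

      ```sparql
      PREFIX hint: <http://www.bigdata.com/queryHints#>

      SELECT * {
        # Placeholder pattern that binds ?s
        ?s <http://example.org/p> ?o .
        # The construct that is split into a subselect plus a FILTER
        FILTER EXISTS { ?s <http://example.org/q> ?x }
        # Disable join-order optimization for this query
        hint:Query hint:optimizer "None" .
      }
      ```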

          Activity

          michaelschmidt added a comment:

          CI runs through, so this will be solved in the upcoming release. I don't see any other situations where correctness could be violated when the optimizer is set to "None". Closing the issue. I will also remove the remark in the wiki; it should now always be sound to use the optimizer hint.

          michaelschmidt added a comment (edited):

          Here's another one that fails when the optimizer is disabled via the hint:

          select * { 
             FILTER (?a = ?b)
             OPTIONAL { <http://s> <http://p> ?a }
             <http://s> <http://p> ?b .
             hint:Query hint:optimizer "None" .
          }
          

          The reason is that the FILTER is not attached to the last statement pattern, since the OPTIONAL does not guarantee that the variable ?a is bound. We need to selectively enable the filter placement mechanism of the ASTJoinGroupOrderOptimizer to ensure correctness here (while not reordering anything else).

          Provided a fix and also added a test case in branch BLZG-1021. If CI runs through I'd consider this issue solved.
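
          For illustration, the placement described above corresponds to evaluating the FILTER only after both patterns have run. Written out by hand, that order would look like the sketch below. Under standard SPARQL semantics a FILTER applies to its whole group, so this form is equivalent to the query above; the thread does not confirm whether it returned correct results on affected versions. The PREFIX line is an addition for self-containedness and assumes the Blazegraph query hints namespace.

          ```sparql
          PREFIX hint: <http://www.bigdata.com/queryHints#>

          SELECT * {
            # May or may not bind ?a
            OPTIONAL { <http://s> <http://p> ?a }
            # Binds ?b
            <http://s> <http://p> ?b .
            # Written last, after every pattern that can bind ?a or ?b
            FILTER (?a = ?b)
            hint:Query hint:optimizer "None" .
          }
          ```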

          michaelschmidt added a comment:

          The point is that, in cases where we can identify harmful situations (regarding the optimizer hint), they will be quite easy to fix now that we have refactored the join ordering. As I said, I'm pretty optimistic that my recent fix will do the job. The problem was that FILTER (NOT) EXISTS was decomposed into FILTER + subquery instead of subquery + FILTER, and when reordering is disabled this gives the wrong result.

          Apart from FILTER (NOT) EXISTS, the only thing I can think of is complex filters that cannot be attached. BIND shouldn't be a problem (please let me know if you have observed otherwise). I'll test that tomorrow.

          Regarding your second query: I agree, there's a high chance that this has been fixed. If you can provide a reproducible setting please let me know and I'll test it.

          Jeremy Carroll (jjc) added a comment:

          I have also seen incorrect results with a hint:Prior hint:runLast true hint, but we failed to get a strong repro.

          select *
              WHERE
              {
                   
                  <http://localhost:8000/graph/syapse#subClassOf> rdfs:domain ?domain .
                  <http://localhost:8000/graph/syapse#subClassOf> rdfs:range ?range .
                  { GRAPH <http://localhost:8000/graph/ontology/sys:sys> {
                      ?superNew rdf:type  ?range
                  } }
                  UNION
                  { ?range rdf:type owl:Restriction ;
                          owl:onProperty ?p ;
                          owl:hasValue ?v
                      GRAPH <http://localhost:8000/graph/ontology/sys:sys> {
                        ?superNew ?p ?v
                      }
                  }
                  ?subNew rdfs:subClassOf * ?superNew .
                #  hint:Prior hint:runLast true .
                  FILTER EXISTS {
                    # Force range check to be last
                      {  ?subNew rdf:type ?domain }
                      UNION
                      {  ?domain rdf:type owl:Restriction ;
                          owl:onProperty ?pp ;
                          owl:hasValue ?vv .
                          ?subNew ?pp ?vv
                      }
                  }
                  
              }
          

          It seems there is hope that this may be fixed in 1.5.2.

          Jeremy Carroll (jjc) added a comment (edited):

          In my opinion, an adequate partial fix would be to raise an error for the combination of optimizer "None" and FILTER (NOT) EXISTS, or maybe of optimizer "None" and BIND.
          The really bad bug is an optimizer hint that silently gives incorrect results.


            People

            • Assignee: michaelschmidt
              Reporter: jeremycarroll