Details

    • Type: Bug
    • Status: In Progress
    • Priority: Highest
    • Resolution: Unresolved
    • Affects Version/s: BIGDATA_RELEASE_1_2_2
    • Fix Version/s: BLAZEGRAPH_2_X_BACKLOG
    • Component/s: Bigdata SAIL
    • Labels:
      None

      Description

      This query gave :bob in the result set, which surprised me.

      PREFIX :       <http://example/>
      
      SELECT  ?s
      WHERE {
         { 
           BIND ( :bob as ?s )
         } UNION {
         }
         MINUS {
            BIND ( :bob as ?s )
         }
      } LIMIT 10
      

        Activity

        Hide
        jeremycarroll jeremycarroll added a comment -

        Plausibly related is this query

        PREFIX :       <http://example/>
        
        SELECT  ?s
        WHERE {
           { 
             BIND ( :bob as ?s )
           } UNION {
           }
           FILTER (!BOUND(?s) || ?s != :bob)
        }
        

        which also surprises me in giving :bob in the result set

        Show
        jeremycarroll jeremycarroll added a comment - Plausibly related is this query PREFIX : <http://example/> SELECT ?s WHERE { { BIND ( :bob as ?s ) } UNION { } FILTER (!BOUND(?s) || ?s != :bob) } which also surprises me in giving :bob in the result set
        Hide
        michaelschmidt michaelschmidt added a comment - - edited

        There are actually two problems:

        1.) bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/ASTBottomUpOptimizer.java is too generous in optimizing a way MINUS blocks.

        {{

        { final Set<IVariable<?>> incomingBound = sa.getDefinitelyIncomingBindings(childGroup, ...) }

        }}

        should be

           // the static condition under which we can drop the MINUS is
           // that the variables that left and right variables do not
           // overlap, satisfying the condition that the intersection of
           // the left and right variables is empty; for a justification
           // see http://www.w3.org/TR/sparql11-query/#sparqlAlgebra
           final Set<IVariable<?>> incomingBound = sa.getMaybeIncomingBindings(childGroup, ...)
        
        

        2.) The latter will not fix the problem per se. For evaluating MINUS, the NOT EXISTS clause is currenty used. However, there is a difference (basically the case when the left and right side variables are disjoint), and I think that this cannot be statically checked in the general case… See http://www.w3.org/TR/sparql11-query/#sparqlAlgebra and the comment above.

        com.bigdata.bop.join.JoinTypeEnum.Exists must be extended by a MINUS join type, which must then be implemented properly.

        3.) The test case com.bigdata.rdf.sparql.ast.eval.TestUnionMinus must be hooked in (containing, amonst others, the tests for this ticket)

        Show
        michaelschmidt michaelschmidt added a comment - - edited There are actually two problems: 1.) bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/ASTBottomUpOptimizer.java is too generous in optimizing a way MINUS blocks. {{ { final Set<IVariable<?>> incomingBound = sa.getDefinitelyIncomingBindings(childGroup, ...) } }} should be // the static condition under which we can drop the MINUS is // that the variables that left and right variables do not // overlap, satisfying the condition that the intersection of // the left and right variables is empty; for a justification // see http://www.w3.org/TR/sparql11-query/#sparqlAlgebra final Set<IVariable<?>> incomingBound = sa.getMaybeIncomingBindings(childGroup, ...) 2.) The latter will not fix the problem per se. For evaluating MINUS, the NOT EXISTS clause is currenty used. However, there is a difference (basically the case when the left and right side variables are disjoint), and I think that this cannot be statically checked in the general case… See http://www.w3.org/TR/sparql11-query/#sparqlAlgebra and the comment above. com.bigdata.bop.join.JoinTypeEnum.Exists must be extended by a MINUS join type, which must then be implemented properly. 3.) The test case com.bigdata.rdf.sparql.ast.eval.TestUnionMinus must be hooked in (containing, amonst others, the tests for this ticket)
        Hide
        michaelschmidt michaelschmidt added a comment -

        Fixed a couple of cases where MINUS was optimized away errorneously. Concretely, the MINUS was optimized away for the query at hand before, and this is why Bob was contained in the result. The fix is described by 1.) above.

        2.) + 3.) remain open (I enabled the succeeding TestUnionMinus tests, but the rest need to be enabled as well once we have implemented the new operator).

        Verified changes in CI, will merge down into master and leave this ticket open.

        Show
        michaelschmidt michaelschmidt added a comment - Fixed a couple of cases where MINUS was optimized away errorneously. Concretely, the MINUS was optimized away for the query at hand before, and this is why Bob was contained in the result. The fix is described by 1.) above. 2.) + 3.) remain open (I enabled the succeeding TestUnionMinus tests, but the rest need to be enabled as well once we have implemented the new operator). Verified changes in CI, will merge down into master and leave this ticket open.
        Hide
        jjc Jeremy Carroll added a comment -

        If I have a query involving MINUS, and I wish to have a similar query that is definitely not impacted by this defect can I just add a shared variable ... e.g.

        instead of the currently incorrect:

        select *
        { { { BIND(1 as $x) } UNION { BIND(2 as $y ) } }
          MINUS
         { { BIND(1 as $x) } UNION { BIND(2 as $y ) } }
        }
        

        I could ask

        select *
        { { BIND(3 as $z) { BIND(1 as $x) } UNION { BIND(2 as $y ) } }
          MINUS
         { BIND(3 as $z) { BIND(1 as $x) } UNION { BIND(2 as $y ) } }
        }
        

        and this is a perfectly general pattern.

        In fact, could this even be put in a static optimizer, so the runtime code remains with this defect, which appears hard to solve, but the static optimizer for MINUS adds an additional dummy variable that ensures the case never occurs?

        Show
        jjc Jeremy Carroll added a comment - If I have a query involving MINUS, and I wish to have a similar query that is definitely not impacted by this defect can I just add a shared variable ... e.g. instead of the currently incorrect: select * { { { BIND(1 as $x) } UNION { BIND(2 as $y ) } } MINUS { { BIND(1 as $x) } UNION { BIND(2 as $y ) } } } I could ask select * { { BIND(3 as $z) { BIND(1 as $x) } UNION { BIND(2 as $y ) } } MINUS { BIND(3 as $z) { BIND(1 as $x) } UNION { BIND(2 as $y ) } } } and this is a perfectly general pattern. In fact, could this even be put in a static optimizer, so the runtime code remains with this defect, which appears hard to solve, but the static optimizer for MINUS adds an additional dummy variable that ensures the case never occurs?
        Hide
        michaelschmidt michaelschmidt added a comment -

        Jeremy, the problem with your query was not related to the "dynamic" evaluation problem (bullets 2.) and 3.) above) but a minor defect in the optimizer. I've fixed it in https://github.com/SYSTAP/bigdata/pull/263, also the first query now runs succesfully. Will merge this into master once CI is clean.

        Leaving this ticket open, since there are still some edge cases for which MINUS does not work properly, as discussed above.

        Show
        michaelschmidt michaelschmidt added a comment - Jeremy, the problem with your query was not related to the "dynamic" evaluation problem (bullets 2.) and 3.) above) but a minor defect in the optimizer. I've fixed it in https://github.com/SYSTAP/bigdata/pull/263 , also the first query now runs succesfully. Will merge this into master once CI is clean. Leaving this ticket open, since there are still some edge cases for which MINUS does not work properly, as discussed above.

          People

          • Assignee:
            michaelschmidt michaelschmidt
            Reporter:
            jeremycarroll jeremycarroll
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated: