  1. Blazegraph (by SYSTAP)
  2. BLZG-486

BigdataValueFactory.asValue() must return new instance when DummyIV is used.

    Details

      Description

      An NPE has been reported in SPO.hashCode(). Unfortunately, we do not yet have the full bop plan or the original SPARQL query. The bop fragment which causes the error is below, as is the stack trace for the NPE. Based on the bop fragment, this is clearly a default graph query with the context being stripped from quads and a distinct filter being applied to the resulting (s,p,o) triples. One hypothesis is that the DummyIV represents an unknown Value in the query and that we should simply be filtering out any solutions on the default graph access path in which the subject (or other position) is not bound.

      > > ERROR - ChunkedRunningQuery        -
      > > queryId=42c270b6-4eaf-4cc6-b71d-7fb86465c497, bopId=4,
      > > bop=com.bigdata.bop.join.PipelineJoin[4](StartOp[1])[
      > > com.bigdata.bop.BOp.bopId=4,
      > > com.bigdata.rdf.sail.Rule2BOpUtility.cost.scan=com.bigdata.bop.cost.ScanCostReport@57801e5f{rangeCount=1,shardCount=1,cost=0.0},
      > > com.bigdata.rdf.sail.Rule2BOpUtility.cost.subquery=null,
      > > com.bigdata.bop.BOp.evaluationContext=ANY,
      > > com.bigdata.bop.join.PipelineJoin.predicate=com.bigdata.rdf.spo.SPOPredicate[2](DummyIV,
      > > TermId(96U), tpi__priv__1__=null,
      > > 247919b4-2efa-454e-a053-5fc02d53002d=null)[
      > > com.bigdata.bop.IPredicate.relationName=[kb.spo],
      > > com.bigdata.bop.IPredicate.timestamp=-1309959326929,
      > > com.bigdata.bop.IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL],
      > > com.bigdata.bop.BOp.bopId=2,
      > > com.bigdata.rdf.sail.Rule2BOpUtility.originalIndex=SPOC,
      > > com.bigdata.rdf.sail.Rule2BOpUtility.estimatedCardinality=1,
      > > com.bigdata.bop.IPredicate.accessPathFilter=cutthecrap.utils.striterators.NOPFilter@1c4a1bda{annotations=null,filterChain=[com.bigdata.bop.rdf.filter.StripContextFilter(),
      > > com.bigdata.bop.ap.filter.DistinctFilter()]}]]
      
      Caused by: java.lang.NullPointerException
      	at com.bigdata.rdf.spo.SPO.hashCode(SPO.java:462)
      	at java.util.HashMap.put(HashMap.java:372)
      	at java.util.HashSet.add(HashSet.java:200)
      	at com.bigdata.bop.ap.filter.DistinctFilter$DistinctFilterImpl.isValid(DistinctFilter.java:117)
      	at cutthecrap.utils.striterators.Filterator.getNext(Filterator.java:70)
      	at cutthecrap.utils.striterators.Prefetch.checkInit(Prefetch.java:12)
      	at cutthecrap.utils.striterators.Prefetch.hasNext(Prefetch.java:20)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:80)
      	at com.bigdata.relation.accesspath.AccessPath.synchronousIterator(AccessPath.java:1048)
      	at com.bigdata.relation.accesspath.AccessPath.iterator(AccessPath.java:965)
      	at com.bigdata.relation.accesspath.AccessPath.iterator(AccessPath.java:688)
      	at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin(PipelineJoin.java:1683)
      	at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.call(PipelineJoin.java:1669)
      	at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.runOneTask(PipelineJoin.java:1146)
      	at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1072)
      	... 16 more
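
The failure mode in the trace (a distinct filter adds a triple to a HashSet, and hashCode() dereferences an unbound position) can be illustrated with a minimal, self-contained sketch. The classes TripleSketch and DistinctFilterSketch are hypothetical stand-ins, not the real SPO or DistinctFilter, and the guarded variant only illustrates the hypothesis from the description (drop solutions with an unbound position); it is not the fix that was eventually committed.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-in for a triple; not the real com.bigdata.rdf.spo.SPO. */
final class TripleSketch {
    final Long s, p, o; // a null component models an unbound position

    TripleSketch(Long s, Long p, Long o) {
        this.s = s;
        this.p = p;
        this.o = o;
    }

    /** True only when every position is bound. */
    boolean isFullyBound() {
        return s != null && p != null && o != null;
    }

    @Override
    public int hashCode() {
        // Dereferences each position, so an unbound one throws an NPE,
        // analogous to the failure reported in SPO.hashCode().
        return 31 * (31 * s.hashCode() + p.hashCode()) + o.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof TripleSketch)) return false;
        TripleSketch t = (TripleSketch) other;
        return Objects.equals(s, t.s) && Objects.equals(p, t.p) && Objects.equals(o, t.o);
    }
}

/** Hypothetical distinct filter backed by a HashSet. */
final class DistinctFilterSketch {
    private final Set<TripleSketch> seen = new HashSet<>();

    /** Unguarded: mirrors the failing path (HashSet.add calls hashCode). */
    boolean isValidUnguarded(TripleSketch t) {
        return seen.add(t);
    }

    /** Guarded: rejects triples with an unbound position before hashing. */
    boolean isValidGuarded(TripleSketch t) {
        return t.isFullyBound() && seen.add(t);
    }
}
```

With an unbound subject, isValidUnguarded throws a NullPointerException from hashCode(), while isValidGuarded simply rejects the triple.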
      

        Activity

        gjdev added a comment -

        I have added a test case, TestNPE.java, that fails on my workstation about 9 out of 10 times I run it, so if it does not reproduce immediately, please run it a few more times. The test case is fairly involved: it iterates over query results in a couple of nested layers while adding a statement built from those results. Still, I think it is small enough to be usable. I have spent quite some time removing everything that does not contribute to the error, which also means the test has no functional value (it just adds a statement in a very complex way). But as far as I know the code is correct and should not cause any exception in bigdata.

        My gut feeling is that the ValueFactory.addStatement at line 186 (which uses binding results from a query) is the source of the error. But that's probably something you guys can verify faster than I can.

        gjdev added a comment -

        Please ignore my last comment about addStatement and the bindings: that line of code is never executed, since the query does not actually return any results.

        gjdev added a comment -

        I have attached a simpler version of TestNPE that shows the error without all the SailWrapper stuff.

        bryanthompson added a comment -

        Gerjon,

        Please try the following change to the asValue() method of BigdataValueFactoryImpl and let me know whether it fixes your problem. I've included the entire method inline below so you can simply substitute the new version for the old one. Basically, it mints a new BigdataValue if the IV exists and is a "dummy IV" (termId == 0L in the 1.0.0 release).

            final public BigdataValue asValue(final Value v) {

                if (v == null)
                    return null;

                if (v instanceof BigdataValueImpl
                        && ((BigdataValueImpl) v).getValueFactory() == this) {

                    final BigdataValueImpl v1 = (BigdataValueImpl) v;

                    final IV<?, ?> iv = v1.getIV();

                    if (iv == null || iv.isTermId() && iv.getTermId() != 0L) {

                        /*
                         * A value from the same value factory whose IV is either
                         * unknown or defined (but not a NullIV or DummyIV).
                         */

                        return (BigdataValue) v;

                    }

                }

                if (v instanceof BooleanLiteralImpl) {

                    final BooleanLiteralImpl bl = (BooleanLiteralImpl) v;

                    return createLiteral(bl.booleanValue());

                } else if (v instanceof URI) {

                    return createURI(((URI) v).stringValue());

                } else if (v instanceof BNode) {

                    return createBNode(((BNode) v).stringValue());

                } else if (v instanceof Literal) {

                    final Literal tmp = ((Literal) v);

                    final String label = tmp.getLabel();

                    final String language = tmp.getLanguage();

                    final URI datatype = tmp.getDatatype();

                    return new BigdataLiteralImpl(//
                            this,// Note: Passing in this factory!
                            label,//
                            language,//
                            (BigdataURI)asValue(datatype)//
                            );

                } else {

                    throw new AssertionError();

                }

            }
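
The intent of the method above (reuse a value from the same factory unless it carries a dummy identifier, in which case mint a fresh one) can be shown in isolation with a small sketch. ValueSketch and ValueFactorySketch are hypothetical simplifications: a nullable Long stands in for the IV, and 0L stands in for the dummy term identifier mentioned for the 1.0.0 release.

```java
/** Hypothetical value: a label plus a term identifier. */
final class ValueSketch {
    final Object factory; // the factory that created this value
    final String label;
    final Long termId;    // null: not yet resolved; 0L: dummy; otherwise a real id

    ValueSketch(Object factory, String label, Long termId) {
        this.factory = factory;
        this.label = label;
        this.termId = termId;
    }
}

/** Hypothetical factory modelling only the reuse-or-mint decision. */
final class ValueFactorySketch {
    static final long DUMMY = 0L;

    /** Mints a value owned by this factory with an unresolved identifier. */
    ValueSketch create(String label) {
        return new ValueSketch(this, label, null);
    }

    /**
     * Returns v itself when it belongs to this factory and does not carry
     * the dummy identifier; otherwise mints a new value with the same label.
     */
    ValueSketch asValue(ValueSketch v) {
        if (v == null) return null;
        boolean mine = v.factory == this;
        boolean dummy = v.termId != null && v.termId == DUMMY;
        if (mine && !dummy) return v; // safe to reuse the instance
        return create(v.label);       // new instance, identifier unresolved
    }
}
```

Minting a new instance means the dummy identifier cannot leak into later operations that expect either an unresolved or a real identifier.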
        
        bryanthompson added a comment -

        I have modified BigdataValueFactoryImpl per the change recommended above.

        I have integrated the unit test into the test suite for TestBigdataSailWithQuads.

        I have extended the test suite for the BigdataValueFactoryImpl to verify the behavior when a DummyIV is used.

        These changes have been applied to the 1.0.0 release branch and to the current development branch.

        Committed revision r4890.

        bryanthompson added a comment -

        The change described above fixes the original problem. However, it uncovered an issue where equals() was not symmetric when one bigdata value had a "dummy" IV. I have extended the test coverage for this, fixed the various concrete BigdataValue classes, and run the test suite locally.

        Committed revision r4898.
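
One way to keep equals() symmetric in the presence of dummy identifiers is to compare by identifier only when both sides carry a real one, and to fall back to the lexical form otherwise. The sketch below assumes this approach; TermSketch is hypothetical, and the actual changes in r4898 are not shown in this thread.

```java
/** Hypothetical term: a label plus an identifier (null = unresolved, 0L = dummy). */
final class TermSketch {
    final String label;
    final Long termId;

    TermSketch(String label, Long termId) {
        this.label = label;
        this.termId = termId;
    }

    /** A real identifier is resolved and is not the dummy value. */
    private boolean hasRealId() {
        return termId != null && termId != 0L;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TermSketch)) return false;
        TermSketch t = (TermSketch) o;
        // Use the fast identifier comparison only when BOTH sides have a real
        // identifier. The test is the same from either side, so the result
        // cannot depend on which object receives the call.
        if (hasRealId() && t.hasRealId()) return termId.equals(t.termId);
        return label.equals(t.label);
    }

    @Override
    public int hashCode() {
        // Hash on the label only. Assumption: real identifiers map one-to-one
        // to labels, so terms that are equal by identifier share a label.
        return label.hashCode();
    }
}
```

An asymmetric variant would let a term with a real identifier compare identifiers while the dummy side compares labels, so a.equals(b) and b.equals(a) could disagree.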

        bryanthompson added a comment -

        The initial fix described above has caused a problem with some tests in SIDs mode. The tests which fail are:

        com.bigdata.rdf.rio.TestRDFXMLInterchangeWithStatementIdentifiers.test_rdfXmlInterchange  0.142  1
        com.bigdata.rdf.rio.TestRDFXMLInterchangeWithStatementIdentifiers.test_rdfXmlInterchange  0.143  1
        com.bigdata.rdf.sail.TestProvenanceQuery.test_query  0.195  1
        com.bigdata.rdf.sail.TestReadWriteTransactions.test_multiple_transaction  0.151  1
        com.bigdata.rdf.sail.TestSids.testSids  0.139  1
        com.bigdata.rdf.sail.TestSids.testSids2  0.137  1
        

        A sample stack trace follows (they all fail in exactly the same way).

        Stacktrace
        
        java.lang.AssertionError: Not fully bound? : < NULL, TermId(9U), TermId(2U) : Explicit >
        	at com.bigdata.rdf.rio.StatementBuffer.addStatements(StatementBuffer.java:853)
        	at com.bigdata.rdf.rio.StatementBuffer.incrementalWrite(StatementBuffer.java:719)
        	at com.bigdata.rdf.rio.StatementBuffer.processDeferredStatements(StatementBuffer.java:567)
        	at com.bigdata.rdf.rio.StatementBuffer.flush(StatementBuffer.java:384)
        	at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.flushStatementBuffers(BigdataSail.java:2858)
        	at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.flush(BigdataSail.java:2833)
        	at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.flush(BigdataSailRepositoryConnection.java:302)
        	at com.bigdata.rdf.sail.TestSids.testSids2(TestSids.java:263)
        
        bryanthompson added a comment -

        Bug fix to BigdataValueFactory.asValue() in the 1.0.0 release branch. It was always returning a new value if the given value was not a TermId. This was causing the SIDs tests to fail since a SidIV is not a TermId.
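
The problem described here can be reproduced with the reuse condition from the earlier patch: for an IV that is not a TermId, the condition is false, so a new value is always minted. The sketch below uses hypothetical types (IvSketch, TermIdSketch, SidSketch). The exact condition committed in r4899 is not shown in this thread, so the corrected predicate is an assumption about one way to express "reuse unless the IV is a dummy TermId".

```java
/** Hypothetical, minimal view of an internal value identifier. */
interface IvSketch {
    boolean isTermId();
    long getTermId();
}

/** Identifier backed by a term id; 0L plays the role of the dummy id. */
final class TermIdSketch implements IvSketch {
    private final long id;
    TermIdSketch(long id) { this.id = id; }
    public boolean isTermId() { return true; }
    public long getTermId() { return id; }
}

/** Statement identifier: a valid IV that is not a term id. */
final class SidSketch implements IvSketch {
    public boolean isTermId() { return false; }
    public long getTermId() { throw new UnsupportedOperationException("not a term id"); }
}

final class ReusePredicates {
    /** Condition from the earlier patch: false for every non-TermId IV. */
    static boolean initial(IvSketch iv) {
        return iv == null || iv.isTermId() && iv.getTermId() != 0L;
    }

    /** Assumed correction: refuse reuse only for a dummy term id. */
    static boolean corrected(IvSketch iv) {
        return iv == null || !(iv.isTermId() && iv.getTermId() == 0L);
    }
}
```

For a SidSketch, initial() returns false (a new value is minted and the statement identifier is lost), while corrected() returns true; both agree on null, real term ids, and the dummy term id.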

        Test setup/tear down changes to TestFactory in the 1.0.0 release branch and in the development branch. The fixture reference is now cleared when the test is torn down to prevent memory leaks during CI.
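
The tear-down change follows a common pattern: a test object that stays reachable after it has run (for example, because the test runner holds on to it) should drop its fixture reference so the fixture can be garbage collected. A minimal sketch without any test framework, using the hypothetical class FixtureLifecycleSketch:

```java
/** Hypothetical test class showing only the fixture life cycle. */
final class FixtureLifecycleSketch {
    private Object fixture; // stands in for an expensive resource, e.g. a store

    /** Allocates the fixture before the test body runs. */
    void setUp() {
        fixture = new byte[1024];
    }

    /** Clears the reference so this object no longer pins the fixture. */
    void tearDown() {
        fixture = null;
    }

    boolean holdsFixture() {
        return fixture != null;
    }
}
```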

        Committed revision r4899.


          People

          • Assignee: bryanthompson
          • Reporter: bryanthompson
          • Votes: 0
          • Watchers: 3
