Uploaded image for project: 'Blazegraph (by SYSTAP)'
  1. Blazegraph (by SYSTAP)
  2. BLZG-481

Stochastic assert in AbstractBTree#writeNodeOrLeaf() in CI

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed - Won't Fix
    • Resolution: Incomplete
    • Affects Version/s: BIGDATA_RELEASE_1_3_1
    • Fix Version/s: None
    • Component/s: B+Tree
    • Labels:
      None

      Description

      There is a stochastic failure in AbstractBTree#writeNodeOfLeaf() in CI runs which is sometimes observed for TestMROWTransactions and also in StressTestUnisolatedReadWriteIndex (7/16/2014).

      The relevant line in AbstractBTree is:

          protected long writeNodeOrLeaf(final AbstractNode<?> node) {
      
              assert root != null; // i.e., isOpen().
              assert node != null;
              assert node.btree == this;
              assert node.dirty;
              assert !node.deleted;
              assert !node.isPersistent();
              assert !node.isReadOnly(); // FIXME Occasional CI errors on this assert for TestMROWTransactions
              assertNotReadOnly();
      

      This error is suggestive of a concurrent update during the commit protocol, which is something which we addressed in some depth a few months ago and which was corrected, in part, by protecting the commit using the same lock as the UnisolatedReadWriteIndex.

      The failure was observed in the QUADS_QUERY_BRANCH, but it is presumed to be present in the TERMS_REFACTOR_BRANCH as well. There were 20 clean builds between the exceptions listed below. These are the only builds in which this assert has been tripped since the fix to the "Record already in cache" (CI build #1431).

      This exception is from CI build #1451.

      junit.framework.AssertionFailedError: Test failed: 
      firstCause=java.lang.RuntimeException: Could not commit
      index: name=kb.spo.POS,
      commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0},
      DirtyListener{name=kb.spo.POS,needsCheckpoint},
      DirtyListener{name=kb.spo.SPO,needsCheckpoint}]: 
      lastRootBlock=rootBlock{ rootBlock=1, challisField=37, version=2, 
      nextOffset=4346506912097, localTime=1309190439952 [Monday, June 27, 
      2011 12:00:39 PM EDT],
      firstCommitTime=1309190381802 [Monday, June 27, 2011 11:59:41 AM EDT], 
      lastCommitTime=1309190439934 [Monday, June 27, 2011
      12:00:39 PM EDT], commitCounter=37,
      commitRecordAddr={off=NATIVE:-60227,len=422},
      commitRecordIndexAddr={off=NATIVE:-8544,len=220},
      quorumToken=0, metaBitsAddr=63174606902, metaStartAddr=1376, 
      storeType=RW, uuid=6f139bc9-46d8-4f33-b199-c105e7dadcce,
      offsetBits=42, checksum=1623663730, createTime=1309190381786 [Monday, 
      June 27, 2011 11:59:41 AM EDT], closeTime=0}
      	at
      com.bigdata.rdf.sail.TestMROWTransactions.test_multiple_csem_t
      ransaction(TestMROWTransactions.java:382)
      Caused by: java.lang.RuntimeException: Could not commit
      index: name=kb.spo.POS,
      commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0},
      DirtyListener{name=kb.spo.POS,needsCheckpoint},
      DirtyListener{name=kb.spo.SPO,needsCheckpoint}]: 
      lastRootBlock=rootBlock{ rootBlock=1, challisField=37, version=2, 
      nextOffset=4346506912097, localTime=1309190439952 [Monday, June 27, 
      2011 12:00:39 PM EDT],
      firstCommitTime=1309190381802 [Monday, June 27, 2011 11:59:41 AM EDT], 
      lastCommitTime=1309190439934 [Monday, June 27, 2011
      12:00:39 PM EDT], commitCounter=37,
      commitRecordAddr={off=NATIVE:-60227,len=422},
      commitRecordIndexAddr={off=NATIVE:-8544,len=220},
      quorumToken=0, metaBitsAddr=63174606902, metaStartAddr=1376, 
      storeType=RW, uuid=6f139bc9-46d8-4f33-b199-c105e7dadcce,
      offsetBits=42, checksum=1623663730, createTime=1309190381786 [Monday, 
      June 27, 2011 11:59:41 AM EDT], closeTime=0}
      	at
      com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2252)
      	at
      com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore
      .java:74)
      	at
      com.bigdata.rdf.sail.TestMROWTransactions$1Writer.call(TestMRO
      WTransactions.java:274)
      	at
      com.bigdata.rdf.sail.TestMROWTransactions$1Writer.call(TestMRO
      WTransactions.java:247)
      	at
      java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at
      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadP
      oolExecutor.java:886)
      	at
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolE
      xecutor.java:908)
      	at java.lang.Thread.run(Thread.java:680)
      Caused by: java.lang.RuntimeException: Could not commit
      index: name=kb.spo.POS,
      commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0},
      DirtyListener{name=kb.spo.POS,needsCheckpoint},
      DirtyListener{name=kb.spo.SPO,needsCheckpoint}]
      	at
      com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:571)
      	at
      com.bigdata.journal.AbstractJournal.notifyCommitters(AbstractJ
      ournal.java:1983)
      	at
      com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.
      java:2336)
      	at
      com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2250)
      Caused by: java.lang.AssertionError
      	at
      com.bigdata.btree.AbstractBTree.writeNodeOrLeaf(AbstractBTree.
      java:3561)
      	at
      com.bigdata.btree.AbstractBTree.writeNodeRecursive(AbstractBTr
      ee.java:3485)
      	at com.bigdata.btree.BTree.flush(BTree.java:770)
      	at com.bigdata.btree.BTree.writeCheckpoint2(BTree.java:910)
      	at com.bigdata.btree.BTree.writeCheckpoint(BTree.java:881)
      	at com.bigdata.btree.BTree.handleCommit(BTree.java:1210)
      	at
      com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:563)
      

      This exception is from CI build #1471.

      junit.framework.AssertionFailedError: Test failed: firstCause=java.lang.RuntimeException: Could not commit index: name=kb.spo.POS, commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0}, DirtyListener{name=kb.spo.POS,needsCheckpoint}, DirtyListener{name=kb.spo.SPO,needsCheckpoint}]: lastRootBlock=rootBlock{ rootBlock=0, challisField=36, version=2, nextOffset=4273492468087, localTime=1309056345571 [Saturday, June 25, 2011 10:45:45 PM EDT], firstCommitTime=1309056286594 [Saturday, June 25, 2011 10:44:46 PM EDT], lastCommitTime=1309056345554 [Saturday, June 25, 2011 10:45:45 PM EDT], commitCounter=36, commitRecordAddr={off=NATIVE:-61228,len=422}, commitRecordIndexAddr={off=NATIVE:-8566,len=220}, quorumToken=0, metaBitsAddr=63635980342, metaStartAddr=1376, storeType=RW, uuid=80833dc5-f461-4f66-8ffd-54298387ded9, offsetBits=42, checksum=-1789582556, createTime=1309056286577 [Saturday, June 25, 2011 10:44:46 PM EDT], closeTime=0}
      	at com.bigdata.rdf.sail.TestMROWTransactions.test_multiple_csem_transaction(TestMROWTransactions.java:382)
      Caused by: java.lang.RuntimeException: Could not commit index: name=kb.spo.POS, commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0}, DirtyListener{name=kb.spo.POS,needsCheckpoint}, DirtyListener{name=kb.spo.SPO,needsCheckpoint}]: lastRootBlock=rootBlock{ rootBlock=0, challisField=36, version=2, nextOffset=4273492468087, localTime=1309056345571 [Saturday, June 25, 2011 10:45:45 PM EDT], firstCommitTime=1309056286594 [Saturday, June 25, 2011 10:44:46 PM EDT], lastCommitTime=1309056345554 [Saturday, June 25, 2011 10:45:45 PM EDT], commitCounter=36, commitRecordAddr={off=NATIVE:-61228,len=422}, commitRecordIndexAddr={off=NATIVE:-8566,len=220}, quorumToken=0, metaBitsAddr=63635980342, metaStartAddr=1376, storeType=RW, uuid=80833dc5-f461-4f66-8ffd-54298387ded9, offsetBits=42, checksum=-1789582556, createTime=1309056286577 [Saturday, June 25, 2011 10:44:46 PM EDT], closeTime=0}
      	at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2252)
      	at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:74)
      	at com.bigdata.rdf.sail.TestMROWTransactions$1Writer.call(TestMROWTransactions.java:274)
      	at com.bigdata.rdf.sail.TestMROWTransactions$1Writer.call(TestMROWTransactions.java:247)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:680)
      Caused by: java.lang.RuntimeException: Could not commit index: name=kb.spo.POS, commitList=[DirtyListener{name=kb.spo.OSP,checkpointAddr=0}, DirtyListener{name=kb.spo.POS,needsCheckpoint}, DirtyListener{name=kb.spo.SPO,needsCheckpoint}]
      	at com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:571)
      	at com.bigdata.journal.AbstractJournal.notifyCommitters(AbstractJournal.java:1983)
      	at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:2336)
      	at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2250)
      Caused by: java.lang.AssertionError
      	at com.bigdata.btree.AbstractBTree.writeNodeOrLeaf(AbstractBTree.java:3561)
      	at com.bigdata.btree.AbstractBTree.writeNodeRecursive(AbstractBTree.java:3485)
      	at com.bigdata.btree.BTree.flush(BTree.java:770)
      	at com.bigdata.btree.BTree.writeCheckpoint2(BTree.java:910)
      	at com.bigdata.btree.BTree.writeCheckpoint(BTree.java:881)
      	at com.bigdata.btree.BTree.handleCommit(BTree.java:1210)
      	at com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:563) 
      

      See BLZG-933 (AssertionError: Child does not have persistent identity) which is pretty much a duplicate of this.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              bryanthompson bryanthompson
              Reporter:
              bryanthompson bryanthompson
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: