Blazegraph (by SYSTAP) / BLZG-271

Concurrency problem when dirty B+Tree nodes are evicted.

    Details

      Description

      A concurrency problem has been observed with the indices for the LocalTripleStore when running against the journal. The underlying problem is that Node#getChild(int) can now execute in multiple threads for different requests due to the combination of the Memoizer pattern and the UnisolatedReadWriteIndex pattern. This means that touches can drive eviction from the B+Tree's write retention queue in more than one thread. Since eviction of dirty nodes drives IO, this can result in concurrent eviction of a dirty node and its child.
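
      The Memoizer pattern mentioned above can be sketched as follows. This is a minimal illustration assuming the well-known FutureTask-based memoizer, which matches the shape of the Memoizer.compute and FutureTask.run frames in the stack trace below; MemoizerSketch and its loader function are hypothetical names, not the com.bigdata.util.concurrent.Memoizer source. The point it shows: the caller that wins the race for a given key runs the load in its own thread, so requests for different children can run their loads (and any touch() those loads trigger) concurrently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

/** Hypothetical sketch of a FutureTask-based memoizer (not the bigdata class). */
class MemoizerSketch<A, V> {

    private final ConcurrentMap<A, Future<V>> cache = new ConcurrentHashMap<>();

    /** Stand-in for the expensive load, e.g. reading a child node from the store. */
    private final Function<A, V> loader;

    MemoizerSketch(Function<A, V> loader) {
        this.loader = loader;
    }

    V compute(A arg) {
        Future<V> f = cache.get(arg);
        if (f == null) {
            FutureTask<V> task = new FutureTask<>(() -> loader.apply(arg));
            f = cache.putIfAbsent(arg, task);
            if (f == null) {
                // This caller won the race for this key: it runs the load
                // itself, in its own thread. Callers for OTHER keys are not
                // blocked and may be running their own loads right now.
                f = task;
                task.run();
            }
        }
        try {
            // Callers that lost the race for the same key wait here.
            return f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

      Each key is loaded at most once, but nothing serializes loads across keys. Any side effect of the load that mutates shared state therefore needs its own synchronization.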

      To close this concurrency hole we need to (a) ensure that AbstractBTree#touch(...) uses appropriate synchronization for a mutable B+Tree; and (b) use either a Memoizer pattern or a single thread to handle eviction of dirty nodes. I have not yet analyzed the problem enough to verify whether it could instead be handled by synchronizing on the BTree object in writeNodeOrLeaf(), which would be a simpler fix. I have also not verified whether touch() can in fact perform an unsynchronized update for a mutable BTree.
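
      As a rough illustration of option (a), the sketch below models a bounded write retention queue whose touch() holds a single monitor while it appends and, if the queue is full, evicts. RetentionQueueSketch and SketchNode are hypothetical stand-ins, not the bigdata classes, and the model ignores details of the real queue such as a node appearing more than once. It only shows that when every touch takes the same lock, the eviction-driven write of a dirty node cannot run in two threads at once.

```java
import java.util.ArrayDeque;
import java.util.stream.IntStream;

/** Hypothetical model of a mutable B+Tree's bounded write retention queue. */
class RetentionQueueSketch {

    /** Hypothetical node: only tracks whether it still needs to be written. */
    static final class SketchNode {
        boolean dirty = true;
    }

    private final ArrayDeque<SketchNode> queue = new ArrayDeque<>();
    private final int capacity;

    // Instrumentation for the sketch; all fields are guarded by "this".
    private int activeWriters = 0;
    private int maxConcurrentWriters = 0;
    private int writes = 0;

    RetentionQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Appends the node to the queue, evicting the oldest entry when the queue
     * is full. Because the method is synchronized, the eviction (and the
     * write it drives) is serialized across all calling threads.
     */
    synchronized void touch(SketchNode node) {
        if (queue.size() == capacity) {
            evict(queue.removeFirst());
        }
        queue.addLast(node);
    }

    /** Called only from touch(), i.e. with the monitor held. */
    private void evict(SketchNode node) {
        if (!node.dirty) {
            return;
        }
        activeWriters++;
        maxConcurrentWriters = Math.max(maxConcurrentWriters, activeWriters);
        node.dirty = false; // placeholder for serializing and writing the node
        writes++;
        activeWriters--;
    }

    synchronized int writes() { return writes; }

    synchronized int queued() { return queue.size(); }

    synchronized int maxConcurrentWriters() { return maxConcurrentWriters; }

    public static void main(String[] args) {
        RetentionQueueSketch tree = new RetentionQueueSketch(4);
        // Many threads touch freshly created (dirty) nodes concurrently.
        IntStream.range(0, 1000).parallel()
                .forEach(i -> tree.touch(new SketchNode()));
        System.out.println("writes=" + tree.writes()
                + " maxConcurrentWriters=" + tree.maxConcurrentWriters());
    }
}
```

      Holding one lock across the write is the simplest choice but it also serializes the IO. Handing evicted nodes to a single writer thread, as in option (b), is the alternative if that cost matters.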

      A stack trace which demonstrates this problem follows. The stack trace showed up in a regression of the truth maintenance stress test. The error is only occasionally observed.

      junit.framework.AssertionFailedError: Not expecting: java.lang.IllegalStateException
      	at com.bigdata.rdf.rules.TestTruthMaintenance.test_stress(TestTruthMaintenance.java:1011)
      Caused by: java.lang.IllegalStateException
      	at com.bigdata.btree.NodeSerializer.encodeLive(NodeSerializer.java:423)
      	at com.bigdata.btree.AbstractBTree.writeNodeOrLeaf(AbstractBTree.java:3630)
      	at com.bigdata.btree.AbstractBTree.writeNodeRecursive(AbstractBTree.java:3477)
      	at com.bigdata.btree.DefaultEvictionListener.evicted(DefaultEvictionListener.java:102)
      	at com.bigdata.btree.DefaultEvictionListener.evicted(DefaultEvictionListener.java:38)
      	at com.bigdata.cache.HardReferenceQueue.evict(HardReferenceQueue.java:226)
      	at com.bigdata.cache.HardReferenceQueue.beforeOffer(HardReferenceQueue.java:199)
      	at com.bigdata.cache.RingBuffer.add(RingBuffer.java:159)
      	at com.bigdata.cache.HardReferenceQueue.add(HardReferenceQueue.java:176)
      	at com.bigdata.btree.AbstractBTree.doTouch(AbstractBTree.java:3365)
      	at com.bigdata.btree.AbstractBTree.touch(AbstractBTree.java:3331)
      	at com.bigdata.btree.AbstractNode.<init>(AbstractNode.java:297)
      	at com.bigdata.btree.Leaf.<init>(Leaf.java:266)
      	at com.bigdata.btree.BTree$NodeFactory.allocLeaf(BTree.java:1636)
      	at com.bigdata.btree.AbstractBTree.readNodeOrLeaf(AbstractBTree.java:3772)
      	at com.bigdata.btree.Node._getChild(Node.java:2653)
      	at com.bigdata.btree.AbstractBTree$1.compute(AbstractBTree.java:340)
      	at com.bigdata.btree.AbstractBTree$1.compute(AbstractBTree.java:325)
      	at com.bigdata.util.concurrent.Memoizer$1.call(Memoizer.java:79)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at com.bigdata.util.concurrent.Memoizer.compute(Memoizer.java:87)
      	at com.bigdata.btree.AbstractBTree.loadChild(AbstractBTree.java:440)
      	at com.bigdata.btree.Node.getChild(Node.java:2564)
      	at com.bigdata.btree.ChildIterator.next(ChildIterator.java:163)
      	at com.bigdata.btree.ChildIterator.next(ChildIterator.java:37)
      	at cutthecrap.utils.striterators.Expanderator.hasNext(Expanderator.java:59)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Appenderator.hasNext(Appenderator.java:52)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Expanderator.hasNext(Expanderator.java:56)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Appenderator.hasNext(Appenderator.java:52)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Expanderator.hasNext(Expanderator.java:56)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Appenderator.hasNext(Appenderator.java:52)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at cutthecrap.utils.striterators.Expanderator.hasNext(Expanderator.java:58)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at com.bigdata.btree.AbstractNode$PostOrderEntryIterator.hasNext(AbstractNode.java:633)
      	at com.bigdata.btree.ResultSet.<init>(ResultSet.java:1059)
      	at com.bigdata.btree.ChunkedLocalRangeIterator.getResultSet(ChunkedLocalRangeIterator.java:140)
      	at com.bigdata.btree.UnisolatedReadWriteIndex$ChunkedIterator.getResultSet(UnisolatedReadWriteIndex.java:720)
      	at com.bigdata.btree.AbstractChunkedTupleIterator.rangeQuery(AbstractChunkedTupleIterator.java:304)
      	at com.bigdata.btree.AbstractChunkedTupleIterator.hasNext(AbstractChunkedTupleIterator.java:426)
      	at cutthecrap.utils.striterators.Resolverator.hasNext(Resolverator.java:48)
      	at cutthecrap.utils.striterators.Striterator.hasNext(Striterator.java:55)
      	at com.bigdata.relation.accesspath.AbstractAccessPath.synchronousIterator(AbstractAccessPath.java:965)
      	at com.bigdata.relation.accesspath.AbstractAccessPath.iterator(AbstractAccessPath.java:882)
      	at com.bigdata.relation.accesspath.AbstractAccessPath.iterator(AbstractAccessPath.java:612)
      	at com.bigdata.rdf.inf.Justification.isGrounded(Justification.java:724)
      	at com.bigdata.rdf.inf.Justification.isGrounded(Justification.java:845)
      	at com.bigdata.rdf.inf.Justification.isGrounded(Justification.java:774)
      	at com.bigdata.rdf.inf.Justification.isGrounded(Justification.java:643)
      	at com.bigdata.rdf.inf.TruthMaintenance.retractAll(TruthMaintenance.java:778)
      	at com.bigdata.rdf.inf.TruthMaintenance.retractAll(TruthMaintenance.java:967)
      	at com.bigdata.rdf.inf.TruthMaintenance.retractAll(TruthMaintenance.java:515)
      	at com.bigdata.rdf.rules.TestTruthMaintenance.retractAndAssert(TestTruthMaintenance.java:1144)
      	at com.bigdata.rdf.rules.TestTruthMaintenance.retractAndAssert(TestTruthMaintenance.java:1156)
      	at com.bigdata.rdf.rules.TestTruthMaintenance.retractAndAssert(TestTruthMaintenance.java:1156)
      	at com.bigdata.rdf.rules.TestTruthMaintenance.doStressTest(TestTruthMaintenance.java:1072)
      	at com.bigdata.rdf.rules.TestTruthMaintenance.test_stress(TestTruthMaintenance.java:1002)
      

          Activity

          bryanthompson added a comment:

          I believe that [1] actually documents why this issue cannot be observed (due to a synchronized(this) block for the code path associated with read or write operations against a mutable BTree).

          [1] https://sourceforge.net/apps/trac/bigdata/ticket/71

          bryanthompson added a comment:

          My bad. That should have been [1].

          [1] https://sourceforge.net/apps/trac/bigdata/ticket/201

          bryanthompson added a comment:

          I have verified that this issue no longer appears in CI. The issue is now closed.


            People

            • Assignee: bryanthompson
            • Reporter: bryanthompson