Blazegraph (by SYSTAP) / BLZG-291

ClientService fails to make progress during bulk data load

    Details

      Description

      A problem has been identified where some of the client services may fail to make progress during a bulk data load. The symptom is that toldTriplesRestartSafeCount goes flat at some point for the affected client, while the other client(s) may continue to process documents.

      The stuck client will continue to make progress against some shards and will continue to parse documents, but it will be unable to reach completion on most (or all) documents because some threads are blocked awaiting the inner ReentrantLock on the documentRestartSafeLatch for some document. The stack trace passes through handleChunk() and into Latch.dec(), as can be seen below. Since the stack trace passes through handleChunk(), the client is unable to write any further data on the shard associated with that chunk until the lock is obtained. However, based on an examination of thread dumps, there is no thread holding the lock.
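To illustrate why a thread parked in Latch.dec() stalls its shard, here is a minimal sketch of a counting latch guarded by an inner ReentrantLock. It is not the source of com.bigdata.util.concurrent.Latch: the class name LatchSketch and all of its methods are assumptions made for illustration, based only on the stack trace showing dec() calling ReentrantLock.lock().

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical counting latch; NOT the actual Blazegraph Latch implementation. */
public class LatchSketch {

    // Inner lock guarding the counter (assumed design, inferred from the stack trace).
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition zero = lock.newCondition();
    private long count;

    /** Increments the counter and returns the new value. */
    public long inc() {
        lock.lock();
        try {
            return ++count;
        } finally {
            lock.unlock();
        }
    }

    /** Decrements the counter and returns the new value. */
    public long dec() {
        // A caller that never acquires this lock never returns from dec(),
        // so whatever task invoked it cannot move on to its next chunk.
        lock.lock();
        try {
            if (count == 0) {
                throw new IllegalStateException("counter is already zero");
            }
            long c = --count;
            if (c == 0) {
                zero.signalAll(); // wake threads waiting for completion
            }
            return c;
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until the counter reaches zero. */
    public void await() throws InterruptedException {
        lock.lock();
        try {
            while (count != 0) {
                zero.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Returns the current counter value. */
    public long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LatchSketch latch = new LatchSketch();
        latch.inc();
        latch.inc();
        System.out.println("after first dec: " + latch.dec());
        System.out.println("after second dec: " + latch.dec());
    }
}
```

In a design like this the lock is held only for a few instructions, so dec() normally returns at once. A thread that stays parked inside lock() while no other thread owns the lock therefore points at the locking machinery itself rather than at contention in the application.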

      The problem appears to be related to a JVM bug, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370. Although that bug is resolved in JDK1.6.0_18, that release has problems of its own which lead to segfaults. The bug report does specify a workaround, which is to pass "-XX:+UseMembar" as a JVM parameter.
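Because the workaround depends on every client JVM actually being started with the flag, a startup check can help catch a misconfigured service. The following is a minimal sketch, assuming the check is run inside the client JVM; the class name UseMembarCheck and the method hasUseMembar are hypothetical. It inspects only the command-line arguments, not the effective value of the option inside HotSpot.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical startup check; not part of Blazegraph. */
public class UseMembarCheck {

    /**
     * Returns true if the given JVM arguments enable UseMembar.
     * Assumes that when the option is repeated, the last occurrence wins.
     */
    public static boolean hasUseMembar(List<String> jvmArgs) {
        boolean enabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+UseMembar")) {
                enabled = true;
            } else if (arg.equals("-XX:-UseMembar")) {
                enabled = false;
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main class).
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasUseMembar(jvmArgs)) {
            System.out.println("-XX:+UseMembar is set on the command line");
        } else {
            System.err.println("WARNING: -XX:+UseMembar is not set; "
                    + "this JVM may be exposed to the lost-wakeup problem");
        }
    }
}
```

The flag itself goes on the java command line before the main class, alongside any other JVM options used to start the client service.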

      "com.bigdata.service.jini.JiniFederation.executorService424" daemon prio=10 tid=0x00002ab07366b800 nid=0x724d waiting on condition [0x0000000060188000]
      java.lang.Thread.State: WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)
      - parking to wait for <0x00002aaab43b4cf0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
      at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
      at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
      at com.bigdata.util.concurrent.Latch.dec(Latch.java:206)
      at com.bigdata.service.ndx.pipeline.KVOC.done(KVOC.java:60)
      at com.bigdata.service.ndx.pipeline.IndexPartitionWriteTask.handleChunk(IndexPartitionWriteTask.java:285)
      at com.bigdata.service.ndx.pipeline.IndexPartitionWriteTask.handleChunk(IndexPartitionWriteTask.java:53)
      at com.bigdata.service.ndx.pipeline.AbstractSubtask.call(AbstractSubtask.java:182)
      at com.bigdata.service.ndx.pipeline.AbstractSubtask.call(AbstractSubtask.java:66)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:619)
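The observation that threads are parked on a ReentrantLock which no thread holds can also be checked programmatically rather than by reading thread dumps by hand. The sketch below uses the standard java.lang.management API; the class name OwnerlessLockWaiters and the method find are hypothetical, and the check is a heuristic: a waiter can legitimately appear ownerless for an instant between an unlock and its own wakeup, so only threads that show up in repeated samples are suspicious.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Blazegraph. */
public class OwnerlessLockWaiters {

    /**
     * Returns the threads that are parked on a ReentrantLock sync object
     * for which the JVM reports no owning thread.
     */
    public static List<ThreadInfo> find(ThreadInfo[] infos) {
        List<ThreadInfo> suspects = new ArrayList<ThreadInfo>();
        for (ThreadInfo ti : infos) {
            if (ti == null) {
                continue; // thread terminated while the dump was taken
            }
            LockInfo lock = ti.getLockInfo();
            if (lock == null) {
                continue; // not blocked or parked on anything
            }
            Thread.State state = ti.getThreadState();
            boolean parked = state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING;
            // Matches ReentrantLock$NonfairSync and ReentrantLock$FairSync.
            boolean onReentrantLock = lock.getClassName()
                    .startsWith("java.util.concurrent.locks.ReentrantLock$");
            // getLockOwnerId() returns -1 when no thread owns the lock.
            if (parked && onReentrantLock && ti.getLockOwnerId() == -1) {
                suspects.add(ti);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos =
                mx.dumpAllThreads(false, mx.isSynchronizerUsageSupported());
        for (ThreadInfo ti : find(infos)) {
            System.out.println(ti.getThreadName() + " is parked on "
                    + ti.getLockName() + " with no owner reported");
        }
    }
}
```

Run against the stuck client, a thread such as the executorService one shown above would be expected to appear in every sample if the lock is truly unowned.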

        Activity

        beebs Brad Bebee created issue -
        bryanthompson bryanthompson added a comment -

        Initial testing using "-XX:+UseMembar" as a workaround looks positive. This issue is closed pending further developments.

        beebs Brad Bebee made changes -

        Field        Original Value             New Value
        Workflow     Trac Import v2 [ 12127 ]   Trac Import v3 [ 13730 ]
        Workflow     Trac Import v3 [ 13730 ]   Trac Import v4 [ 15059 ]
        Workflow     Trac Import v4 [ 15059 ]   Trac Import v5 [ 16445 ]
        Labels                                  Issue_patch_20150625
        Status       Closed - Won't Fix [ 6 ]   Open [ 1 ]
        Status       Open [ 1 ]                 Accepted [ 10101 ]
        Status       Accepted [ 10101 ]         In Progress [ 3 ]
        Status       In Progress [ 3 ]          Resolved [ 5 ]
        Status       Resolved [ 5 ]             In Review [ 10100 ]
        Resolution   Fixed [ 1 ]                Done [ 10000 ]
        Status       In Review [ 10100 ]        Done [ 10000 ]
        Workflow     Trac Import v5 [ 16445 ]   Trac Import v6 [ 17701 ]
        Workflow     Trac Import v6 [ 17701 ]   Trac Import v7 [ 19098 ]
        Workflow     Trac Import v7 [ 19098 ]   Trac Import v8 [ 20719 ]

          People

          • Assignee: Unassigned
          • Reporter: bryanthompson