Uploaded image for project: 'Blazegraph (by SYSTAP)'
  1. Blazegraph (by SYSTAP)
  2. BLZG-1172

ClocksNotSynchronizedException (HA, GROUP_COMMIT)

    Details

      Description

      The ClocksNotSynchronizedException is not being propagated in a manner that allows the test to pass. The exception is correctly detected by the server, but it now appears in an AbstractTask / Group Commit context.

      See TestHAJournalServerOverride.testStartABC_releaseTimeConsensusProtocol_clockSkew

      junit.framework.AssertionFailedError: Expecting class com.bigdata.util.ClocksNotSynchronizedException
      	at junit.framework.TestCase2.fail(TestCase2.java:90)
      	at com.bigdata.journal.jini.ha.TestHAJournalServerOverride.testStartABC_releaseTimeConsensusProtocol_clockSkew(TestHAJournalServerOverride.java:278)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at junit.framework.TestCase.runTest(TestCase.java:176)
      	at junit.framework.TestCase.runBare(TestCase.java:141)
      	at junit.framework.TestResult$1.protect(TestResult.java:122)
      	at junit.framework.TestResult.runProtected(TestResult.java:142)
      	at junit.framework.TestResult.run(TestResult.java:125)
      	at junit.framework.TestCase.run(TestCase.java:129)
      	at junit.framework.TestSuite.runTest(TestSuite.java:255)
      	at junit.framework.TestSuite.run(TestSuite.java:250)
      	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
      	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
      	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
      Caused by: com.bigdata.rdf.sail.webapp.client.HttpException: Status Code=500, Status Line=Server Error, Response=<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&BLZG-251;html;charset=UTF-8"><title>bigdata&BLZG-347;</title
      ></head
      ><body><pre>Drop ALL-GRAPHS ALL-SOLUTIONS
      </pre
      ><p>totalElapsed=83ms, elapsed=11ms</p
      ><hr><pre>InsertData
        &lt;http:&BLZG-251;&BLZG-251;example&BLZG-251;book1, http:&BLZG-251;&BLZG-251;purl.org&BLZG-251;dc&BLZG-251;elements&BLZG-251;1.1&BLZG-251;title, &BLZG-240;A new book&BLZG-240;&gt;&lt;http:&BLZG-251;&BLZG-251;example&BLZG-251;book1, http:&BLZG-251;&BLZG-251;purl.org&BLZG-251;dc&BLZG-251;elements&BLZG-251;1.1&BLZG-251;creator, &BLZG-240;A.N.Other&BLZG-240;&gt;
      </pre
      ><p>totalElapsed=91ms, elapsed=2ms</p
      ><hr><p>COMMIT: totalElapsed=109ms, commitTime=0, mutationCount=2</p
      ></body
      ></html
      >SPARQL-UPDATE: updateStr=DROP ALL;
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      INSERT DATA {
        <http://example/book1> dc:title "A new book" ;
        dc:creator "A.N.Other" .
      }
      
      java.util.concurrent.ExecutionException: java.lang.RuntimeException: Commit failed: Task{com.bigdata.rdf.task.ApiTaskForJournal,timestamp=unisolated,resource=[kb]}::{delegate=com.bigdata.rdf.sail.webapp.QueryServlet$SparqlUpdateTask{namespace=kb,timestamp=0, updateStr=[DROP ALL;
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      INSERT DATA {
        <http://example/book1> dc:title "A new book" ;
        dc:creator "A.N.Other" .
      }
      ]}}
      	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
      	at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:260)
      	at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlUpdate(QueryServlet.java:359)
      	at com.bigdata.rdf.sail.webapp.QueryServlet.doPost(QueryServlet.java:165)
      	at com.bigdata.rdf.sail.webapp.RESTServlet.doPost(RESTServlet.java:237)
      	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
      	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
      	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
      	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
      	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
      	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
      	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
      	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
      	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
      	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
      	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
      	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
      	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
      	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
      	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
      	at org.eclipse.jetty.server.Server.handle(Server.java:497)
      	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
      	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
      	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
      	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)
      	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.RuntimeException: Commit failed: Task{com.bigdata.rdf.task.ApiTaskForJournal,timestamp=unisolated,resource=[kb]}::{delegate=com.bigdata.rdf.sail.webapp.QueryServlet$SparqlUpdateTask{namespace=kb,timestamp=0, updateStr=[DROP ALL;
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      INSERT DATA {
        <http://example/book1> dc:title "A new book" ;
        dc:creator "A.N.Other" .
      }
      ]}}
      	at com.bigdata.journal.WriteExecutorService.afterTask(WriteExecutorService.java:956)
      	at com.bigdata.journal.AbstractTask.doUnisolatedReadWriteTask(AbstractTask.java:2138)
      	at com.bigdata.journal.AbstractTask.call2(AbstractTask.java:2029)
      	at com.bigdata.journal.AbstractTask.call(AbstractTask.java:1895)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
      	at com.bigdata.concurrent.NonBlockingLockManagerWithNewDesign$LockFutureTask.run(NonBlockingLockManagerWithNewDesign.java:1984)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	... 1 more
      
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository.checkResponseCode(RemoteRepository.java:1790)
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$SparqlUpdate.evaluate(RemoteRepository.java:1375)
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$SparqlUpdate.evaluate(RemoteRepository.java:1360)
      	at com.bigdata.journal.jini.ha.AbstractHA3JournalServerTestCase.simpleTransaction_noQuorumCheck(AbstractHA3JournalServerTestCase.java:2893)
      	at com.bigdata.journal.jini.ha.AbstractHA3JournalServerTestCase.simpleTransaction_noQuorumCheck(AbstractHA3JournalServerTestCase.java:2843)
      	at com.bigdata.journal.jini.ha.AbstractHA3JournalServerTestCase.simpleTransaction(AbstractHA3JournalServerTestCase.java:2808)
      	at com.bigdata.journal.jini.ha.TestHAJournalServerOverride.testStartABC_releaseTimeConsensusProtocol_clockSkew(TestHAJournalServerOverride.java:254)
      	... 19 more
      
      

        Activity

        Hide
        bryanthompson bryanthompson added a comment -

        This test is causing the actual commit to fail. The WriteExcecutorService probably needs to be smarter / kinder about the firstCause message when the ACID prepare/commit operation fails. This will cause all tasks in the commit group to fail. They should receive better information about the root cause. That would probably allow the client to see the ClocksNotSynchronizedException and let this test pass.

        Show
        bryanthompson bryanthompson added a comment - This test is causing the actual commit to fail. The WriteExcecutorService probably needs to be smarter / kinder about the firstCause message when the ACID prepare/commit operation fails. This will cause all tasks in the commit group to fail. They should receive better information about the root cause. That would probably allow the client to see the ClocksNotSynchronizedException and let this test pass.
        Hide
        bryanthompson bryanthompson added a comment -

        Here is the exception on the leader when the ACID 2-Phase commit fails due to clock skew (as mocked by the test):

        ERROR: 09:16:14,412 4889   com.bigdata.rdf.task.ApiTaskForJournal 0  com.bigdata.journal.ConcurrencyManager.writeService3 com.bigdata.journal.AbstractJournal$CommitState.gatherPhase(AbstractJournal.java:3679): java.lang.RuntimeException: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms.
        java.lang.RuntimeException: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms.
        	at com.bigdata.journal.Journal$InnerJournalTransactionService.updateReleaseTimeConsensus(Journal.java:1403)
        	at com.bigdata.journal.AbstractJournal$CommitState.gatherPhase(AbstractJournal.java:3671)
        	at com.bigdata.journal.AbstractJournal$CommitState.access$1(AbstractJournal.java:3618)
        	at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4055)
        	at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3084)
        	at com.bigdata.journal.WriteExecutorService.commit(WriteExecutorService.java:2395)
        	at com.bigdata.journal.WriteExecutorService.groupCommit(WriteExecutorService.java:1470)
        	at com.bigdata.journal.WriteExecutorService.afterTask(WriteExecutorService.java:936)
        	at com.bigdata.journal.AbstractTask.doUnisolatedReadWriteTask(AbstractTask.java:2138)
        	at com.bigdata.journal.AbstractTask.call2(AbstractTask.java:2029)
        	at com.bigdata.journal.AbstractTask.call(AbstractTask.java:1895)
        	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        	at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
        	at com.bigdata.concurrent.NonBlockingLockManagerWithNewDesign$LockFutureTask.run(NonBlockingLockManagerWithNewDesign.java:1984)
        	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        	at java.lang.Thread.run(Thread.java:745)
        Caused by: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms.
        	at com.bigdata.util.ClocksNotSynchronizedException.assertBefore(ClocksNotSynchronizedException.java:107)
        	at com.bigdata.journal.AbstractJournal.assertBefore(AbstractJournal.java:1843)
        	at com.bigdata.journal.Journal$BarrierState.run(Journal.java:637)
        	at java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:213)
        	at java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:355)
        	at com.bigdata.journal.Journal$InnerJournalTransactionService.notifyEarliestCommitTime(Journal.java:2055)
        	at com.bigdata.journal.AbstractJournal$BasicHA.notifyEarliestCommitTime(AbstractJournal.java:8470)
        	at com.bigdata.journal.jini.ha.HAJournalTest$HAGlueTestImpl.notifyEarliestCommitTime(HAJournalTest.java:994)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        	at java.lang.reflect.Method.invoke(Method.java:606)
        	at net.jini.jeri.BasicInvocationDispatcher.invoke(BasicInvocationDispatcher.java:1126)
        	at net.jini.jeri.BasicInvocationDispatcher.dispatch(BasicInvocationDispatcher.java:608)
        	at com.sun.jini.jeri.internal.runtime.Target$2.run(Target.java:487)
        	at net.jini.export.ServerContext.doWithServerContext(ServerContext.java:108)
        	at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:484)
        	at com.sun.jini.jeri.internal.runtime.Target.access$000(Target.java:57)
        	at com.sun.jini.jeri.internal.runtime.Target$1.run(Target.java:464)
        	at java.security.AccessController.doPrivileged(Native Method)
        	at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:461)
        	at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:426)
        	at com.sun.jini.jeri.internal.runtime.DgcRequestDispatcher.dispatch(DgcRequestDispatcher.java:210)
        	at net.jini.jeri.connection.ServerConnectionManager$Dispatcher.dispatch(ServerConnectionManager.java:147)
        	at com.sun.jini.jeri.internal.mux.MuxServer$1$1.run(MuxServer.java:244)
        	at java.security.AccessController.doPrivileged(Native Method)
        	at com.sun.jini.jeri.internal.mux.MuxServer$1.run(MuxServer.java:241)
        	at com.sun.jini.thread.ThreadPool$Worker.run(ThreadPool.java:136)
        	... 1 more
        

        You can clearly see that the ClocksNotSynchronizedException is being thrown inside of the group commit.

        	at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3084)
        	at com.bigdata.journal.WriteExecutorService.commit(WriteExecutorService.java:2395)
        	at com.bigdata.journal.WriteExecutorService.groupCommit(WriteExecutorService.java:1470)
        	at com.bigdata.journal.WriteExecutorService.afterTask(WriteExecutorService.java:936)
        

        We just need to set that cause on the tasks in the commit group.

        Show
        bryanthompson bryanthompson added a comment - Here is the exception on the leader when the ACID 2-Phase commit fails due to clock skew (as mocked by the test): ERROR: 09:16:14,412 4889 com.bigdata.rdf.task.ApiTaskForJournal 0 com.bigdata.journal.ConcurrencyManager.writeService3 com.bigdata.journal.AbstractJournal$CommitState.gatherPhase(AbstractJournal.java:3679): java.lang.RuntimeException: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms. java.lang.RuntimeException: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms. at com.bigdata.journal.Journal$InnerJournalTransactionService.updateReleaseTimeConsensus(Journal.java:1403) at com.bigdata.journal.AbstractJournal$CommitState.gatherPhase(AbstractJournal.java:3671) at com.bigdata.journal.AbstractJournal$CommitState.access$1(AbstractJournal.java:3618) at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4055) at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3084) at com.bigdata.journal.WriteExecutorService.commit(WriteExecutorService.java:2395) at com.bigdata.journal.WriteExecutorService.groupCommit(WriteExecutorService.java:1470) at com.bigdata.journal.WriteExecutorService.afterTask(WriteExecutorService.java:936) at com.bigdata.journal.AbstractTask.doUnisolatedReadWriteTask(AbstractTask.java:2138) at com.bigdata.journal.AbstractTask.call2(AbstractTask.java:2029) at com.bigdata.journal.AbstractTask.call(AbstractTask.java:1895) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63) at com.bigdata.concurrent.NonBlockingLockManagerWithNewDesign$LockFutureTask.run(NonBlockingLockManagerWithNewDesign.java:1984) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: com.bigdata.util.ClocksNotSynchronizedException: service1=7c022b8e-ab4d-4a80-984f-b35581a81842, serviceId2=a0c25ca3-4eaa-439e-9126-fbb50dc52f28, skew=1425219374394ms exceeds maximumSkew=5000ms. at com.bigdata.util.ClocksNotSynchronizedException.assertBefore(ClocksNotSynchronizedException.java:107) at com.bigdata.journal.AbstractJournal.assertBefore(AbstractJournal.java:1843) at com.bigdata.journal.Journal$BarrierState.run(Journal.java:637) at java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:213) at java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:355) at com.bigdata.journal.Journal$InnerJournalTransactionService.notifyEarliestCommitTime(Journal.java:2055) at com.bigdata.journal.AbstractJournal$BasicHA.notifyEarliestCommitTime(AbstractJournal.java:8470) at com.bigdata.journal.jini.ha.HAJournalTest$HAGlueTestImpl.notifyEarliestCommitTime(HAJournalTest.java:994) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at net.jini.jeri.BasicInvocationDispatcher.invoke(BasicInvocationDispatcher.java:1126) at net.jini.jeri.BasicInvocationDispatcher.dispatch(BasicInvocationDispatcher.java:608) at com.sun.jini.jeri.internal.runtime.Target$2.run(Target.java:487) at net.jini.export.ServerContext.doWithServerContext(ServerContext.java:108) at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:484) at com.sun.jini.jeri.internal.runtime.Target.access$000(Target.java:57) at com.sun.jini.jeri.internal.runtime.Target$1.run(Target.java:464) at java.security.AccessController.doPrivileged(Native Method) at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:461) at com.sun.jini.jeri.internal.runtime.Target.dispatch(Target.java:426) at com.sun.jini.jeri.internal.runtime.DgcRequestDispatcher.dispatch(DgcRequestDispatcher.java:210) at net.jini.jeri.connection.ServerConnectionManager$Dispatcher.dispatch(ServerConnectionManager.java:147) at com.sun.jini.jeri.internal.mux.MuxServer$1$1.run(MuxServer.java:244) at java.security.AccessController.doPrivileged(Native Method) at com.sun.jini.jeri.internal.mux.MuxServer$1.run(MuxServer.java:241) at com.sun.jini.thread.ThreadPool$Worker.run(ThreadPool.java:136) ... 1 more You can clearly see that the ClocksNotSynchronizedException is being thrown inside of the group commit. at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3084) at com.bigdata.journal.WriteExecutorService.commit(WriteExecutorService.java:2395) at com.bigdata.journal.WriteExecutorService.groupCommit(WriteExecutorService.java:1470) at com.bigdata.journal.WriteExecutorService.afterTask(WriteExecutorService.java:936) We just need to set that cause on the tasks in the commit group.
        Hide
        bryanthompson bryanthompson added a comment -

        Fix for BLZG-1172 (group commit must report first cause)?Added logic to report the first cause if the commit() itself fails during processing of a commit group. This fixes the HA test for clock skew.

        Commit 88a7a99f909ef4a8c68150d695f77138100d2b32

        Show
        bryanthompson bryanthompson added a comment - Fix for BLZG-1172 (group commit must report first cause)?Added logic to report the first cause if the commit() itself fails during processing of a commit group. This fixes the HA test for clock skew. Commit 88a7a99f909ef4a8c68150d695f77138100d2b32

          People

          • Assignee:
            bryanthompson bryanthompson
            Reporter:
            bryanthompson bryanthompson
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: