  1. Blazegraph (by SYSTAP)
  2. BLZG-520

IV ordering is not consistent with unsigned byte[] ordering of IV keys.

    Details

      Description

      This issue directly affects IVUtility.compare(), IV.compareTo(), and SPOKeyOrder.getComparator(). It indirectly affects several things which rely on those ordering semantics, including the efficiency of B+Tree ordered writes, the correctness of scattered reads or writes in scale-out, and the correctness of the Justification class.
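
      To illustrate the invariant at stake (an object-level comparator must agree with the unsigned byte[] ordering of the keys generated from those objects), here is a minimal sketch using plain long values. The class and method names (SignFlipKeyDemo, naiveKey, orderedKey, compareUnsigned) are hypothetical and are not the Blazegraph key-building API; the sign-bit flip is a common technique and is shown here only as an example of making the two orderings agree.

```java
import java.nio.ByteBuffer;

public class SignFlipKeyDemo {

    // Compare two keys as sequences of unsigned bytes (0..255),
    // shorter key first when one is a prefix of the other.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    // Naive big-endian encoding: negative values get a leading 0xFF byte
    // and therefore sort AFTER positive values in unsigned byte order.
    static byte[] naiveKey(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    // Flipping the sign bit makes unsigned byte order agree with Long.compare.
    static byte[] orderedKey(long v) {
        return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
    }

    public static void main(String[] args) {
        long x = -1L, y = 1L;
        // Object-level ordering: x < y.
        System.out.println("Long.compare < 0:   " + (Long.compare(x, y) < 0));
        // Naive keys disagree with the object-level ordering.
        System.out.println("naive keys < 0:     " + (compareUnsigned(naiveKey(x), naiveKey(y)) < 0));
        // Sign-flipped keys agree with the object-level ordering.
        System.out.println("ordered keys < 0:   " + (compareUnsigned(orderedKey(x), orderedKey(y)) < 0));
    }
}
```

      Any comparator that sorts by the object values while the index sorts by the naive keys would hand the index a batch that it considers out of order, which is the general shape of the problem described here.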

        Activity

        beebs Brad Bebee created issue -
        bryanthompson bryanthompson added a comment -

        The issue has been documented for some time, e.g., on IVUtility#compare(IV,IV). The issue may not cause errors on the Journal, but can lead to exceptions in scale-out due to the inability to correctly scatter keys. The following stack trace appears when attempting to validate the scale-out bulk loader against a quads mode data set. While we could probably work around the use of the SPOKeyOrder comparator in this case, this kind of exception could occur in online operations in scale-out.

        java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: keys out of order: i=4, lastKey=[128, 128, 0, 0, 0, 0, 0, 0, 0, 128, 224, 0, 0, 0, 0, 0, 0, 0, 161, 106, 128, 0, 0, 0, 0, 0, 0, 0, 0], key=[128, 32, 0, 0, 0, 0, 0, 0, 0, 128, 192, 0, 0, 0, 0, 0, 0, 0, 161, 25, 0, 144, 0, 0, 0, 0, 0, 0, 0]
        	at com.bigdata.rdf.rio.AbstractRIOTestCase.doVerify(AbstractRIOTestCase.java:495)
        	at com.bigdata.rdf.rio.TestAsynchronousStatementBufferFactory.doLoadAndVerifyTest(TestAsynchronousStatementBufferFactory.java:449)
        	at com.bigdata.rdf.rio.TestAsynchronousStatementBufferFactory.test_loadAndVerify_smallQuads_quadsMode(TestAsynchronousStatementBufferFactory.java:239)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at java.lang.reflect.Method.invoke(Method.java:597)
        	at junit.framework.TestCase.runTest(TestCase.java:154)
        	at junit.framework.TestCase.runBare(TestCase.java:127)
        	at junit.framework.TestResult$1.protect(TestResult.java:106)
        	at junit.framework.TestResult.runProtected(TestResult.java:124)
        	at junit.framework.TestResult.run(TestResult.java:109)
        	at junit.framework.TestCase.run(TestCase.java:118)
        	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
        	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
        Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: keys out of order: i=4, lastKey=[128, 128, 0, 0, 0, 0, 0, 0, 0, 128, 224, 0, 0, 0, 0, 0, 0, 0, 161, 106, 128, 0, 0, 0, 0, 0, 0, 0, 0], key=[128, 32, 0, 0, 0, 0, 0, 0, 0, 128, 192, 0, 0, 0, 0, 0, 0, 0, 161, 25, 0, 144, 0, 0, 0, 0, 0, 0, 0]
        	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        	at com.bigdata.rdf.rio.AbstractRIOTestCase.doVerify(AbstractRIOTestCase.java:477)
        	... 18 more
        Caused by: java.lang.IllegalArgumentException: keys out of order: i=4, lastKey=[128, 128, 0, 0, 0, 0, 0, 0, 0, 128, 224, 0, 0, 0, 0, 0, 0, 0, 161, 106, 128, 0, 0, 0, 0, 0, 0, 0, 0], key=[128, 32, 0, 0, 0, 0, 0, 0, 0, 128, 192, 0, 0, 0, 0, 0, 0, 0, 161, 25, 0, 144, 0, 0, 0, 0, 0, 0, 0]
        	at com.bigdata.service.ndx.AbstractSplitter.isValidSplit(AbstractSplitter.java:349)
        	at com.bigdata.service.ndx.AbstractSplitter.splitKeys(AbstractSplitter.java:190)
        	at com.bigdata.service.ndx.ClientIndexView.splitKeys(ClientIndexView.java:1860)
        	at com.bigdata.service.ndx.ClientIndexView.submit(ClientIndexView.java:1549)
        	at com.bigdata.service.ndx.ClientIndexView.submit(ClientIndexView.java:1498)
        	at com.bigdata.rdf.store.AbstractTestCase.assertSameStatements(AbstractTestCase.java:757)
        	at com.bigdata.rdf.store.AbstractTestCase.assertStatementIndicesConsistent(AbstractTestCase.java:613)
        	at com.bigdata.rdf.rio.AbstractRIOTestCase$VerifyTask.verify(AbstractRIOTestCase.java:694)
        	at com.bigdata.rdf.rio.AbstractRIOTestCase$VerifyTask.call(AbstractRIOTestCase.java:621)
        	at com.bigdata.rdf.rio.AbstractRIOTestCase$VerifyTask.call(AbstractRIOTestCase.java:1)
        	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        	at java.lang.Thread.run(Thread.java:619)
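
        For reference, the two keys from the exception message can be checked by hand. The sketch below is not the AbstractSplitter code; KeyOrderCheck and its methods are hypothetical names used only to show that, under unsigned byte[] ordering, key sorts before lastKey (they first differ at byte index 1, 32 versus 128), while a comparison on Java's signed bytes would put them the other way around.

```java
public class KeyOrderCheck {

    // Build a byte[] from values written as unsigned ints (0..255), as in the log message.
    static byte[] key(int... unsigned) {
        byte[] b = new byte[unsigned.length];
        for (int i = 0; i < unsigned.length; i++) b[i] = (byte) unsigned[i];
        return b;
    }

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    // Lexicographic comparison on Java's signed byte values (shown for contrast).
    static int compareSigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Byte.compare(a[i], b[i]);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    // Index of the first key that sorts before its predecessor, or -1 if none does.
    static int firstOutOfOrder(byte[][] keys) {
        for (int i = 1; i < keys.length; i++) {
            if (compareUnsigned(keys[i - 1], keys[i]) > 0) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] lastKey = key(128, 128, 0, 0, 0, 0, 0, 0, 0, 128, 224, 0, 0, 0, 0, 0, 0, 0,
                161, 106, 128, 0, 0, 0, 0, 0, 0, 0, 0);
        byte[] key = key(128, 32, 0, 0, 0, 0, 0, 0, 0, 128, 192, 0, 0, 0, 0, 0, 0, 0,
                161, 25, 0, 144, 0, 0, 0, 0, 0, 0, 0);
        System.out.println("unsigned: lastKey > key? " + (compareUnsigned(lastKey, key) > 0));
        System.out.println("signed:   lastKey < key? " + (compareSigned(lastKey, key) < 0));
        System.out.println("first out-of-order index: " + firstOutOfOrder(new byte[][] { lastKey, key }));
    }
}
```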
        
        bryanthompson bryanthompson added a comment -

        Bug fixes to the xsd unsigned IV classes.

        Extensions of the IV test suite.

        Comparator failures are still reported for IVs which use fully inlined Unicode data (whether bnodes, URIs, or literals). I need to continue to look into this. The root cause is most likely that the unsigned byte[] keys are generated based on IVUtility.un (a UnicodeHelper class) while the IV implementations are using String.compareTo(String). These two are not guaranteed to produce the same ordering. The IV implementations which use fully inline Unicode data should probably use a different method in their _compare() implementations.
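
        A small, self-contained illustration of how String.compareTo(String) can disagree with a byte-encoded key ordering follows. It assumes UTF-8 bytes as a stand-in key encoding; that is an assumption made purely for the example and is not what UnicodeHelper produces. String.compareTo compares UTF-16 code units, so a supplementary character (encoded as a surrogate pair starting at 0xD800 or above) sorts before a BMP character such as U+FF5E, whereas the UTF-8 bytes order the same two strings by code point. UnicodeOrderDemo and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class UnicodeOrderDemo {

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    // Stand-in "key" ordering: unsigned comparison of the UTF-8 bytes (an assumption for this sketch).
    static int compareByUtf8Key(String a, String b) {
        return compareUnsigned(a.getBytes(StandardCharsets.UTF_8), b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";                 // U+FF5E, a single UTF-16 code unit
        String supplementary = "\uD83D\uDE00"; // U+1F600, a surrogate pair in UTF-16

        // UTF-16 code unit order: 0xFF5E > 0xD83D, so bmp sorts AFTER supplementary.
        System.out.println("String.compareTo > 0: " + (bmp.compareTo(supplementary) > 0));
        // UTF-8 byte order: 0xEF... < 0xF0..., so bmp sorts BEFORE supplementary.
        System.out.println("UTF-8 key order < 0:  " + (compareByUtf8Key(bmp, supplementary) < 0));
    }
}
```

        Whatever encoding the keys actually use, the comparator has to be derived from (or tested against) that same encoding rather than from String.compareTo.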

        bryanthompson bryanthompson added a comment -

        The scale-out data loader should now support both TermIds and BlobIVs. I've added unit tests for quads and with blobs. A split handler is now registered in scale-out for the blobs index. One of the quads mode tests is failing, but it is failing because the [c] position is not always bound on a statement in quads mode. I am still looking at this issue. I think that we will probably wind up adding asserts for the key arity into SPOIndexWriter and then fixing the DataLoader and AsynchronousStatementBufferFactory to set the [c] position to the resource from which the data are being loaded (default context?).

        I believe that I have found the cause of the errors in TestRollbacks during CI. I think that there has been a side-effect through the private fields on the BigdataValueFactoryImpl. If the same namespace is used in CI for different triple store instances, then we can have old data in those fields. Perhaps we should explicitly (and atomically) clear the namespace in CI before a test is run? (I ran into a similar problem in the IV unit tests which is how I finally figured out where the side-effect was coming from.)

        Javadoc on TestSubQuery which captures the dialog with Matt. I have not yet fixed the bug.

        I still need to update the scale-out data loader to write on the full text index.

        I still need to provision the BLOBS index defaults for high efficiency in scale-out.

        I am committing now to get feedback on these changes from CI.

        The IV ordering is still not consistent for Unicode inlining.

        Committed revision r5287.

        bryanthompson bryanthompson added a comment -

        Fixed IVs with inline Unicode. While the present solution may not be
        optimal, it does correctly encode/decode inline Unicode values and the
        IVs having those Unicode values obey the same ordering as the keys
        generated from those IVs. Inlining IVs with Unicode data is still off
        by default in the AbstractTripleStore's Options. The relevant logic
        is now isolated in an IVUnicode class. That class has its own test
        suite.

        Fixed problem where the parsers could fail to set the context position
        in the quads mode such that we could get (s,p,o) objects into a quads
        mode index. The SPOIndexWriter now checks this and will not permit a
        triple into a quads index.
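
        The kind of guard described above can be sketched as follows. This is not the SPOIndexWriter code; ArityCheckDemo, Stmt, and checkArity are hypothetical names, and representing an unbound context as null is an assumption made for the example.

```java
public class ArityCheckDemo {

    // Hypothetical statement holder; c == null means the context position is unbound.
    static final class Stmt {
        final Long s, p, o, c;

        Stmt(Long s, Long p, Long o, Long c) {
            this.s = s;
            this.p = p;
            this.o = o;
            this.c = c;
        }
    }

    // Reject a statement whose bound positions do not match the arity of the target index.
    static void checkArity(Stmt stmt, int keyArity) {
        if (keyArity != 3 && keyArity != 4) {
            throw new IllegalArgumentException("unsupported key arity: " + keyArity);
        }
        if (stmt.s == null || stmt.p == null || stmt.o == null) {
            throw new IllegalArgumentException("s, p, and o must be bound");
        }
        if (keyArity == 4 && stmt.c == null) {
            throw new IllegalArgumentException("context position not bound for quads index");
        }
    }

    public static void main(String[] args) {
        checkArity(new Stmt(1L, 2L, 3L, 4L), 4);     // quad into a quads index: accepted
        checkArity(new Stmt(1L, 2L, 3L, null), 3);   // triple into a triples index: accepted
        try {
            checkArity(new Stmt(1L, 2L, 3L, null), 4); // triple into a quads index: rejected
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```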

        Committed revision r5289.

        beebs Brad Bebee made changes -
        Field Original Value New Value
        Workflow Trac Import v2 [ 12356 ] Trac Import v3 [ 13937 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v3 [ 13937 ] Trac Import v4 [ 15266 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v4 [ 15266 ] Trac Import v5 [ 16652 ]
        beebs Brad Bebee made changes -
        Labels Issue_patch_20150625
        beebs Brad Bebee made changes -
        Status Closed - Won't Fix [ 6 ] Open [ 1 ]
        beebs Brad Bebee made changes -
        Status Open [ 1 ] Accepted [ 10101 ]
        beebs Brad Bebee made changes -
        Status Accepted [ 10101 ] In Progress [ 3 ]
        beebs Brad Bebee made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        beebs Brad Bebee made changes -
        Status Resolved [ 5 ] In Review [ 10100 ]
        beebs Brad Bebee made changes -
        Resolution Fixed [ 1 ] Done [ 10000 ]
        Status In Review [ 10100 ] Done [ 10000 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v5 [ 16652 ] Trac Import v6 [ 17887 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v6 [ 17887 ] Trac Import v7 [ 19284 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v7 [ 19284 ] Trac Import v8 [ 20905 ]

          People

          • Assignee:
            bryanthompson bryanthompson
            Reporter:
            bryanthompson bryanthompson
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: