Blazegraph (by SYSTAP) / BLZG-1182

Isolation broken in NSS when groupCommit disabled

    Details

      Description

      To replicate:


      - Use master (I actually did this with BIGDATA_RELEASE_1_5_0_Ticket_1136, but there should be no difference)
      - Load a large data set. E.g.,

      LOAD <file:/global/data/bsbmtools/td_100m/dataset.nt.gz>;
      


      - Wait until that load is running, then POST a NAMESPACE create request:

      curl -v -X POST --data-binary @tmp.properties --header 'Content-Type:text/plain' http://bigdata12:8090/bigdata/namespace
      * Hostname was NOT found in DNS cache
      *   Trying 192.168.1.12...
      * Connected to bigdata12 (192.168.1.12) port 8090 (#0)
      > POST /bigdata/namespace HTTP/1.1
      > User-Agent: curl/7.37.1
      > Host: bigdata12:8090
      > Accept: */*
      > Content-Type:text/plain
      > Content-Length: 591
      >
      * upload completely sent off: 591 out of 591 bytes
      < HTTP/1.1 201 Created
      < Content-Type: text/plain; charset=ISO-8859-1
      < Content-Length: 12
      * Server Jetty(9.2.3.v20140905) is not blacklisted
      < Server: Jetty(9.2.3.v20140905)
      <
      * Connection #0 to host bigdata12 left intact
      CREATED: kb2
      

      where tmp.properties is:

      com.bigdata.rdf.sail.namespace=kb2
      com.bigdata.namespace.test.spo.com.bigdata.btree.BTree.branchingFactor=1024
      com.bigdata.namespace.test.lex.com.bigdata.btree.BTree.branchingFactor=400
      com.bigdata.rdf.store.AbstractTripleStore.vocabularyClass=com.bigdata.rdf.vocab.BSBMVocabulary
      com.bigdata.rdf.store.AbstractTripleStore.textIndex=false
      com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
      com.bigdata.rdf.sail.truthMaintenance=true
      com.bigdata.rdf.store.AbstractTripleStore.quads=false
      com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
      

      Per the HTTP response, KB2 was created while the LOAD was running.
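
      For anyone replicating this from Java rather than curl, the same namespace create request can be sketched with the JDK's java.net.http API (Java 11+). This is only an illustration: the class name NamespaceCreateRequest and the build() helper are hypothetical, the properties are abbreviated, and the request is constructed but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper (not part of Blazegraph): builds the same POST that the
// curl command issues against the NSS namespace endpoint.
final class NamespaceCreateRequest {

    // endpoint and properties are supplied by the caller; nothing is sent here.
    static HttpRequest build(String endpoint, String properties) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(properties))
                .build();
    }

    public static void main(String[] args) {
        // Abbreviated copy of tmp.properties; add the remaining lines as needed.
        String properties =
                "com.bigdata.rdf.sail.namespace=kb2\n"
              + "com.bigdata.rdf.store.AbstractTripleStore.quads=false\n";
        HttpRequest request =
                build("http://bigdata12:8090/bigdata/namespace", properties);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it while the LOAD is running, one would call
        // java.net.http.HttpClient.newHttpClient().send(request, ...).
    }
}
```

      With isolation working and group commit disabled, sending this while the LOAD holds the unisolated connection should block until the LOAD finishes rather than return 201 immediately.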

      Checking .../counters shows groupCommit:=false

      Running .../status?dumpJournal shows that the original kb has significant data, exactly as I would expect if the KB2 create went through without using the group commit mechanisms, since that would imply that isolation was broken.

      name=kb.lex.BLOBS
      	Checkpoint{indexType=BTree,height=2,nnodes=3,nleaves=588,nentries=162806,counter=0,addrRoot=-5129668420173760,addrMetadata=-21474835651,addrBloomFilter=0,addrCheckpoint=-4634136568397604}
      	addrMetadata=0, name=kb.lex.BLOBS, indexType=BTree, indexUUID=975b1ad6-9160-43ea-9710-fc928725552c, branchingFactor=400, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@3269c671{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.lexicon.BlobsTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=8}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@3e633e92{ratio=8}, leafValuesCoder=com.bigdata.btree.raba.codec.SimpleRabaCoder@65a1033d}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=true, maxRecLen=0, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@2ddbcdeb
      name=kb.lex.ID2TERM
      	Checkpoint{indexType=BTree,height=2,nnodes=6,nleaves=2156,nentries=862901,counter=0,addrRoot=-5129711369846671,addrMetadata=-17179868351,addrBloomFilter=0,addrCheckpoint=-4634200992907044}
      	addrMetadata=0, name=kb.lex.ID2TERM, indexType=BTree, indexUUID=515a2f07-bc9f-4913-9822-a07a16bac3f8, branchingFactor=800, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@510e5292{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.lexicon.Id2TermTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=9}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@34f18d1{ratio=8}, leafValuesCoder=com.bigdata.btree.raba.codec.SimpleRabaCoder@3de64309}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=true, maxRecLen=16, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@583b239e
      name=kb.lex.TERM2ID
      	Checkpoint{indexType=BTree,height=2,nnodes=25,nleaves=4515,nentries=862901,counter=862901,addrRoot=-4342710152461592,addrMetadata=-70390219012900,addrBloomFilter=0,addrCheckpoint=-4634308367089444}
      	addrMetadata=0, name=kb.lex.TERM2ID, indexType=BTree, indexUUID=c46492d4-c9e1-42c9-901b-7be7a4e4a9c8, branchingFactor=300, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@276d63b2{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.lexicon.Term2IdTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.DefaultKeyBuilderFactory{ initialCapacity=0, collator=ICU, locale=en_US, strength=null, decomposition=null}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@7eac4f56{ratio=8}, leafValuesCoder=com.bigdata.btree.raba.codec.FixedLengthValueRabaCoder@628cd812}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=false, maxRecLen=256, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@cbdb84a
      name=kb.spo.JUST
      	Checkpoint{indexType=BTree,height=0,nnodes=0,nleaves=1,nentries=0,counter=0,addrRoot=-105596065939426,addrMetadata=-30064770290,addrBloomFilter=0,addrCheckpoint=-35235911696164}
      	addrMetadata=0, name=kb.spo.JUST, indexType=BTree, indexUUID=624ab29f-b200-42b4-aae8-b34fc990a576, branchingFactor=1024, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@29e44bc1{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.spo.JustificationTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=0}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@4b298163{ratio=8}, leafValuesCoder=com.bigdata.btree.raba.codec.EmptyRabaValueCoder@11edb122}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=false, maxRecLen=256, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@c083860
      name=kb.spo.OSP
      	Checkpoint{indexType=BTree,height=2,nnodes=21,nleaves=10224,nentries=5000000,counter=0,addrRoot=-478184478866983,addrMetadata=-140750373256454,addrBloomFilter=0,addrCheckpoint=-4634566065127204}
      	addrMetadata=0, name=kb.spo.OSP, indexType=BTree, indexUUID=ee7c9d42-7641-46fa-ab0c-036c5337d5ca, branchingFactor=800, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@5257fcb4{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.spo.SPOTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=0}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@e8fb7a8{ratio=8}, leafValuesCoder=com.bigdata.rdf.spo.FastRDFValueCoder2@7d627e0f}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=false, maxRecLen=256, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@4d424b18
      name=kb.spo.POS
      	Checkpoint{indexType=BTree,height=2,nnodes=10,nleaves=6662,nentries=4000000,counter=0,addrRoot=-4634325546958604,addrMetadata=-140754668223750,addrBloomFilter=0,addrCheckpoint=-4634329841925924}
      	addrMetadata=0, name=kb.spo.POS, indexType=BTree, indexUUID=125d53f0-f10a-4bb5-b517-9f9e3bd0b8bd, branchingFactor=1024, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@10a5b964{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.spo.SPOTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=0}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@702e5cc5{ratio=8}, leafValuesCoder=com.bigdata.rdf.spo.FastRDFValueCoder2@73ccc30c}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=false, maxRecLen=256, bloomFilterFactory=N/A, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@8198439
      name=kb.spo.SPO
      	Checkpoint{indexType=BTree,height=2,nnodes=20,nleaves=9765,nentries=5000000,counter=0,addrRoot=-478188773834265,addrMetadata=-25769802916,addrBloomFilter=0,addrCheckpoint=-4634570360094500}
      	addrMetadata=0, name=kb.spo.SPO, indexType=BTree, indexUUID=3c3b288d-f1e6-48f4-9b10-b28368095d69, branchingFactor=1024, pmd=null, btreeClassName=com.bigdata.btree.BTree, checkpointClass=com.bigdata.btree.Checkpoint, nodeKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@b12294{ratio=8}, btreeRecordCompressorFactory=N/A, tupleSerializer=com.bigdata.rdf.spo.SPOTupleSerializer{, keyBuilderFactory=com.bigdata.btree.keys.ASCIIKeyBuilderFactory{ initialCapacity=0}, leafKeysCoder=com.bigdata.btree.raba.codec.FrontCodedRabaCoder$DefaultFrontCodedRabaCoder@6d5522e6{ratio=8}, leafValuesCoder=com.bigdata.rdf.spo.FastRDFValueCoder2@7066621d}, conflictResolver=N/A, deleteMarkers=false, versionTimestamps=false, versionTimestampFilters=false, isolatable=false, rawRecords=false, maxRecLen=256, bloomFilterFactory=BloomFilterFactory{ n=1000000, p=0.02, maxP=0.15, maxN=1883227}, overflowHandler=N/A, splitHandler=N/A, indexSegmentBranchingFactor=512, indexSegmentBufferNodes=false, indexSegmentRecordCompressorFactory=N/A, asynchronousIndexWriteConfiguration=com.bigdata.btree.AsynchronousIndexWriteConfiguration{ masterQueueCapacity=5000, masterChunkSize=10000, masterChunkTimeoutNanos=50000000, sinkIdleTimeoutNanos=9223372036854775807, sinkPollTimeoutNanos=50000000, sinkQueueCapacity=5000, sinkChunkSize=10000, sinkChunkTimeoutNanos=9223372036854775807}, scatterSplitConfiguration=com.bigdata.btree.ScatterSplitConfiguration{enabled=true, percentOfSplitThreshold=0.25, dataServiceCount=0, indexPartitionCount=0}
      	com.bigdata.btree.BaseIndexStats@51b39791
      

      While I observed this in HA3, there is no reason to believe that this is an HA-specific bug.

      Testing is required to determine whether this bug existed prior to the group commit refactoring or whether that refactoring broke isolation. My inclination is to believe that it is a pre-existing bug and that the GSR index operations and index register/drop operations are able to proceed while a Sail operation (such as SPARQL UPDATE LOAD) is holding the unisolated connection on the Journal.

        Activity

        bryanthompson added a comment:

        I suspect that this was introduced when CreateKBTask was refactored out of the BigdataSail.createLTS() code. I think that it might no longer obtain the unisolated sail connection around the namespace create when group commit is not enabled. Obtaining that unisolated sail connection is what acquired the global journal semaphore lock, and that lock was responsible for serializing updates prior to the introduction of group commit.

        Further inspection reveals that the following pre-/post- logic was commented out when refactoring to support group commit. This logic was present specifically to prevent a namespace create / destroy operation from running concurrently with a writer, by making it contend for the unisolated connection used by the BigdataSail. I have restored this logic into the code paths for CreateKBTask and DestroyKBTask. The logic is used IFF group commit is NOT enabled.

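                boolean acquiredConnection = false;
                try {

                    if (getIndexManager() instanceof Journal) {
                        // Acquire the unisolated connection permit from the Journal.
                        ((Journal) getIndexManager()).acquireUnisolatedConnection();
                        acquiredConnection = true;
                    }

                    // ... namespace create / destroy runs here (elided) ...

                } finally {

                    if (acquiredConnection) {
                        // Release the permit so the next unisolated writer can proceed.
                        ((Journal) getIndexManager()).releaseUnisolatedConnection();
                    }

                }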
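
        The serialization that this permit provides can be illustrated with a small self-contained sketch. It is not Blazegraph code: PermitJournal and NamespaceCreateSketch are hypothetical stand-ins that model the journal's global lock as a one-permit java.util.concurrent.Semaphore, and the timeout exists only so the demo terminates (the real acquire blocks until the permit is free).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a namespace create must hold the same single permit
// that a long-running unisolated writer (such as LOAD) holds.
final class NamespaceCreateSketch {

    // Runs the task only while holding the permit. Returns false if the
    // permit could not be obtained within timeoutMillis.
    static boolean runExclusively(PermitJournal journal, Runnable task,
            long timeoutMillis) {
        boolean acquired = false;
        try {
            acquired = journal.tryAcquireUnisolatedConnection(timeoutMillis);
            if (!acquired) {
                return false; // another writer holds the permit
            }
            task.run();
            return true;
        } finally {
            if (acquired) {
                journal.releaseUnisolatedConnection();
            }
        }
    }

    public static void main(String[] args) {
        PermitJournal journal = new PermitJournal();
        journal.acquireUnisolatedConnection(); // simulate the running LOAD
        boolean duringLoad = runExclusively(journal, () -> { }, 50);
        journal.releaseUnisolatedConnection(); // the LOAD completes
        boolean afterLoad = runExclusively(journal, () -> { }, 50);
        System.out.println("create ran during load: " + duringLoad);
        System.out.println("create ran after load: " + afterLoad);
    }
}

// Hypothetical stand-in for a journal with one unisolated connection permit.
final class PermitJournal {

    private final Semaphore permit = new Semaphore(1, true);

    void acquireUnisolatedConnection() {
        permit.acquireUninterruptibly();
    }

    boolean tryAcquireUnisolatedConnection(long timeoutMillis) {
        try {
            return permit.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }

    void releaseUnisolatedConnection() {
        permit.release();
    }
}
```

        In this sketch the create is refused while the simulated LOAD holds the permit and succeeds once it is released. The bug described above corresponds to the create path skipping the acquire step entirely.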
        
        bryanthompson added a comment:

        Fixed in 9582e051690e18e7a9aa3bc357fc885e9c4e993e.


          People

          • Assignee: bryanthompson
          • Reporter: bryanthompson