Blazegraph (by SYSTAP)
BLZG-744

Compress write cache blocks for replication and in HALogs

    Details

      Description

      It would be advantageous to compress the write cache blocks before they are replicated and before they are written to the HALog files. The leader would do the compression in the WriteCacheService.WriteTask.call() thread, so this would not add latency to writes. The receiver would immediately replicate the buffer. A receiver that only needs to log the buffer would log the compressed buffer. If the receiver needs to write the buffer onto the backing store, then it would decompress the buffer into a second buffer; that second buffer could be allocated and held by the HAJournalServer for its life cycle.
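      The compress-on-the-leader / decompress-on-the-receiver path could be sketched as follows using java.util.zip. This is a minimal illustration only; `compressBlock` and `decompressBlock` are hypothetical helper names, not the actual Blazegraph API.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class WriteCacheCompression {

    // Hypothetical helper: the leader would compress the write cache block
    // in its write task thread before replication / HALog logging.
    public static byte[] compressBlock(byte[] block) {
        Deflater deflater = new Deflater();
        deflater.setInput(block);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(block.length);
        byte[] tmp = new byte[8192];
        while (!deflater.finished()) {
            out.write(tmp, 0, deflater.deflate(tmp));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Hypothetical helper: a receiver that must apply the block to its
    // backing store decompresses the wire payload into a second buffer.
    public static byte[] decompressBlock(byte[] wire, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(wire);
        byte[] out = new byte[originalLength];
        try {
            int n = 0;
            while (n < originalLength && !inflater.finished()) {
                n += inflater.inflate(out, n, originalLength - n);
            }
        } catch (DataFormatException e) {
            throw new RuntimeException(e);
        } finally {
            inflater.end();
        }
        return out;
    }
}
```

      A receiver that only logs the block would skip `decompressBlock` entirely and write the compressed bytes straight to the HALog.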

      We need to add fields for the compressionScheme (None, Zip, etc.) and the wireByteLength to the IHAWriteMessage interface. For non-compressed data, the wireByteLength and the data byte length would be the same. The compressionScheme should default to None and the wireByteLength should default to the actual byte length for backwards compatibility.
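      The two new fields and their backwards-compatible defaults might look roughly like this (a sketch with hypothetical names; the real IHAWriteMessage interface is not shown in this ticket):

```java
// Hypothetical value object illustrating the two new message fields.
public class HAWriteMessageSketch {

    public enum CompressionScheme { None, Zip }

    private final CompressionScheme compressionScheme;
    private final int dataByteLength;   // uncompressed payload length
    private final int wireByteLength;   // bytes actually replicated/logged

    // Backwards-compatible defaults: no compression, and the wire length
    // equals the actual data length.
    public HAWriteMessageSketch(int dataByteLength) {
        this(CompressionScheme.None, dataByteLength, dataByteLength);
    }

    public HAWriteMessageSketch(CompressionScheme scheme,
            int dataByteLength, int wireByteLength) {
        this.compressionScheme = scheme;
        this.dataByteLength = dataByteLength;
        this.wireByteLength = wireByteLength;
    }

    public CompressionScheme getCompressionScheme() { return compressionScheme; }
    public int getDataByteLength() { return dataByteLength; }
    public int getWireByteLength() { return wireByteLength; }
}
```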

      The HAWriteMessages are written onto the HALog. The messages themselves would not be compressed. However, the replicated write cache blocks WOULD be compressed. This should provide substantial space savings for the HALog files.

        Activity

        bryanthompson added a comment -

        I have verified this both in CI and by loading BSBM 100M onto an HA3 cluster configuration and then running both the BSBM UPDATE mixture (on the leader) and the BSBM EXPLORE mixture (on the 2nd follower).

        This was tested against r7211.

        See https://sourceforge.net/apps/trac/bigdata/ticket/674#comment:6

        bryanthompson added a comment -

        Enabling HALog compression in CI. WriteCache compaction was recently enabled and is looking good in CI, but has not been exercised on an HA3 cluster. I would like to see if we can get a clean load onto an HA3 cluster with both features and then run through the BSBM UPDATE + EXPLORE workloads.

        Committed revision r7211.

        bryanthompson added a comment -

        Disabling HALog compression. It is causing failures for me locally. Stack trace is below.

        ERROR: 19737 2013-06-03 10:03:19,871      com.bigdata.rwstore.RWStore$11 com.bigdata.io.writecache.WriteCacheService$WriteTask.call(WriteCacheService.java:953): java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576
        java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576
        at com.bigdata.io.writecache.WriteCache.getWholeBufferChecksum(WriteCache.java:803)
        at com.bigdata.io.writecache.WriteCache.newHAPackage(WriteCache.java:1651)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.writeCacheBlock(WriteCacheService.java:1403)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.doRun(WriteCacheService.java:1031)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.call(WriteCacheService.java:900)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.call(WriteCacheService.java:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
        ERROR: 19737 2013-06-03 10:03:19,871      com.bigdata.journal.jini.ha.HAJournal.executorService1 com.bigdata.rdf.store.AbstractTripleStore.create(AbstractTripleStore.java:1719): java.lang.RuntimeException: java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576: lastRootBlock=rootBlock{ rootBlock=0, challisField=0, version=3, nextOffset=0, localTime=1370278981364 [Monday, June 3, 2013 10:03:01 AM PDT], firstCommitTime=0, lastCommitTime=0, commitCounter=0, commitRecordAddr={off=NATIVE:0,len=0}, commitRecordIndexAddr={off=NATIVE:0,len=0}, blockSequence=0, quorumToken=-1, metaBitsAddr=0, metaStartAddr=0, storeType=RW, uuid=61bb4399-69b7-49f4-b354-ed8bac09b2d0, offsetBits=42, checksum=23665086, createTime=1370278981358 [Monday, June 3, 2013 10:03:01 AM PDT], closeTime=0}
        java.lang.RuntimeException: java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576: lastRootBlock=rootBlock{  rootBlock=0, challisField=0, version=3, nextOffset=0, localTime=1370278981364 [Monday, June 3, 2013 10:03:01 AM PDT], firstCommitTime=0, lastCommitTime=0, commitCounter=0, commitRecordAddr={off=NATIVE:0,len=0}, commitRecordIndexAddr={off=NATIVE:0,len=0}, blockSequence=0, quorumToken=-1, metaBitsAddr=0, metaStartAddr=0, storeType=RW, uuid=61bb4399-69b7-49f4-b354-ed8bac09b2d0, offsetBits=42, checksum=23665086, createTime=1370278981358 [Monday, June 3, 2013 10:03:01 AM PDT], closeTime=0}
        at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2981)
        at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:80)
        at com.bigdata.rdf.store.AbstractTripleStore.create(AbstractTripleStore.java:1699)
        at com.bigdata.rdf.sail.BigdataSail.createLTS(BigdataSail.java:746)
        at com.bigdata.rdf.sail.CreateKBTask.doRun(CreateKBTask.java:158)
        at com.bigdata.rdf.sail.CreateKBTask.call(CreateKBTask.java:71)
        at com.bigdata.rdf.sail.CreateKBTask.call(CreateKBTask.java:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
        Caused by: java.lang.RuntimeException: java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576
        at com.bigdata.io.writecache.WriteCacheService.flush(WriteCacheService.java:2199)
        at com.bigdata.io.writecache.WriteCacheService.flush(WriteCacheService.java:2043)
        at com.bigdata.rwstore.RWStore.commit(RWStore.java:3099)
        at com.bigdata.journal.RWStrategy.commit(RWStrategy.java:448)
        at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:3284)
        at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:2979)
        ... 11 more
        Caused by: java.lang.AssertionError: b.capacity=6254, checksumBuffer.capacity=1048576
        at com.bigdata.io.writecache.WriteCache.getWholeBufferChecksum(WriteCache.java:803)
        at com.bigdata.io.writecache.WriteCache.newHAPackage(WriteCache.java:1651)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.writeCacheBlock(WriteCacheService.java:1403)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.doRun(WriteCacheService.java:1031)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.call(WriteCacheService.java:900)
        at com.bigdata.io.writecache.WriteCacheService$WriteTask.call(WriteCacheService.java:1)
        ... 5 more
        

        Committed revision r7189.

        bryanthompson added a comment -

        Martyn has committed a fix for HALog compaction for the WORM which also ensures that the HALog files for the WORM include both the HA message and the payload. We are awaiting feedback from CI. It is green for him locally.

        bryanthompson added a comment -

        HALog compression appears to be working for the RWStore. Martyn is still looking at an issue with the WORM HALog integration.

        bryanthompson added a comment -

        Enabled in r7170.

        bryanthompson added a comment -

        This feature is integrated but still requires testing on an HA3 cluster.

        bryanthompson added a comment -

        Write cache payload compression prior to replication and compacted HALog files.

        Changes are in progress to also provide payload storage for the WORM in the
        HALog files.

        Added a CompressorRegistry for configurable block compression schemes.

        WCS compaction is still observed to fail for testStartAB_C_LiveResync and is disabled in the WCS constructor.

        Compression, WriteCache, WORM, RWJournal, HA, and SPARQL test suites are green locally.

        See https://sourceforge.net/apps/trac/bigdata/ticket/652 (Compress write cache blocks for replication and in HALogs)

        Committed revision r7161.
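        The CompressorRegistry mentioned above could be sketched roughly as a map from a scheme key to a byte-array transform (an illustration only; the actual CompressorRegistry API is not shown in this ticket):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Sketch: map a scheme key (e.g. "None", "Zip") to a compressor function.
public class CompressorRegistrySketch {

    private static final Map<String, UnaryOperator<byte[]>> compressors =
            new ConcurrentHashMap<>();

    static {
        // "None" is the identity transform, matching the backwards-compatible
        // default described in the ticket.
        compressors.put("None", b -> b);
    }

    public static void register(String key, UnaryOperator<byte[]> c) {
        compressors.put(key, c);
    }

    public static UnaryOperator<byte[]> get(String key) {
        return compressors.get(key);
    }
}
```

        Keying compressors by a string lets the scheme name travel in the HAWriteMessage so a receiver can look up the matching decompressor.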


          People

          • Assignee:
            martyncutcher martyncutcher
            Reporter:
            bryanthompson bryanthompson
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: