Uploaded image for project: 'Blazegraph (by SYSTAP)'
  1. Blazegraph (by SYSTAP)
  2. BLZG-1128

Snapshot mechanism breaks with metabit demi-spaces

    Details

      Description

      BLZG-1003 (Support larger metabit allocations) introduced demi-spaces near the head of the RWStore that come into play once the backing store has a large number of allocations. It appears that the demi-space model requires additional isolation in order to have a deterministic snapshot of these demi-spaces. The absence of that isolation can result in errors in the RWStore VERSION (stored in the metabits header) or in the allocators (such as two allocators at the same address) if the RWStore roles through a commit during the snapshot.

      We should also implement a utility class to validate snapshots and HALog files. This could be essentially the same as the HARestore, perhaps with an open to test-open the Journal (perhaps in a read-only mode) after decompressing each snapshot and after applying each HALog file. There might be some additional tests that we could apply once we open that journal. If we are just running validation then the journal should be deleted afterwards.

      Bryan

        Activity

        Hide
        bryanthompson bryanthompson added a comment -

        Martyn is working to develop a test that replicates these conditions by forcing the use of the metabits demi-spaces on small files through a configuration option. Due to the race conditions involved, this test is likely to only stochastically fail. Further, since the store file will be small during the test runs, the failure rate may be relatively slow.

        Once we have a test that replicates the problem we can implement the fix. The fix will acquire the allocation lock while snapshotting the metabits demi-spaces. This will be a relatively fast operation and the cloned metabits could even be taken from RAM (array clone). The remainder of the snapshot operation should proceed normally.

        Show
        bryanthompson bryanthompson added a comment - Martyn is working to develop a test that replicates these conditions by forcing the use of the metabits demi-spaces on small files through a configuration option. Due to the race conditions involved, this test is likely to only stochastically fail. Further, since the store file will be small during the test runs, the failure rate may be relatively slow. Once we have a test that replicates the problem we can implement the fix. The fix will acquire the allocation lock while snapshotting the metabits demi-spaces. This will be a relatively fast operation and the cloned metabits could even be taken from RAM (array clone). The remainder of the snapshot operation should proceed normally.
        Hide
        martyncutcher martyncutcher added a comment -

        The problem is not restricted to demi-spaces but rather the atomicity of the rootblocks, metabits and allocation data in the snapshot while update transactions are committed.

        In addition to cloning the metabits data we must also clone the allocation data. All this is in memory so will be a fast operation
        - typically less than 1M of allocation data per 10G of journal data on disk.

        Show
        martyncutcher martyncutcher added a comment - The problem is not restricted to demi-spaces but rather the atomicity of the rootblocks, metabits and allocation data in the snapshot while update transactions are committed. In addition to cloning the metabits data we must also clone the allocation data. All this is in memory so will be a fast operation - typically less than 1M of allocation data per 10G of journal data on disk.
        Hide
        martyncutcher martyncutcher added a comment -

        The basic mechanism to clone the "core" data when the snapshot is started is now implemented.

        The existing tests appear to pass reliably, but I need to write a new stress test that replicates the problem scenario: where a snapshot is taken for a large store while multiple commits are made.

        Show
        martyncutcher martyncutcher added a comment - The basic mechanism to clone the "core" data when the snapshot is started is now implemented. The existing tests appear to pass reliably, but I need to write a new stress test that replicates the problem scenario: where a snapshot is taken for a large store while multiple commits are made.
        Hide
        bryanthompson bryanthompson added a comment -

        We have a test that fails when the new snapshot isolation mechanism is disabled. Martyn is preparing a branch for a code review so we can bring this fix back to the master.

        Show
        bryanthompson bryanthompson added a comment - We have a test that fails when the new snapshot isolation mechanism is disabled. Martyn is preparing a branch for a code review so we can bring this fix back to the master.
        Hide
        bryanthompson bryanthompson added a comment -

        - Remove the following files from the RWSTORE_SNAPSHOT branch:
        - TestApplyIncrementalSnapshot
        - TestEOFException967
        - TestReverserator
        - Replace snapshotData with an ISnapshotData. Pass through in place of Set or TreeMap. Provide documentation on this interface and IHABufferStrategy.writeOnStream().
        - Attempt to derive a version of TestHA1SnapshotPolicy by changing the RWStore properties that can be used as a regression test in CI.
        - Document the new/changed methods on FixedAllocator.
        - Remove old RWStore.writeOnStream() method (the version w/o the SnapshotData).
        - Document MergeStreamWithSortedSet and make logging in that class conditional.
        - TestFileChannelUtility.testReopenerInputStream(). The use of File.deleteOnExit() is a resource leak for CI. Please use an appropriate try/finally pattern instead.
        - TestRWStrategy.test_snapshotData()
        - should use try/finally pattern to destroy the journal.

        Show
        bryanthompson bryanthompson added a comment - - Remove the following files from the RWSTORE_SNAPSHOT branch: - TestApplyIncrementalSnapshot - TestEOFException967 - TestReverserator - Replace snapshotData with an ISnapshotData. Pass through in place of Set or TreeMap. Provide documentation on this interface and IHABufferStrategy.writeOnStream(). - Attempt to derive a version of TestHA1SnapshotPolicy by changing the RWStore properties that can be used as a regression test in CI. - Document the new/changed methods on FixedAllocator. - Remove old RWStore.writeOnStream() method (the version w/o the SnapshotData). - Document MergeStreamWithSortedSet and make logging in that class conditional. - TestFileChannelUtility.testReopenerInputStream(). The use of File.deleteOnExit() is a resource leak for CI. Please use an appropriate try/finally pattern instead. - TestRWStrategy.test_snapshotData() - should use try/finally pattern to destroy the journal.
        Hide
        martyncutcher martyncutcher added a comment -

        Have created an ISnapshotData interface and refactored the snapshot code to use it.

        Show
        martyncutcher martyncutcher added a comment - Have created an ISnapshotData interface and refactored the snapshot code to use it.
        Hide
        bryanthompson bryanthompson added a comment -

        TODO:


        - We have one error in CI for this branch that needs needs to be investigated.

        com.bigdata.journal.jini.ha.TestHAJournalServerOverride.test_AB_forceRemoveService_B
        


        - Please restore TestReverserator and TestAll in the com.bigdata.btree.filters test suite from the master. These classes should not have been touched within the snapshot bug fix branch.

        Once those issues have been resolved this branch can be merged back to the master.

        Show
        bryanthompson bryanthompson added a comment - TODO: - We have one error in CI for this branch that needs needs to be investigated. com.bigdata.journal.jini.ha.TestHAJournalServerOverride.test_AB_forceRemoveService_B - Please restore TestReverserator and TestAll in the com.bigdata.btree.filters test suite from the master. These classes should not have been touched within the snapshot bug fix branch. Once those issues have been resolved this branch can be merged back to the master.
        Hide
        bryanthompson bryanthompson added a comment -

        I restored the TestReverserator class (from the master) and relinked it from com.bigdata.btree.filters.TestAll. However, we appear to have lost the stress test written for the spin lock in the B+Tree reverse cursor. This should have been in the master.... maybe it is in the jetty client branch?

        Martyn is looking at a stochastic failure in WriteTask.run()/WriteCacheService.close(). However, it appears to be unrelated to this ticket even though it was observed in CI for this ticket.

        Merged RWSTORE_SNAPSHOT to master. 720fecc12d7a1306a60841a0c01f451a63bad6e6

        This ticket is closed.

        Show
        bryanthompson bryanthompson added a comment - I restored the TestReverserator class (from the master) and relinked it from com.bigdata.btree.filters.TestAll. However, we appear to have lost the stress test written for the spin lock in the B+Tree reverse cursor. This should have been in the master.... maybe it is in the jetty client branch? Martyn is looking at a stochastic failure in WriteTask.run()/WriteCacheService.close(). However, it appears to be unrelated to this ticket even though it was observed in CI for this ticket. Merged RWSTORE_SNAPSHOT to master. 720fecc12d7a1306a60841a0c01f451a63bad6e6 This ticket is closed.

          People

          • Assignee:
            martyncutcher martyncutcher
            Reporter:
            bryanthompson bryanthompson
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: