  1. Blazegraph (by SYSTAP)
  2. BLZG-401

WORMStrategy appears to double in size after reopen of the Journal.

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Affects Version/s: QUADS_QUERY_BRANCH
    • Fix Version/s: None
    • Component/s: Journal

      Description

      There appears to be a bug where the value of initialExtent on the journal when it is (re-)opened is being set to its current file extent rather than the value of the "com.bigdata.journal.AbstractJournal.initialExtent" property.

      If you look at AbstractBufferStrategy#511 you will see the code which imposes the policy for the file growth.

              /*
               * Increase by the initial extent or by 32M, whichever is greater, but
               * by no less than the requested amount.
               */
              long newExtent = userExtent
                      + Math.max(needed, Math.max(initialExtent,
                                      getMinimumExtension()));
      

      The [initialExtent] property defaults to 10M, although we generally specify 100M or 200M for it in configuration files. However, if you look at WORMStrategy#166 you will see that FileMetadata.extent (the current size of the file) is being passed in where the configured value for the initialExtent should be passed (probably fileMetadata.getProperty(Options.INITIAL_EXTENT,Options.DEFAULT_INITIAL_EXTENT)). The upshot is that the initial file growth is slower than intended (32M per extension), while after a journal re-open each extension adds the file extent as of the time when the file was reopened, which doubles the file on the first extension. Nothing in the code suggests that the file is extended before it runs out of room for the next allocation, so it is not as if the allocated space remains unused.
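
      To make the effect concrete, here is a minimal sketch of the growth policy quoted above, evaluated once with the current file extent passed as initialExtent (the behavior described in this issue) and once with a configured value. The class name, the 400M file size, and the 200M configured value are illustrative assumptions, and the sketch treats the user extent as the whole file size for simplicity; it is not the actual AbstractBufferStrategy code.

```java
// Hypothetical stand-alone sketch; not the actual AbstractBufferStrategy code.
final class ExtentGrowthSketch {

    static final long MB = 1024L * 1024L;

    // The 32M minimum extension mentioned in the quoted comment.
    static final long MINIMUM_EXTENSION = 32 * MB;

    // Same arithmetic as the quoted policy: grow by the larger of the
    // initial extent and the minimum extension, but by no less than the
    // requested amount.
    static long nextExtent(long userExtent, long needed, long initialExtent) {
        return userExtent
                + Math.max(needed, Math.max(initialExtent, MINIMUM_EXTENSION));
    }

    public static void main(String[] args) {
        long extentAtReopen = 400 * MB; // assumed file size when reopened
        long needed = 1 * MB;           // assumed size of the next allocation
        long configured = 200 * MB;     // assumed configured initialExtent

        // Reported behavior: the current extent is passed as initialExtent.
        long reported = nextExtent(extentAtReopen, needed, extentAtReopen);

        // Intended behavior: the configured initialExtent is passed.
        long intended = nextExtent(extentAtReopen, needed, configured);

        System.out.println("reported: " + reported / MB + "M"); // 800M
        System.out.println("intended: " + intended / MB + "M"); // 600M
    }
}
```

      With these assumed numbers the first extension after a reopen takes the file from 400M to 800M, where the configured value would have taken it to 600M.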

      This "doubling" was probably introduced when the WORMStrategy was written to replace the old DiskOnlyStrategy. It should be easy enough to fix: we just need to pass the "initialExtent" rather than the current file extent to the WORMStrategy and verify that nothing else depends on the changed semantics of that constructor argument.
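
      As a rough illustration of the proposed fix, the sketch below resolves the initial extent from the configured properties, falling back to a default, instead of taking it from the current file size. Only the property name comes from this issue. The class, the method, and the assumption that the 10M default means 10 * 1024 * 1024 bytes are hypothetical; this is not the bigdata Options or FileMetadata API.

```java
import java.util.Properties;

// Hypothetical sketch; not the actual bigdata Options/FileMetadata classes.
final class InitialExtentConfigSketch {

    // Property name as quoted in this issue.
    static final String INITIAL_EXTENT =
            "com.bigdata.journal.AbstractJournal.initialExtent";

    // Assumed byte value for the 10M default mentioned in this issue.
    static final String DEFAULT_INITIAL_EXTENT =
            Long.toString(10L * 1024 * 1024);

    // Resolve the value to hand to the strategy: the configured property
    // if present, otherwise the default. The current file size is never
    // consulted here.
    static long configuredInitialExtent(Properties properties) {
        return Long.parseLong(
                properties.getProperty(INITIAL_EXTENT, DEFAULT_INITIAL_EXTENT));
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(INITIAL_EXTENT, Long.toString(200L * 1024 * 1024));
        System.out.println(configuredInitialExtent(properties));
        System.out.println(configuredInitialExtent(new Properties()));
    }
}
```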

      I expect that the WORM also evidences growth in the data written, especially for the lexicon, as the size of the kb instance grows. Moving the large literals and URIs out of the ID2TERM index should fix that as new revisions will not involve making persistent copies of the literals/URIs involved. In fact, Mike has suggested that we do this for everything which is not inlined, effectively getting rid of ID2TERM and making all of the keys in the TERM2ID index based on a prefix (URI, Literal, BNode), a hash code, and a counter to break ties in the hash code. All URIs and Literals would be raw records allocated on the Journal. See [1].

      [1] https://sourceforge.net/apps/trac/bigdata/ticket/109
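
      For illustration only, a key of the kind suggested above (a prefix, a hash code, and a counter to break ties) could be laid out as in the sketch below. The field widths (1 byte, 4 bytes, 2 bytes), the prefix values, and the use of String.hashCode() are all assumptions made for this sketch; the ticket in [1] is the place to look for the actual design.

```java
import java.nio.ByteBuffer;

// Hypothetical key layout: [prefix:1][hash:4][counter:2]. The widths and
// prefix codes are assumptions made for this sketch only.
final class TermKeySketch {

    static final byte PREFIX_URI = 0;
    static final byte PREFIX_LITERAL = 1;
    static final byte PREFIX_BNODE = 2;

    // Build a fixed-width key. The counter distinguishes terms whose hash
    // codes collide under the same prefix.
    static byte[] encode(byte prefix, String term, short counter) {
        return ByteBuffer.allocate(1 + 4 + 2)
                .put(prefix)
                .putInt(term.hashCode())
                .putShort(counter)
                .array();
    }

    public static void main(String[] args) {
        byte[] key = encode(PREFIX_URI, "http://example.org/a", (short) 0);
        System.out.println(key.length); // 7
    }
}
```

      The term itself is absent from the key; under the proposal it would be stored as a raw record on the Journal.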

        Activity

        bryanthompson added a comment -

        Matt,

        I've committed the change described above to the WORMStrategy constructor. Please try this change out on your side and let me know how it does for you.

        Thanks,
        Bryan

        bryanthompson added a comment -

        I've backported this fix to the trunk. Since the configured Properties are not exposed on FileMetadata in the trunk, I made the initialExtent property passed to the FileMetadata constructor into a field on the FileMetadata object and exposed it to WORMStrategy.

        Committed revision 4177.


          People

          • Assignee: bryanthompson
          • Reporter: bryanthompson