  1. Blazegraph (by SYSTAP)
  2. BLZG-192 Enable group commit by default
  3. BLZG-1312

More scalable management of deferred frees for allocation contexts.

    Details

    • Type: Sub-task
    • Status: Reopened
    • Priority: Medium
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: BLAZEGRAPH_2_X_BACKLOG
    • Component/s: RWStore
    • Labels:
      None

      Description

      The RWStore uses allocation contexts for group commit (one per task). There is also some discussion about introducing an allocation context for unisolated writes, to provide a mechanism for denying alloc()/free() requests after a call to AbstractJournal.abort(). Such requests can arrive after RWStore.reset() because of a data race between the invocation of RWStore.reset() and the interruption of the threads associated with an update. For example, 3-6 threads write on the statement indices during an update. If any one of those threads is not interrupted until after RWStore.reset(), an alloc()/free() request for the unisolated connection could come through after the reset. See BLZG-1313.
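
      One way to picture the "deny after reset" mechanism is an epoch check: a context remembers the reset count it was created under and rejects any request once that count has moved on. The sketch below is illustrative only. `ResetGuardSketch`, `Store`, `Context`, and the trivial bump allocator are hypothetical names and do not reflect the actual RWStore or AllocationContext API.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ResetGuardSketch {

    /** Hypothetical stand-in for the store: it only counts resets. */
    static final class Store {
        private final AtomicLong epoch = new AtomicLong();

        /** Analogous to a reset after abort: invalidates all contexts handed out so far. */
        void reset() {
            epoch.incrementAndGet();
        }

        /** Hands out a context bound to the current epoch. */
        Context newUnisolatedContext() {
            return new Context(this, epoch.get());
        }
    }

    /** Hypothetical allocation context shared by the writer threads of one update. */
    static final class Context {
        private final Store store;
        private final long createdInEpoch;
        private long nextAddr = 1; // toy bump allocator, for illustration only

        Context(Store store, long createdInEpoch) {
            this.store = store;
            this.createdInEpoch = createdInEpoch;
        }

        synchronized long alloc(int nbytes) {
            checkLive();
            long addr = nextAddr;
            nextAddr += nbytes;
            return addr;
        }

        synchronized void free(long addr) {
            checkLive();
            // A real store would record the free here.
        }

        // A real implementation would perform this check under the store's
        // allocation lock so that reset() cannot slip in between check and use.
        private void checkLive() {
            if (store.epoch.get() != createdInEpoch) {
                throw new IllegalStateException("context invalidated by reset()");
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Context ctx = store.newUnisolatedContext();
        ctx.alloc(64);   // accepted: no reset yet
        store.reset();   // abort path
        try {
            ctx.alloc(64); // a late writer thread arrives after the reset
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      A writer thread that was not interrupted in time would then fail fast on its stale context instead of mutating the store after the reset.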

      The AllocationContext deferred free list is an Array<Long>.

      The large WikiData load, with a 20GB journal, had ~50M in-use slots (46M of them 64-byte slots).

      So, in theory, deleting a namespace would put about 50M * 8 bytes per freed address on the AllocationContext deferred free list, or roughly 400MB of JVM heap in the worst case.
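
      The worst-case figure can be reproduced with a few lines of arithmetic. This is a sketch under the ticket's own assumption of one 8-byte address per freed slot; it ignores any per-entry object overhead, and the class and method names are hypothetical.

```java
final class DeferredFreeHeapEstimate {

    /** Bytes needed to hold one 8-byte address per freed slot. */
    static long addressListBytes(long freedSlots) {
        return freedSlots * Long.BYTES; // Long.BYTES == 8
    }

    public static void main(String[] args) {
        long freedSlots = 50_000_000L; // ~50M in-use slots, taken from the ticket
        long bytes = addressListBytes(freedSlots);
        System.out.println(bytes + " bytes, about " + bytes / 1_000_000 + " MB");
    }
}
```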

      I needed a 10G JVM to load this without being GC bound.

      Options:

      1. Use non-heap memory
      2. Compression
      3. Convert to bits per allocator

      At this scale, the conversion to bits would win. A simple test with random longs, unsurprisingly, showed no benefit from compression.

      We had around 10K allocators, so the memory overhead of a "free bits" bitmap would be around 10K * 1K = 10M.
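
      A minimal sketch of the "bits per allocator" idea, assuming a freed address can be decomposed into an allocator index and a slot index within that allocator. That decomposition, the class `DeferredFreeBits`, and its methods are hypothetical illustrations, not the RWStore implementation. The sketch is not thread safe.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

final class DeferredFreeBits {

    // One bitmap per allocator; a set bit marks a slot whose free is deferred.
    private final Map<Integer, BitSet> freeBits = new HashMap<>();

    /** Records a deferred free. Repeating the same free is harmless. */
    void deferFree(int allocatorIndex, int slotIndex) {
        freeBits.computeIfAbsent(allocatorIndex, k -> new BitSet()).set(slotIndex);
    }

    boolean isDeferred(int allocatorIndex, int slotIndex) {
        BitSet bits = freeBits.get(allocatorIndex);
        return bits != null && bits.get(slotIndex);
    }

    /** Total number of slots currently marked as deferred frees. */
    long deferredCount() {
        long n = 0;
        for (BitSet bits : freeBits.values()) {
            n += bits.cardinality();
        }
        return n;
    }

    /** Upper bound on bitmap memory if every allocator carries a full bitmap. */
    static long worstCaseBytes(long allocators, long bytesPerBitmap) {
        return allocators * bytesPerBitmap;
    }

    public static void main(String[] args) {
        DeferredFreeBits d = new DeferredFreeBits();
        d.deferFree(3, 100);
        d.deferFree(3, 101);
        d.deferFree(7, 0);
        System.out.println("deferred slots: " + d.deferredCount());
        // ~10K allocators with ~1K bytes of bitmap each, as estimated in the ticket
        System.out.println("worst case bytes: " + worstCaseBytes(10_000, 1_024));
    }
}
```

      The cost is bounded by the number of allocators rather than by the number of freed slots, which is why it compares favourably with a list of addresses when tens of millions of slots are freed at once.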

      After a few minutes' thought I can see how this could be put together reasonably straightforwardly.

      ESTIMATE: 3 days (MC).

              People

              Assignee:
              martyncutcher
              Reporter:
              bryanthompson

                  Time Tracking

                  Estimated: 3d
                  Remaining: 3d
                  Logged: Not Specified