  1. Blazegraph (by SYSTAP)
  2. BLZG-1245

Minimize GC overhead and bound memory for analytic query mode (meta-ticket)



    • Type: New Feature
    • Status: Open
    • Resolution: Unresolved
    • Affects Version/s: BLAZEGRAPH_RELEASE_1_5_1
    • Fix Version/s: None
    • Component/s: Bigdata RDF Database
    • Labels: None


      This is a meta-ticket that collects together several tickets related to sources of GC overhead and the ability to constrain the JVM heap memory associated with the analytic query mode.

      Note: One common source of GC overhead is a slow disk, which causes objects to be retained on the heap for longer.

      - BLZG-1102 Refactor quads mode access paths
      - Enabler for quads mode RTO (BLZG-1240) and better query optimization. We need to refactor the quads access paths to (a) make query plans easier to reason about, since this refactoring exposes the DISTINCT SPO filter for default graph queries; (b) make it possible to use the runtime query optimizer (RTO) with quads; and (c) make it possible to use the analytic query mode fully with quads (the DISTINCT SPO filter cannot use the native heap right now due to the filter life cycle).
      - BLZG-537 Examine SCAN + FILTER versus EXPANDER pattern for default and named graph queries
      - BLZG-604 Eliminate unnecessary dechunking and rechunking
      - BLZG-533 Vector the query engine on the native heap. Efficient storage of intermediate solutions, perhaps column-wise. Consider moving them off the Java object heap and onto the native heap so we can avoid a GC penalty. Related: reorganize to allow vectoring of larger chunks, and modify the engine to use a configured number of threads for query processing, doing the work within the thread (if we are column-wise then we should have access to the data we need, though we might need some pre-fetch scheduling).
      - BLZG-772 Optimize exception patterns
      - BLZG-478 Reduce SPARQL parser heap churn using custom CharStream impl.
      - #XXXX Complete coverage for the analytic query mode to bound object heap for all query patterns.
      - BLZG-1250 Generalized Aggregation with HTree
      - BLZG-1249 Native memory implementation of ORDER BY
      - BLZG-1251 Allow analytic operators to spill to the disk.
      - BLZG-42 Per query memory limit for analytic query mode.
      - BLZG-43 Add System property to enable analytic query mode.
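
      The ideas behind BLZG-533 (column-wise intermediate solutions on the native heap) and BLZG-42 (a per-query memory limit) can be illustrated with a rough sketch. This is not Blazegraph code: the class name NativeColumnStore, its methods, and the assumption that every binding is already encoded as a 64-bit long are all hypothetical. For simplicity the sketch allocates the whole budget up front. Only java.nio calls from the standard library are used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

/**
 * Hypothetical sketch (not a Blazegraph class): stores solutions
 * column-wise in direct (off-heap) buffers under a fixed byte budget.
 * Assumes each variable binding is encoded as a 64-bit long.
 */
final class NativeColumnStore {
    private final LongBuffer[] columns; // one direct buffer per variable
    private final int capacity;         // maximum number of solutions (rows)
    private int size = 0;               // solutions stored so far

    NativeColumnStore(int nvars, long maxBytes) {
        if (nvars <= 0 || maxBytes < 0) {
            throw new IllegalArgumentException("nvars > 0 and maxBytes >= 0 required");
        }
        // Derive the row capacity from the byte budget for this query.
        long rows = maxBytes / ((long) nvars * Long.BYTES);
        this.capacity = (int) Math.min(rows, Integer.MAX_VALUE / Long.BYTES);
        this.columns = new LongBuffer[nvars];
        for (int i = 0; i < nvars; i++) {
            // Direct buffers live outside the Java object heap, so their
            // contents are not traced or copied by the garbage collector.
            columns[i] = ByteBuffer.allocateDirect(capacity * Long.BYTES)
                    .order(ByteOrder.nativeOrder())
                    .asLongBuffer();
        }
    }

    /** Appends one solution; returns false once the budget is exhausted. */
    boolean add(long... row) {
        if (row.length != columns.length) {
            throw new IllegalArgumentException("expected " + columns.length + " values");
        }
        if (size == capacity) {
            return false; // caller decides: spill to disk, or fail the query
        }
        for (int i = 0; i < row.length; i++) {
            columns[i].put(size, row[i]); // absolute put into column i
        }
        size++;
        return true;
    }

    /** Reads the binding of variable {@code var} in solution {@code row}. */
    long get(int row, int var) {
        if (row < 0 || row >= size) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return columns[var].get(row);
    }

    int size() { return size; }

    int capacity() { return capacity; }

    public static void main(String[] args) {
        // Two variables with a 48-byte budget leave room for 3 solutions.
        NativeColumnStore store = new NativeColumnStore(2, 48);
        for (long i = 0; i < 5; i++) {
            boolean stored = store.add(i, i * 10);
            System.out.println("solution " + i + " stored: " + stored);
        }
    }
}
```

      Returning false from add, rather than throwing, leaves the policy to the caller: an operator could spill to disk (as BLZG-1251 proposes) or terminate the query when its limit is reached.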

      - See bigdata/src/architecture/query-cost-model.xml
      - This document covers how we plan quads queries against named graphs and default graphs, for both the Journal and the IBigdataFederation.


              bryanthompson bryanthompson