  1. Blazegraph (by SYSTAP)
  2. BLZG-1101

Feature request: pre-heat the journal on startup

    Details

      Description

      Starting the journal against a cold disk incurs significant IO latency compared with its performance once the index pages are cached. A warm-up procedure at startup addresses this.

      Workaround: on startup, use the DumpJournal utility (which is also accessible from the REST API) to force a full scan of all index pages. See http://wiki.bigdata.com/wiki/index.php/IOOptimization#Branching_Factors

      Proposed solution: use a modified version of the logic in DumpJournal to scan only the non-leaf nodes of the committed versions of the indices. This will force those nodes into both the file system cache and the Java heap.

      Note: the warmup procedure is configured through web.xml. It can be enabled using the warmupTimeout context-param.

        Issue Links

          Activity

          beebs Brad Bebee created issue -
          bryanthompson bryanthompson added a comment -

          Some notes:


          - We should use one thread-per index to reduce time.
          - We should support warm-up of all indices or warm-up of those associated with a bigdata namespace (kb.*).
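The two notes above can be sketched roughly as follows. This is a minimal illustration, not the Blazegraph implementation: `ParallelWarmupSketch`, `IndexScanner`, and the prefix-based namespace filter are hypothetical names and assumptions introduced here, and the index names passed in are whatever the caller supplies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: warm up each selected index on its own thread. */
final class ParallelWarmupSketch {

    /** Stand-in for whatever scans the pages of one index (not a Blazegraph type). */
    interface IndexScanner {
        long scan(String indexName) throws Exception;
    }

    /**
     * Scans every index whose name starts with namespacePrefix (null selects
     * all indices), one thread per index. Returns the number of indices whose
     * scan completed successfully within the timeout.
     */
    static int warmUp(List<String> indexNames, String namespacePrefix,
                      IndexScanner scanner, long timeoutMillis) {
        List<Callable<Long>> tasks = new ArrayList<>();
        for (String name : indexNames) {
            if (namespacePrefix == null || name.startsWith(namespacePrefix)) {
                tasks.add(() -> scanner.scan(name));
            }
        }
        if (tasks.isEmpty()) {
            return 0;
        }
        // One thread per selected index, as suggested in the notes above.
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            int completed = 0;
            // invokeAll cancels any task that has not finished by the timeout.
            for (Future<Long> f : pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS)) {
                if (f.isCancelled()) {
                    continue;
                }
                try {
                    f.get();
                    completed++;
                } catch (ExecutionException e) {
                    // The scan of this index failed; it does not count as warmed up.
                }
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Passing `"kb."` as the prefix would restrict the warm-up to indices of the `kb` namespace, while `null` warms up everything.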

          The method that I intend to reuse is the following. See DumpJournal for an example of its use. However, this method reads all pages of an index, not just the non-leaf nodes. It should be parameterized to support reading of just the non-leaf nodes.

                          final BaseIndexStats stats = ndx.dumpPages(dumpPages);
          

          If we pass in the current level during the index scan then we can conditionally recurse to the next level of a B+Tree iff we know that the next level will have non-leaf nodes. This is knowable for the B+Tree because it is a balanced tree and the leaves always appear at the same depth along any path from the root (for a given snapshot of the tree).

                              // normal read following the node hierarchy, using cache, etc.
                              final AbstractNode<?> child = ((Node) node).getChild(i);
          
                              // recursive dump
                              dumpPages(ndx, child, stats);
          

          However, this is not true of the HTree. The HTree is not a balanced tree and we do not think that we have any mechanism to tell whether the child is a leaf or not based on inspection of the parent directory page. In fact, we are not using durable HTree indices for the triple/quad store at this time (just for the analytic query mode) so this is not really a pressing concern. And even if we had to warm up HTree indices we could always just scan the whole thing.
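The level-based pruning described above for the B+Tree can be sketched with a toy page store. This is only an illustration under stated assumptions: `Page`, `Store`, and `warm` are hypothetical stand-ins rather than Blazegraph classes, and the sketch assumes the height of the tree (the number of levels, including the leaf level) is known before the scan starts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: read only the non-leaf pages of a balanced tree. */
final class NonLeafWarmupSketch {

    /** A page holds the addresses of its children; a leaf has none. */
    static final class Page {
        final int[] childAddrs;
        Page(int... childAddrs) { this.childAddrs = childAddrs; }
    }

    /** Toy page store that records every address that was read. */
    static final class Store {
        final Map<Integer, Page> pages = new HashMap<>();
        final List<Integer> reads = new ArrayList<>();
        Page read(int addr) {
            reads.add(addr);
            return pages.get(addr);
        }
    }

    /**
     * Warms the subtree below an already-read node.
     *
     * @param level  level of node, with the root at level 0
     * @param height number of levels in the tree, so leaves sit at height - 1
     */
    static void warm(Store store, Page node, int level, int height) {
        // Because the tree is balanced, the children of this node are leaves
        // exactly when level + 1 is the leaf level; in that case skip the reads.
        if (level + 1 >= height - 1) {
            return;
        }
        for (int addr : node.childAddrs) {
            Page child = store.read(addr); // the only place IO would happen
            warm(store, child, level + 1, height);
        }
    }
}
```

The decision to descend is made from the level alone, without touching the child page, which is what keeps the leaf reads off the disk. As the comment notes, the same test is not available for an unbalanced structure such as the HTree.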

          ----

          Another approach to a warm-up protocol would be to do breadth-first reads of the index pages. So, we read the root. Then we use a pool of threads to read all immediate (non-leaf) children of the root. Then we recursively descend in breadth-first steps. The thread pool would obviously need to be shared and bounded. This approach could be used for a single index or for all indices. But I think that the simple approach of using one thread per index and scanning the indices in parallel is probably sufficient. As long as we do not force the B+Tree leaf reads, it should do a very good job of warming up the indices. There will still be IO latency involved when we do queries, but not more than 1 IO per leaf on average.
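The breadth-first alternative could look roughly like the sketch below. It is a hedged illustration only: `BreadthFirstWarmupSketch` and `PageReader` are hypothetical names, and the reader is assumed to return the addresses of a page's non-leaf children, which is an abstraction invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: level-by-level warm-up using a bounded thread pool. */
final class BreadthFirstWarmupSketch {

    /** Reads one page and returns the addresses of its non-leaf children. */
    interface PageReader {
        List<Integer> readNonLeafChildren(int addr);
    }

    /** Reads the root, then each level of non-leaf pages in parallel. Returns pages read. */
    static int warm(int rootAddr, PageReader reader, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads); // bounded pool
        try {
            int pagesRead = 0;
            List<Integer> frontier = new ArrayList<>();
            frontier.add(rootAddr);
            while (!frontier.isEmpty()) {
                // One read task per page on the current level.
                List<Callable<List<Integer>>> tasks = new ArrayList<>();
                for (int addr : frontier) {
                    tasks.add(() -> reader.readNonLeafChildren(addr));
                }
                // invokeAll blocks until the whole level has been read.
                List<Integer> next = new ArrayList<>();
                for (Future<List<Integer>> f : pool.invokeAll(tasks)) {
                    next.addAll(f.get());
                    pagesRead++;
                }
                frontier = next;
            }
            return pagesRead;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("warm-up interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("page read failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

To share the pool across all indices, as the comment suggests, the executor would be created once by the caller and passed in instead of being built per call.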

          bryanthompson bryanthompson added a comment -

          Refactored ICheckpointProtocol#dumpPages(boolean:recursive) into page visitor pattern with options for recursive and whether or not to visit the leaves. See 9151be866b58cad3f13665eed91487efafebf77e in TICKET_1050 branch.

          Merged branch TICKET_1050 to master. See https://github.com/SYSTAP/bigdata/pull/48

          done. Expose "warmup" method on NSS. Specify namespace and then request warmup. Or allow this to be configured in web.xml.
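For readers unfamiliar with the pattern, a page visitor with `recursive` and `visitLeaves` options might be shaped like the sketch below. This is not the code from the referenced commit: every name here (`PageVisitorSketch`, `PageVisitor`, `visitPages`) is hypothetical, and the toy tree keeps its children in memory, whereas a real warm-up would decide whether to descend before reading a leaf from disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a page visitor with recursion and leaf options. */
final class PageVisitorSketch {

    /** Callback invoked once for each visited page. */
    interface PageVisitor {
        void visit(Page page, int depth);
    }

    /** Toy in-memory page; a page without children is a leaf. */
    static final class Page {
        final String name;
        final List<Page> children = new ArrayList<>();
        Page(String name, Page... kids) {
            this.name = name;
            this.children.addAll(Arrays.asList(kids));
        }
        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /**
     * Visits the given page and, when recursive is true, its descendants.
     * Leaves are skipped entirely unless visitLeaves is true.
     */
    static void visitPages(Page page, int depth, boolean recursive,
                           boolean visitLeaves, PageVisitor visitor) {
        if (page.isLeaf() && !visitLeaves) {
            return;
        }
        visitor.visit(page, depth);
        if (!recursive) {
            return;
        }
        for (Page child : page.children) {
            visitPages(child, depth + 1, recursive, visitLeaves, visitor);
        }
    }
}
```

With `recursive = true` and `visitLeaves = false` the visitor sees only interior pages, which corresponds to the warm-up use case; enabling both options corresponds to a full page dump.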

          bryanthompson bryanthompson added a comment -

          See web.xml for how to enable warmup.

          Warmup is disabled by default. Edit the warmupTimeout context-param to specify the timeout in milliseconds for the warmup procedure. By default, the warmup procedure warms up all namespaces. The warmupNamespaceList context-param may be used to warm up only selected namespaces.

          Commit a4b94afe98597ba7ab7b49e8acfe8181a3201c79
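As a rough illustration of the configuration described above, the two context-params could be set in web.xml along these lines. The parameter names come from this comment; the values are illustrative assumptions only (a 60 second timeout and a single namespace named `kb`), and the web.xml shipped with the release remains the authority on the accepted value formats.

```xml
<!-- Sketch only: values are illustrative, not defaults taken from the project. -->
<context-param>
  <param-name>warmupTimeout</param-name>
  <!-- Timeout for the warmup procedure, in milliseconds. -->
  <param-value>60000</param-value>
</context-param>
<context-param>
  <param-name>warmupNamespaceList</param-name>
  <!-- Restrict warmup to selected namespaces; omit to warm up all namespaces. -->
  <param-value>kb</param-value>
</context-param>
```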

          bryanthompson bryanthompson added a comment -

          I've documented all web.xml config-param entries on the NSS wiki page.

          Change history (Field: Original Value -> New Value)

          - beebs Brad Bebee: Workflow: Trac Import v2 [ 12937 ] -> Trac Import v3 [ 14414 ]
          - beebs Brad Bebee: Workflow: Trac Import v3 [ 14414 ] -> Trac Import v4 [ 15743 ]
          - beebs Brad Bebee: Workflow: Trac Import v4 [ 15743 ] -> Trac Import v5 [ 17129 ]
          - beebs Brad Bebee: Labels: set to Issue_patch_20150625
          - beebs Brad Bebee: Status: Closed - Won't Fix [ 6 ] -> Open [ 1 ]
          - beebs Brad Bebee: Status: Open [ 1 ] -> Accepted [ 10101 ]
          - beebs Brad Bebee: Status: Accepted [ 10101 ] -> In Progress [ 3 ]
          - beebs Brad Bebee: Status: In Progress [ 3 ] -> Resolved [ 5 ]
          - beebs Brad Bebee: Status: Resolved [ 5 ] -> In Review [ 10100 ]
          - beebs Brad Bebee: Resolution: Fixed [ 1 ] -> Done [ 10000 ]; Status: In Review [ 10100 ] -> Done [ 10000 ]
          - beebs Brad Bebee: Workflow: Trac Import v5 [ 17129 ] -> Trac Import v6 [ 18296 ]
          - michaelschmidt: Fix Version/s: set to BLAZEGRAPH_RELEASE_1_5_2 [ 10164 ]
          - beebs Brad Bebee: Workflow: Trac Import v6 [ 18296 ] -> Trac Import v7 [ 19693 ]
          - bryanthompson: Link: This issue relates to BLZG-1546 [ BLZG-1546 ]
          - beebs Brad Bebee: Workflow: Trac Import v7 [ 19693 ] -> Trac Import v8 [ 21316 ]

            People

            • Assignee: bryanthompson
            • Reporter: bryanthompson
            • Votes: 0
            • Watchers: 3
