• Type: New Feature
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BIGDATA_RELEASE_1_2_0
    • Fix Version/s: None
    • Component/s: B+Tree


      Implement and evaluate a "finger" for the B+Tree and possibly the HTree index. A finger is an abstraction which retains a reference to the most recently visited leaf/tuple of the index. Multiple fingers are possible.

      The purpose of the finger is to avoid top-down search through the index nodes by testing the leaf which the "finger" is "on" first to see if the desired key is in that leaf. Since bigdata uses a conditional insert pattern for mutation to avoid driving IOs when the update would not change the data in the index, it is a very common pattern to hit the same tuple twice in a row (for update). Since we order access paths during query, it is also a common pattern to have tightly clustered reads against the index. A finger would provide a fast path which avoids comparison with the node keys when the desired key is lies within the current leaf.

      %time	Milis	Name
      100%	2540	com.bigdata.btree.raba.codec.FrontCodedRabaCoder$[])
      63%	1610	com.bigdata.btree.Node.findChild(byte[])
      30%	750	com.bigdata.btree.Leaf.indexOf(byte[])

      As the table above shows, we spend 2/3rds of our time searching the nodes.

      The finger needs to retain at least the reference of the leaf. It could retain other information as well, such as the offset of the tuple in the leaf or the offset of the leaf in the total index order.

      The indices already have a getRootOrFinger() method which is a placeholder for this functionality. It seems likely that we could realize the finger as an interface with different implementations for read-only and mutable indices. The most promising place to locate the finger is on the Tuple object which is passed through the top-down navigation methods on the index. The BTree already exposes variant methods for lookup(), insert(), contains(), etc which accept the caller's Tuple and otherwise use an internal Tuple object. This makes it possible for each process operating on an index to have its own Tuple (for efficiency in copying data out of the index pages) and its own finger as a hidden field on that Tuple. Note that the HTree lacks this abstraction.

      There will be some subtle interactions between a finger and a mutable index view due to the weak references maintained between parents and children and the bottom up incremental eviction logic of the B+Tree. We will need to hold the stack of references from the root to the leaf. Otherwise out of order eviction could occur with the result that the index attempts to write out a dirty parent before it has written out the dirty children. For example, this could show up as the eviction of a parent and the clearing of all references to that parent while we still have a dirty child. When the child was evicted, it would discover that some parent reference had been cleared. This would break the eviction logic.

      We already have a Stack class on BTree which maintains precisely the necessary state for a finger over a mutable BTree. That class is used to support the mutable Cursor, which allows forward and reverse scans on the index and is safe under mutation. However, the cursor is slower than the nested iterator scan pattern which is used for pure forward traversal of a key range without concurrent mutation.

      For a FusedView, this should wind up with one finger per index component in the view.

      Also, optimize rangeCount() when the fromKey and toKey are the same. This should turn into contains().




            bryanthompson bryanthompson
            bryanthompson bryanthompson
            0 Vote for this issue
            1 Start watching this issue