Uploaded image for project: 'Blazegraph (by SYSTAP)'
  1. Blazegraph (by SYSTAP)
  2. BLZG-382

Manage the B+Tree page size for the RWStore.




      The B+Tree currently manages the size of the nodes and leaves in order to satisfy the target branching factor. This works out well for the WORM and scale-out, since we are always using perfect fit data records on the backing WORM journal and IndexSegment instances.

      However, with the RWStore we are better off managing the size of the page instead of the branching factor since this makes it possible to be more efficient on a backing store which is organized around fixed size allocation slots.

      Support a modified leaf split policy which adjusts the #of tuples in order to target an "as-serialized" page size. Since we may apply record level compression as well, it may still be necessary to target a page size which is larger than the actual desired page size on the backing store and then trigger a split if the encoded and compressed leaf is larger than the desired page size.

      The HTree (Extendible Hash Tree) is already moving in this direction -- see https://sourceforge.net/apps/trac/bigdata/ticket/203. For that index structure, we are managing a buffer of a given size and moving large keys and/or values out of line into raw records if they are more than a reasonable percentage of the page size. However, we can not move the keys out of line (at least, not as easily) for the B+Tree since the key is integral to the search.

      For the B+Tree, we can explore the standard techniques of a 2/3 split (B*Tree) in order to maintain pages closer to 100% page capacity (tuples are rotated among siblings until 2 siblings are full, at which point the siblings are split into three siblings). This is easy to do when the data are managed in a buffer of a fixed capacity, but it is more complicated to do when the data are managed as objects for a mutable leaf.

      This issue interacts then with the data structure for the mutable node/leaf of the B+Tree. Other interesting things to examine are (a) leaving "gaps" in the data structure to minimize the copying up/down of tuples; and (b) using MonetDB style "cracking" on the page to impose an eventual order. Cracking might be harmonized with exact match search by hash codes (per the HTree).




            bryanthompson bryanthompson
            bryanthompson bryanthompson
            0 Vote for this issue
            1 Start watching this issue