Affects Version/s: None
Fix Version/s: BLAZEGRAPH_2_X_BACKLOG
Another approach would remove the serialization and insertion into the write cache latency from the touch() thread. 2/3rds of evicted pages are clean, so there is no work (based on the counters we just removed). For the remainder, it is important to remember that these are the least recently used (LRU) pages (we could also examin LIRS policies again and see if we can fit them into this cache model). Also note that the total number of nodes read is quite small so most nodes are remaining in memory. The hot spot is therefore the eviction of dirty leaves. Since all index writes duri data load are blocked, sorted and vectored, the chance of contention (dirty leaf is evicted at basically the same time that we need it again) is in general small, but there is probably some transition period when the likelihood of that contends goes up (length of retention queue is on order of number of touched leaves on an index for a blocked, sorted, write).
The problem with changing things in this partial of the code is that the opportunity for concurrent mutation arises. This leads to data corruption in the index if all paths are not safe. So it is a big issue and the reason why these code changes will not be in 2.0 (aside from the negative performance gain).
What we did with the write cache service was decouple the inserts from the disk writes. Insert has nearly no latency.
If we can do the same for touch() and the dirty page serialization and eviction, then we will have removed the touch() hotspot. Bearing in mind that we need to guard against concurrent mutation, one approach is to place these items into a second queue that feeds a worked thread pool which handles the evictions. That is more of less what the level set eviction algorithm would be doing in loadChild(). But the code we have been using so far ensures safety by handling eviction during a page miss on a read, and the latency of the eviction is clearly too large for this to work. It needs to be more completely decoupled. Is means at we need to actively guard against concurrent mutation.
There could be several ways of doing this. We might be able to extend the reference semantics used for top-to-bottom navigation in e Btree to yank a dirty page out of the work queue or invalid it if taken by a worker. Or the worker could do e work in an demote the fashion using a friend object to store the serialized page. Once the serialization was done, we could check some new field (mutation count on the index page) and verify (with a new lock) that the page had not been dirtied, at which point we could replaced the data pointer with the serialized page pointer and update the persistent address in the parent (which needs to be atomic if we allow concurrent writers to avoid data corruption).
The touch() method would be less contended if we did this because the high latency work of page serialization would be asynchronous and handles by threads supporting that eviction queue. I think that contention would be rare, so we would not have many cases where work was done by an eviction thread and then discarded because the mutation count on the dirty page had been updated concurrently. And we could track the occurrences of those events.