Affects Version/s: None
Fix Version/s: BLAZEGRAPH_RELEASE_1_5_2
The indexCache in AbstractJournal retains access to recently used indices (each index object is an ICheckpointProtocol), keyed by (namespace, timestamp). Under heavy concurrent update pressure (as demonstrated by 16 or 32 client BSBM EXPLORE+UPDATE), the pace of updates causes this cache to grow quite large: as large as 1M entries after 120 runs of the cited benchmark.
The cache is not leaking memory without bound. Of those 1M entries, all but 14 (two commit points) will become weakly reachable, and hence eligible to be swept by GC, once the timeout for those entries has expired. That timeout currently defaults to 60 seconds.
I believe that this timeout can be reduced to 5 seconds for such a workload, and that this low timeout might be a good default.
The use case for retaining a larger timeout is a workload where there are many namespaces and only occasional access to those namespaces. In this context, if a namespace is accessed again within 60 seconds then it will remain in memory and stay "hot".
The timeouts are imposed by the cleaner service in SynchronizedHardReferenceQueueWithTimeout. This service runs every 5 seconds. When it runs it scans the entries on the hard reference queue and clears the hard reference for any entry that has not been touched within the last 60 seconds.
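The cleaner behavior described above can be illustrated with a minimal sketch. This is not the SynchronizedHardReferenceQueueWithTimeout implementation; the class TimeoutHardReferenceQueue and its touch/clean/size methods are hypothetical names chosen for illustration, and the caller passes the clock value explicitly so that the expiry logic is easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch (NOT the Blazegraph class): a queue of hard references
 * in which an entry is dropped once it has gone untouched for longer than
 * the configured timeout.
 */
class TimeoutHardReferenceQueue<T> {

    /** A hard reference together with the time at which it was last touched. */
    private static final class Entry<T> {
        final T ref;
        long lastTouchNanos;

        Entry(T ref, long nowNanos) {
            this.ref = ref;
            this.lastTouchNanos = nowNanos;
        }
    }

    private final List<Entry<T>> entries = new ArrayList<>();
    private final long timeoutNanos;

    TimeoutHardReferenceQueue(long timeout, TimeUnit unit) {
        this.timeoutNanos = unit.toNanos(timeout);
    }

    /** Add the reference, or refresh its touch time if it is already queued. */
    synchronized void touch(T ref, long nowNanos) {
        for (Entry<T> e : entries) {
            if (e.ref == ref) {
                e.lastTouchNanos = nowNanos;
                return;
            }
        }
        entries.add(new Entry<>(ref, nowNanos));
    }

    /**
     * One cleaner pass: scan the queue and clear the hard reference for every
     * entry not touched within the timeout. Returns the number cleared.
     */
    synchronized int clean(long nowNanos) {
        int cleared = 0;
        for (Iterator<Entry<T>> it = entries.iterator(); it.hasNext();) {
            if (nowNanos - it.next().lastTouchNanos > timeoutNanos) {
                it.remove();
                cleared++;
            }
        }
        return cleared;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

In such a design, a periodic task (for example one scheduled with ScheduledExecutorService.scheduleWithFixedDelay at a 5 second period) would call clean(System.nanoTime()). Once the hard reference is cleared, an index that is otherwise held only through a weak reference becomes eligible for collection, which is why a shorter timeout should shrink the number of entries kept alive between GC cycles.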
These behaviors can be configured using com.bigdata.journal.Options. The relevant parameter is given below. While there are (many) other related parameters, I suspect that we only need to tune this one.
This ticket is to examine the performance impact of this parameter using BSBM EXPLORE+UPDATE with 1, 16, 32, and 64 threads and recommend a setting that provides higher throughput overall.