This issue has been tracked down to a failure to hold a lock in the HardReferenceGlobalLRU when testing the cache for an entry under a key. On a cache hit, a cache entry near the end of the LRU chain could be concurrently 'recycled', and the value associated with the key in that entry replaced, before the entry was returned to the caller. The caller could therefore see a value that no longer belonged to the requested key.
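To make the race concrete, here is a minimal sketch of a recycling LRU in which get(key) holds the lock for the whole lookup. The class RecyclingLruSketch and all of its members are hypothetical names chosen for illustration; this is not the actual HardReferenceGlobalLRU code, only an assumption about the general shape of such a cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified recycling LRU (illustration only). */
class RecyclingLruSketch<K, V> {

    /** Mutable entry: put() may overwrite key and value when recycling it. */
    private static final class Entry<K, V> {
        K key;
        V value;
        Entry<K, V> prev, next;
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final int capacity;
    private Entry<K, V> head; // least recently used end of the chain
    private Entry<K, V> tail; // most recently used end of the chain

    RecyclingLruSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity < 1");
        this.capacity = capacity;
    }

    V get(K key) {
        // The lock is held across the lookup AND the read of e.value.
        // Without it, put() could recycle e between those two steps and
        // the caller would receive the value stored for another key.
        lock.lock();
        try {
            Entry<K, V> e = map.get(key);
            if (e == null) return null;
            unlink(e);
            linkAtTail(e);
            return e.value;
        } finally {
            lock.unlock();
        }
    }

    void put(K key, V value) {
        lock.lock();
        try {
            Entry<K, V> e = map.get(key);
            if (e != null) {
                unlink(e); // existing key: update in place
            } else if (map.size() >= capacity) {
                e = head;  // recycle the LRU entry instead of allocating
                unlink(e);
                map.remove(e.key);
            } else {
                e = new Entry<>();
            }
            e.key = key;
            e.value = value;
            map.put(key, e);
            linkAtTail(e);
        } finally {
            lock.unlock();
        }
    }

    private void unlink(Entry<K, V> e) {
        if (e.prev != null) e.prev.next = e.next; else head = e.next;
        if (e.next != null) e.next.prev = e.prev; else tail = e.prev;
        e.prev = null;
        e.next = null;
    }

    private void linkAtTail(Entry<K, V> e) {
        e.prev = tail;
        if (tail != null) tail.next = e; else head = e;
        tail = e;
    }
}
```

In this sketch the entry object is reused on eviction, which avoids allocation but means a reader must never touch an entry outside the lock.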
Two variants of the HardReferenceGlobalLRU have been created: a "recycler" version, which recycles the LRU Entry but holds the lock across get(key), and a non-recycler version, which does not recycle entries. The non-recycler version has higher throughput since it does not need to hold the lock in get(key). The recycler version is currently enabled by default, but we plan to enable the non-recycler version once we can verify that it does not cause GC problems on the clustered database.
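As a rough illustration of why a non-recycler design can skip the lock in get(key), the sketch below never mutates the key or value of a published entry, so a reader that has obtained an entry can safely return its value. Everything here is an assumption made for illustration: the class NonRecyclingLruSketch, its use of ConcurrentHashMap, and the best-effort tryLock() update of the access order are hypothetical and are not taken from the actual implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified non-recycling LRU (illustration only). */
class NonRecyclingLruSketch<K, V> {

    /** Key and value are final: an entry is never reused for another key. */
    private static final class Entry<K, V> {
        final K key;
        final V value;
        Entry<K, V> prev, next; // guarded by lock
        boolean linked;         // guarded by lock
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final ConcurrentHashMap<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    private final int capacity;
    private Entry<K, V> head; // least recently used end of the chain
    private Entry<K, V> tail; // most recently used end of the chain

    NonRecyclingLruSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity < 1");
        this.capacity = capacity;
    }

    V get(K key) {
        Entry<K, V> e = map.get(key); // lookup without the lock
        if (e == null) return null;
        // Assumption: the access order is updated on a best-effort basis,
        // and skipped when another thread currently holds the lock.
        if (lock.tryLock()) {
            try {
                if (e.linked) { unlink(e); linkAtTail(e); }
            } finally {
                lock.unlock();
            }
        }
        return e.value; // safe: value is final and never replaced
    }

    void put(K key, V value) {
        lock.lock();
        try {
            Entry<K, V> old = map.get(key);
            if (old != null) {
                unlink(old);
            } else if (map.size() >= capacity) {
                Entry<K, V> lru = head; // evicted entry becomes garbage
                unlink(lru);
                map.remove(lru.key, lru);
            }
            Entry<K, V> e = new Entry<>(key, value); // always a fresh entry
            map.put(key, e);
            linkAtTail(e);
        } finally {
            lock.unlock();
        }
    }

    private void unlink(Entry<K, V> e) {
        if (e.prev != null) e.prev.next = e.next; else head = e.next;
        if (e.next != null) e.next.prev = e.prev; else tail = e.prev;
        e.prev = null;
        e.next = null;
        e.linked = false;
    }

    private void linkAtTail(Entry<K, V> e) {
        e.prev = tail;
        if (tail != null) tail.next = e; else head = e;
        tail = e;
        e.linked = true;
    }
}
```

The trade-off in this sketch is that every insert allocates a new entry and every eviction leaves one for the garbage collector, which is one plausible reason to check GC behavior before making such a variant the default.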