Implement a scalable ORDER BY operator using the native process heap to buffer the solutions.
The easiest first approximation for this would buffer the solution sets on the memory store using a "Bucket" object to maintain the list of addresses for solutions having the same hash code, compute an order preserving hash code from the keys for the ORDER BY clause, and produce two correlated arrays:
- hash // the order preserving hash code
- Bucket // the collision bucket for that hash code
The arrays can then be sorted based on the hash code.
The solutions are delivered one "bucket" worth at a time. The data in the buckets are unordered, but the buckets themselves are ordered. Therefore, each bucket needs to undergo a secondary sort when it is delivered to the downstream operator in the pipeline.
A secondary refinement would replace the use of the Bucket object with the HTree to store the data.