Identify a series of stress tests and performance tests to run on a CI cluster. Set up a CI cluster and automate the execution of those tests.
Cluster CI performance tests should report high-level performance metrics so that we can easily identify and track performance regressions, and should also archive the detailed performance metrics for each activity.
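As a minimal sketch of that split between summary reporting and detailed archiving, assuming each activity produces a list of elapsed times in seconds. The function names, the JSON archive layout, and the 10% tolerance are all hypothetical choices for illustration, not part of any existing harness:

```python
import json
from pathlib import Path
from statistics import mean


def summarize(samples):
    """Collapse detailed per-activity timings (seconds) into one mean per activity."""
    return {name: mean(times) for name, times in samples.items()}


def find_regressions(baseline, current, tolerance=0.10):
    """Return activities whose mean time grew by more than `tolerance` (a fraction).

    `tolerance` is an assumed threshold; tune it to the noise level of the cluster.
    """
    regressions = {}
    for name, base in baseline.items():
        now = current.get(name)
        if now is not None and base > 0 and (now - base) / base > tolerance:
            regressions[name] = (base, now)
    return regressions


def archive(samples, directory, run_id):
    """Write the detailed metrics for one CI run to <directory>/<run_id>.json."""
    path = Path(directory) / f"{run_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(samples, indent=2, sort_keys=True))
    return path
```

A CI pass would call `archive` with the raw timings, then compare `summarize(samples)` against the summary stored for the previous accepted run and fail or flag the build if `find_regressions` returns anything.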
Cluster-based performance tests should include a series of data load performance tests (LUBM, Uniprot, Billion Triples Challenge, etc.), as well as standard query benchmarks (LUBM, BSBM).
Here is a series of things that we should be running for the cluster-based CI:
Depending on whether recent changes mostly affect query or data load, the data could be reloaded on every CI pass or only periodically.
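One way to express that reload policy is sketched below. It assumes the CI job can tell whether a change touched the data load code and can count passes since the last reload; the function name, both parameters, and the default of 10 passes are hypothetical placeholders:

```python
def should_reload(load_code_changed, passes_since_reload, max_passes=10):
    """Decide whether this CI pass should reload the data sets.

    Reload whenever the data load code changed; otherwise reload only
    periodically, after `max_passes` passes have reused the same data.
    """
    if load_code_changed:
        return True
    return passes_since_reload >= max_passes
```

With this shape, query-only changes reuse the already loaded data sets, which keeps most CI passes short, while load changes are still always exercised.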
In addition to these standard benchmarks, we should also exercise edge conditions on the cluster at high volume. For example, queries such as the following can be run against any of those data sets.