We might benefit by choosing a non-SPO index when testing a fully bound triple. Consider:
query :- (a b ?c) AND (?c d e).
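Assuming that (a b ?c) is more selective, it runs first, so ?c is already bound when we reach (?c d e) and each probe of the second pattern is a fully bound point test. We would normally choose the SPO index for both access paths. However, the POS index will have better locality for (?c d e), because every probe shares the (d, e) key prefix there, so perhaps we would do better by sending the binding sets to that index?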
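To make the locality argument concrete, here is a minimal sketch on toy data. Everything in it (the function names, the synthetic subjects s000..s099, the extra "name" and "type" predicates) is made up for illustration and is not taken from the code base. It models each index as a sorted list of keys and measures how far apart the probes for (?c d e) land in SPO versus POS.

```python
from bisect import bisect_left

POSITION = {"S": 0, "P": 1, "O": 2}

def to_key(triple, order):
    """Permute an (s, p, o) triple into the key order of an index such as 'POS'."""
    return tuple(triple[POSITION[k]] for k in order)

def build_index(triples, order):
    """Model an index as a sorted list of permuted keys (a stand-in for a B+Tree)."""
    return sorted(to_key(t, order) for t in triples)

def probe_span(index, order, probes):
    """Distance between the first and last slot touched by a set of point tests."""
    slots = [bisect_left(index, to_key(t, order)) for t in probes]
    return max(slots) - min(slots)

# Toy data (assumption): 100 subjects, each with three statements.
triples = []
for i in range(100):
    s = f"s{i:03d}"
    triples += [(s, "d", "e"), (s, "name", f"n{i:03d}"), (s, "type", "T")]

# Bindings for ?c produced by the first pattern, turned into fully bound probes.
probes = [(f"s{i:03d}", "d", "e") for i in (10, 20, 30, 40, 50)]

spo_span = probe_span(build_index(triples, "SPO"), "SPO", probes)
pos_span = probe_span(build_index(triples, "POS"), "POS", probes)
print("SPO span:", spo_span, "POS span:", pos_span)
```

On this data the POS probes fall inside the contiguous run of (d, e, *) keys, while the SPO probes are interleaved with every other statement about each subject. The gap grows with the number of statements per subject.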
The guiding principle would be, "when fully bound, choose the index with better locality based on the variable(s) in the triple pattern."
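One possible reading of that principle, as a rough sketch: pick the index whose key order puts the pattern's constants first, so that the variable positions (which change from one binding set to the next) fall at the end of the key. The function name choose_index is hypothetical, and the sketch assumes a triple store with exactly the SPO, POS and OSP indices; quads (SPOC and friends) are not covered.

```python
# Assumption: the three triple-store indices, with SPO listed first as the default.
INDICES = ("SPO", "POS", "OSP")

def is_var(term):
    """Variables are written with a leading '?', as in the query above."""
    return term.startswith("?")

def choose_index(pattern):
    """Return the index whose key starts with the longest run of constants.

    `pattern` is an (s, p, o) tuple of terms. Ties go to the earliest entry
    in INDICES, so a pattern with no variables still uses SPO.
    """
    terms = dict(zip("SPO", pattern))

    def constant_prefix_len(index):
        n = 0
        for position in index:
            if is_var(terms[position]):
                break
            n += 1
        return n

    # max() returns the first maximal element, which gives the SPO tie-break.
    return max(INDICES, key=constant_prefix_len)

print(choose_index(("a", "b", "?c")))   # first pattern in the query
print(choose_index(("?c", "d", "e")))   # second pattern in the query
```

With more variables in the pattern the constant prefix gets shorter, which matches the expectation that the benefit shrinks as fewer positions are fixed.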
If you think this makes sense, let's file an issue for this. I would have to review LUBM/BSBM to be certain, but I would not be surprised if both of those benchmarks included queries which had the same characteristic. As the number of variables in the second triple pattern increases, there will be less locality in the index, so this might work better for triple patterns with one variable than for those with two.
One other factor, especially in scale-out, is that we have a Bloom filter in front of the primary index for the statement relation (SPO/SPOC). The Bloom filter provides a significant performance boost. If we decide to choose indices other than SPO/SPOC, then we should make sure that we enable the Bloom filter for the rest of the statement indices.
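For reference, a minimal sketch of what "a Bloom filter in front of an index" means for point tests. This is a generic textbook Bloom filter, not the project's implementation; the class names, the bit and hash counts, and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    """Textbook Bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key!r}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

class FilteredIndex:
    """A statement index (modelled as a set of keys) guarded by a Bloom filter."""

    POSITION = {"S": 0, "P": 1, "O": 2}

    def __init__(self, order, triples):
        self.order = order
        self.filter = BloomFilter()
        self.keys = set()
        for t in triples:
            key = self._key(t)
            self.keys.add(key)
            self.filter.add(key)
        self.index_reads = 0  # counts how often the filter let a probe through

    def _key(self, triple):
        return tuple(triple[self.POSITION[k]] for k in self.order)

    def contains(self, triple):
        key = self._key(triple)
        if not self.filter.might_contain(key):
            return False          # definite miss: the index is never touched
        self.index_reads += 1     # possible hit: fall through to the real index
        return key in self.keys

pos = FilteredIndex("POS", [("s1", "d", "e"), ("s2", "d", "e")])
print(pos.contains(("s1", "d", "e")), pos.contains(("s9", "d", "e")))
```

The point of the sketch is that the filter is keyed on the permuted key, so each index that serves fully bound point tests would need its own filter.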