Affects Version/s: BLAZEGRAPH_2_1_0
Fix Version/s: BLAZEGRAPH_2_X_BACKLOG
I gave the scenario sketched at https://gist.github.com/ktk/a04e267dd776da2511692e96fc2b5d99 a quick try. First, I executed the following two COUNT versions of the query (on a non-optimized, out-of-the-box Blazegraph master > 2.1.0, triples mode, non-analytic), both reporting ~30M triples:
1a.) Query as is
The query took 4min 45s, with large intermediate results for some of the joins being produced (>30M). Still, the query plan is fully pipelined.
1b.) Hand optimized version
This query is much faster: it essentially forces a bushy plan in combination with an efficient merge join, and it goes through in ~45s.
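For readers unfamiliar with the technique, here is a rough sketch of how a bushy plan can be forced in Blazegraph using named subqueries (WITH ... AS %name / INCLUDE). This is not the query from the gist: the ex: prefix and the predicates p1..p4 are hypothetical placeholders, and whether the final join of the two solution sets actually runs as a merge join depends on the optimizer and on the join variables involved.

```sparql
# Hypothetical sketch only: ex:p1..ex:p4 are placeholder predicates,
# not the patterns used in the gist.
PREFIX ex: <http://example.org/>

SELECT (COUNT(*) AS ?cnt)
WITH {
  # Left subtree of the bushy plan, evaluated independently
  SELECT ?s ?a WHERE {
    ?s ex:p1 ?x .
    ?x ex:p2 ?a .
  }
} AS %left
WITH {
  # Right subtree of the bushy plan, evaluated independently
  SELECT ?s ?b WHERE {
    ?s ex:p3 ?y .
    ?y ex:p4 ?b .
  }
} AS %right
WHERE {
  # The two materialized solution sets are joined on the shared variable ?s
  INCLUDE %left
  INCLUDE %right
}
```

The idea is that each subtree is computed once and then joined as a whole, instead of feeding every intermediate solution through one long left-deep pipeline.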
2.) Executing the original CONSTRUCT takes, according to Brad, about 40 min (similar to the number reported on the website).
Conclusion: query evaluation performance could definitely be improved by better planning (-> bushy plan), but it doesn't seem to be the major bottleneck here. I'm also wondering where the time goes; unnecessary materialization might be one root cause, but possibly not the only one. Estimating an insert rate of (only) 50k stmts/sec, which is what we get for loading, it should even be possible to get this down to 2-3 minutes.