Blazegraph (by SYSTAP) / BLZG-1089

SELECT COUNT(...) (DISTINCT|REDUCED) {single-triple-pattern} is slow.

    Details

      Description

      We can optimize the following query pattern (at any level) using the ESTCARD API (fast-range counts).

        SELECT COUNT(...) (DISTINCT|REDUCED) {single-triple-pattern} 
      

      Currently, the cost of this operation is O(N), where N is the number of triples in the scanned key range. This is because it is written to use a key-range scan of the selected index (SPO(C)). For triple store instances that are neither scale-out nor configured for full read-write transactions (which is to say, nearly all deployments), an exact answer can be obtained in 2 key probes.
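
To illustrate the difference between the key-range scan and the two-probe count, here is a minimal sketch. It is not Blazegraph code: the `SortedIndex` class, its method names, and the use of integer term IDs are assumptions made for illustration, and a sorted Python list stands in for the B+Tree index. The idea is that the number of keys in a range equals the position of the upper bound minus the position of the lower bound, so two probes are enough.

```python
from bisect import bisect_left


class SortedIndex:
    """Toy stand-in for an SPO index (hypothetical, not the Blazegraph API).

    Triples are tuples of integer term IDs, stored sorted and without
    duplicates.
    """

    def __init__(self, triples):
        self.keys = sorted(set(triples))

    def scan_count(self, from_key, to_key):
        # Key-range scan: visits every key in [from_key, to_key).
        return sum(1 for k in self.keys if from_key <= k < to_key)

    def fast_range_count(self, from_key, to_key):
        # Two probes: locate each bound, then subtract the positions.
        lo = bisect_left(self.keys, from_key)
        hi = bisect_left(self.keys, to_key)
        return hi - lo


index = SortedIndex([
    (1, 10, 100),
    (1, 10, 101),
    (1, 11, 100),
    (2, 10, 100),
])

# Pattern with subject 1 bound and predicate/object unbound:
# the key range is [(1,), (2,)).
print(index.scan_count((1,), (2,)))
print(index.fast_range_count((1,), (2,)))
```

Both calls return the same count for a given range. The scan touches every key in the range, while the two-probe version only performs two binary searches.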

      This can be achieved using an AST Optimizer in a manner very similar to BLZG-1087 (DISTINCT PREDICATE is slow).

      In fact, for SPARQL QUERY we can just rewrite this into BIND() against a constant where the AST rewrite probes the index to obtain the constant value. However, we could also support runtime resolution. This has the advantage that variables could be projected into a sub-select and their various range counts obtained using the fast range count mechanism.
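
The rewrite into a constant described above can be sketched as follows. This is an illustrative toy, not the actual AST optimizer: `fast_range_count`, `rewrite_count_query`, the integer term IDs, and the string-based query output are all hypothetical. The sketch probes a sorted key list for the count and then emits a query that simply binds that constant.

```python
from bisect import bisect_left


def fast_range_count(sorted_keys, prefix):
    """Count keys that start with `prefix` using two probes.

    Assumes keys are tuples of integer term IDs (an assumption of this
    sketch), so the successor of a prefix is found by incrementing its
    last component.
    """
    if not prefix:
        # Nothing bound: every key matches.
        return len(sorted_keys)
    to_key = prefix[:-1] + (prefix[-1] + 1,)
    return bisect_left(sorted_keys, to_key) - bisect_left(sorted_keys, prefix)


def rewrite_count_query(sorted_keys, prefix, count_var="count"):
    """Return a query that binds the precomputed count as a constant."""
    n = fast_range_count(sorted_keys, prefix)
    return f"SELECT ?{count_var} {{ BIND({n} AS ?{count_var}) }}"


keys = sorted([
    (1, 10, 100),
    (1, 10, 101),
    (1, 11, 100),
    (2, 10, 100),
])

print(rewrite_count_query(keys, (1,)))
```

In this sketch the count is resolved once, at rewrite time. The runtime-resolution alternative mentioned above would instead defer the probe until the bindings for the projected variables are known.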

            People

            Assignee:
            bryanthompson
            Reporter:
            bryanthompson