  1. Blazegraph (by SYSTAP)
  2. BLZG-828

performance impact of NSPIN

    Details

    • Type: Bug
    • Status: Closed - Won't Fix
    • Resolution: Cannot Reproduce
    • Affects Version/s: BIGDATA_RELEASE_1_2_2
    • Fix Version/s: None
    • Component/s: Bigdata SAIL
    • Labels:
      None

      Description

      This ticket concerns the compile time constant:
      com.bigdata.relation.accesspath.BlockingBuffer.NSPIN

      The suggested resolution is to:
      a) replace this with two compile time constants, one for each of the two usages
      and
      EITHER
      b) substantially increase the value for the usage in _hasNext(long) (e.g. to 100000)
      OR
      c) add a configuration option for that usage
      OR
      d) both b) and c)
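A minimal sketch of what options (a) to (d) could look like, under stated assumptions. This is not the actual BlockingBuffer source: the class name SpinSettings, the constant names, the property name, the value of NSPIN_ADD, and the pollWithSpin helper are all hypothetical. In particular, whether _hasNext(long) spins before falling back to a timed wait in the way shown here is an assumption made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only; names and values are illustrative assumptions.
final class SpinSettings {
    /** (a) Spin count for the first usage. The value here is illustrative. */
    static final int NSPIN_ADD = 100;

    /** (a) + (b) Separate, much larger default for the usage in _hasNext(long). */
    static final int DEFAULT_NSPIN_READ = 100000;

    /** (c) Hypothetical system property that overrides the read-side default. */
    static final String NSPIN_READ_PROPERTY = "example.BlockingBuffer.NSPIN.READ";

    /** Parses a non-negative spin count, falling back to the default on bad input. */
    static int parseSpinCount(String raw, int defaultValue) {
        if (raw == null || raw.trim().isEmpty()) {
            return defaultValue;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : defaultValue;
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    /** (d) Configured read-side spin count, or the increased default if unset. */
    static int nspinRead() {
        return parseSpinCount(System.getProperty(NSPIN_READ_PROPERTY), DEFAULT_NSPIN_READ);
    }

    /**
     * Generic spin-then-wait poll: check the queue up to nspin times without
     * blocking, then fall back to a single timed wait. Returns null if nothing
     * arrives before the timed wait expires.
     */
    static <E> E pollWithSpin(BlockingQueue<E> queue, int nspin, long waitMillis)
            throws InterruptedException {
        for (int i = 0; i < nspin; i++) {
            E element = queue.poll(); // non-blocking check
            if (element != null) {
                return element;
            }
            Thread.yield(); // give producers a chance to run
        }
        return queue.poll(waitMillis, TimeUnit.MILLISECONDS);
    }
}
```

With this shape, the read-side value could be changed at startup (for example with -Dexample.BlockingBuffer.NSPIN.READ=100000) without touching the other usage.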

      ====

      The commentary is based on test results, and I will start with a description of the tests, then of the test hardware.

      The test dataset and queries

      The test dataset is approximately 57,000 quads of test data conforming to Syapse proprietary ontologies, over 24 named graphs.

      The test queries are 11 moderately easy SPARQL queries involving only a few joins each.

      The test queries are generated by some Python code, and the test harness asks the 11 queries three times over, in series, using the SPARQL endpoint.

      The test harness also allows the possibility of starting multiple parallel clients, and we consider
      a) a single client asking the 11 * 3 queries
      or
      b) six parallel clients asking a total of 6 * 11 * 3 queries. Note that each parallel client asks the queries in the same order, which maximizes both potential contention and potential cache hits.
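The shape of the two test configurations can be sketched as follows. The harness described in this ticket is Python code that is not shown here, so this Java version is only an illustration: HarnessSketch, QueryRunner, and runClients are hypothetical names, and the runner stands in for whatever sends a query to the SPARQL endpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the test harness structure, not the real harness.
final class HarnessSketch {
    /** Stand-in for the code that sends one query to the SPARQL endpoint. */
    interface QueryRunner {
        void run(String sparql) throws Exception;
    }

    /**
     * Starts the given number of parallel clients. Each client asks every
     * query in the same order, in series, and repeats the whole list
     * `repetitions` times. Returns the total number of queries asked.
     */
    static int runClients(List<String> queries, int clients, int repetitions,
                          QueryRunner runner) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                results.add(pool.submit(() -> {
                    int asked = 0;
                    for (int r = 0; r < repetitions; r++) {
                        for (String query : queries) {
                            runner.run(query);
                            asked++;
                        }
                    }
                    return asked;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // waits for that client to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Case a) corresponds to runClients(queries, 1, 3, runner) with 11 queries, and case b) to runClients(queries, 6, 3, runner).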

      The test hardware

      The primary hardware is my development machine, a MacBook Pro with an SSD and a quad-core processor (with hyperthreading, for 8 threads).

      The secondary hardware is an AWS instance:

      Linux/2.6.32-355-ec2 amd64
      Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz Family 6 Model 44 Stepping 2, GenuineIntel #CPU=2
      Sun Microsystems Inc. 1.6.0_27
      freeMemory=509153376
      

      The issue

      Prior to any changes at r7390 we observed the following conundrum.

      For a single client, the 33 queries take a wall-time of 13.3s, but the client CPU time is 1.6s and the server CPU time is 1.8s, leaving about 10 seconds unaccounted for, even assuming synchronized single threading (on a quad-core box).

      With 6 parallel clients, performance is somewhat better. The wall-time is 14.5s, the client CPU time is 14.3s and the server CPU time is 10.4s. That is still only about 20% load (8 threads x 14.5s), but a lot better than in the one-client case.

      Profiling with YourKit, using wall-time measurements and tracing, drew attention to the _hasNext() method above, and the

      With the adjustments described above, the system goes a lot faster.

      Single client: wall-time 2.4s client CPU 1.6s server CPU 1.9s

      Six parallel clients: wall-time 5.1s client CPU 14.5s server CPU 6.1s
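The arithmetic behind the figures quoted in this ticket can be checked with a short sketch. The class and method names are hypothetical. The load calculation follows the ticket's own "8 threads x wall-time" framing, which assumes client and server CPU time are both counted against the same 8 hardware threads.

```java
// Hypothetical helper for checking the timing figures quoted in this ticket.
final class LoadArithmetic {
    /** Hardware threads on the primary test machine (quad core, hyperthreaded). */
    static final int HARDWARE_THREADS = 8;

    /** Wall-time not explained by client or server CPU time, in seconds. */
    static double unaccounted(double wallSeconds, double clientCpu, double serverCpu) {
        return wallSeconds - clientCpu - serverCpu;
    }

    /** Fraction of the available thread-seconds spent as CPU time. */
    static double load(double wallSeconds, double clientCpu, double serverCpu) {
        return (clientCpu + serverCpu) / (HARDWARE_THREADS * wallSeconds);
    }

    /** Ratio of wall-time before the change to wall-time after it. */
    static double speedup(double wallBefore, double wallAfter) {
        return wallBefore / wallAfter;
    }

    public static void main(String[] args) {
        // Single client before: 13.3 - 1.6 - 1.8, about 9.9s unaccounted for.
        System.out.println(unaccounted(13.3, 1.6, 1.8));
        // Six clients before: (14.3 + 10.4) / (8 * 14.5), about 21% load.
        System.out.println(load(14.5, 14.3, 10.4));
        // Single client: 13.3 / 2.4, about 5.5x faster wall-time.
        System.out.println(speedup(13.3, 2.4));
        // Six clients: 14.5 / 5.1, about 2.8x faster wall-time.
        System.out.println(speedup(14.5, 5.1));
    }
}
```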

            People

            Assignee:
            martyncutcher martyncutcher
            Reporter:
            jeremycarroll jeremycarroll