Details

    • Type: Task
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BIGDATA_RELEASE_1_1_0
    • Fix Version/s: None
    • Component/s: Other

      Description

      This is a bug fix release for both the 1.0.x and 1.1.x branches.

      Tickets resolved against these releases include:

      1.0.4:
      http://sourceforge.net/apps/trac/bigdata/ticket/443 (Logger for RWStore transaction service and recycler)
      http://sourceforge.net/apps/trac/bigdata/ticket/445 (RWStore does not track tx release correctly)
      http://sourceforge.net/apps/trac/bigdata/ticket/437 (Thread-local cache combined with unbounded thread pools causes effective memory leak: termCache memory leak & thread-local buffers)

      1.1.1: All issues for 1.0.4 plus:
      http://sourceforge.net/apps/trac/bigdata/ticket/433 (Cluster leaks threads under read-only index operations: DGC thread leak)
      http://sourceforge.net/apps/trac/bigdata/ticket/446 (HTTP Repository broken with bigdata 1.1.0)

      In addition, substantial work has been done on [1]. While that issue remains open, these releases should reduce the chances of that rare condition being encountered.

      [1] http://sourceforge.net/apps/trac/bigdata/ticket/440 (BTree can not be cast to Name2Addr when accessing historical state on RWStore)

        Activity

        beebs Brad Bebee created issue -
        bryanthompson bryanthompson added a comment -

        In benchmarking these releases, I noticed that there is a significant performance difference between the AMD Phenom II workstation (CentOS 5.4) and the i7 minis (Ubuntu Natty) on the LUBM U50 benchmark. The workstation completes the U50 queries in ~10s total. The mini is significantly slower, on the order of 12s or more.

        Another interesting observation is that it takes significantly longer for the 1.1.x release to reach its optimal throughput. This can be seen not only in the total time reported by "ant run-query" for LUBM U50, but also in the details reported by the "-XX:+PrintCompilation" JVM flag. Because the JVM optimizes the code incrementally, you need to run the benchmark for several iterations before you reach the throughput that would be observed in a steady-state server. This is an artifact of Java, but one that we must take into account when benchmarking the database.
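
        As a rough illustration of the warm-up effect (a minimal sketch only, not the actual "ant run-query" harness; runQueryMixOnce is a hypothetical stand-in for issuing the U50 query mix against the SPARQL endpoint):

             // Run with -XX:+PrintCompilation to watch the JIT compile methods
             // during the warm-up passes; the measured passes then approximate
             // the steady-state numbers a long-running server would see.
             public class WarmupBenchmark {

                 // Hypothetical stand-in: issue the LUBM U50 query mix once and
                 // return the elapsed time in nanoseconds.
                 static long runQueryMixOnce() {
                     long start = System.nanoTime();
                     // ... issue the queries against the SPARQL endpoint here ...
                     return System.nanoTime() - start;
                 }

                 public static void main(String[] args) {
                     final int warmups = 5;   // discarded: JIT still optimizing
                     final int measured = 10; // reported: approximates steady state
                     for (int i = 0; i < warmups; i++) {
                         runQueryMixOnce();
                     }
                     long totalNanos = 0;
                     for (int i = 0; i < measured; i++) {
                         totalNanos += runQueryMixOnce();
                     }
                     System.out.printf("avg query mix time: %.1f ms%n",
                             totalNanos / (measured * 1000000.0));
                 }
             }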

        The mini seems to find LUBM U50 Q9 significantly more difficult than the AMD workstation does. Q9 is a very interesting query. It is one of those queries that actually performs much better on a cluster due to the sharded bloom filter. There is probably something about the i7 vs. AMD microarchitecture that accounts for this performance difference (the i7 has a 4M shared L3 cache while the AMD has a 6M shared L3 cache, which could also be relevant).
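
        For context on the bloom filter point: each index shard can keep a small bloom filter over its keys, so a point test that would miss is rejected without touching that shard's B+Tree at all. The sketch below only illustrates the idea and is not bigdata's implementation; the names are made up.

             import java.util.BitSet;

             // Illustrative per-shard bloom filter: add() records keys, and a
             // false return from mightContain() means "definitely not in this
             // shard", so the index probe for that shard can be skipped.
             class ShardBloomFilter {

                 private final BitSet bits;
                 private final int numBits;

                 ShardBloomFilter(int numBits) {
                     this.numBits = numBits;
                     this.bits = new BitSet(numBits);
                 }

                 private int hash(byte[] key, int seed) {
                     int h = seed;
                     for (byte b : key) {
                         h = 31 * h + b;
                     }
                     return (h & 0x7fffffff) % numBits;
                 }

                 void add(byte[] key) {
                     bits.set(hash(key, 17));
                     bits.set(hash(key, 131));
                 }

                 boolean mightContain(byte[] key) {
                     return bits.get(hash(key, 17)) && bits.get(hash(key, 131));
                 }
             }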

        Another interesting and significant performance difference was observed on the minis when using JDK 1.6.0_17 (slow) versus JDK 1.6.0_27 (faster).

        The LUBM U50 scores do eventually converge (more or less) for the i7 and AMD nodes. However, while the total times are similar, note that Q9 is MUCH faster on the AMD node (3.5s versus 4.9s). This means that the performance is being made up on the i7 across the balance of the queries.

        i7

             [java] BIGDATA_SPARQL_ENDPOINT     #trials=10      #parallel=1
             [java] query       Time    Result#
             [java] query1      36      4
             [java] query3      25      6
             [java] query4      37      34
             [java] query5      45      719
             [java] query7      26      61
             [java] query8      196     6463
             [java] query10     31      0
             [java] query11     34      0
             [java] query12     31      0
             [java] query13     31      0
             [java] query14     2183    393730
             [java] query6      2462    430114
             [java] query2      754     130
             [java] query9      4911    8627
             [java] Total       10802
        

        AMD

             [java] BIGDATA_SPARQL_ENDPOINT     #trials=10      #parallel=1
             [java] query       Time    Result#
             [java] query1      45      4
             [java] query3      28      6
             [java] query4      51      34
             [java] query5      44      719
             [java] query7      36      61
             [java] query8      162     6463
             [java] query10     31      0
             [java] query11     37      0
             [java] query12     36      0
             [java] query13     35      0
             [java] query14     2725    393730
             [java] query6      2982    430114
             [java] query2      565     130
             [java] query9      3541    8627
             [java] Total       10318
        
        bryanthompson bryanthompson added a comment -

        I have validated that the memory leak via the termsCache is fixed in both branches. I ran the BSBM 100M benchmark for 120 presentations (50 warmups and 500 query mixes per presentation with 8 concurrent clients). The JVM heaps grew to approximately the target heap size (4G). At the end of these runs, I took a heap dump and verified that the set of strongly reachable objects was quite small and that the termsCache was not wiring in large numbers of objects. All looks good.
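
        As a side note on the procedure: a heap dump restricted to live (strongly reachable) objects can be captured programmatically roughly as sketched below, using the standard HotSpot diagnostic MXBean (JDK 6+). This is not bigdata-specific and the file name is arbitrary.

             import java.lang.management.ManagementFactory;

             import com.sun.management.HotSpotDiagnosticMXBean;

             public class HeapDumper {

                 /** Dump only objects reachable from GC roots into the given file. */
                 public static void dumpLiveHeap(String file) throws Exception {
                     HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                             ManagementFactory.getPlatformMBeanServer(),
                             "com.sun.management:type=HotSpotDiagnostic",
                             HotSpotDiagnosticMXBean.class);
                     // live=true restricts the dump to strongly reachable objects,
                     // which is what matters when checking whether the termsCache
                     // is still wiring large numbers of objects.
                     bean.dumpHeap(file, true);
                 }

                 public static void main(String[] args) throws Exception {
                     dumpLiveHeap("after-bsbm-run.hprof");
                 }
             }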

        bryanthompson bryanthompson added a comment -

        Stable throughput for BSBM 100M on the i7 mini is very similar in both the 1.0.x and 1.1.x branches. When the data are plotted, the 1.1.x branch shows very slightly higher throughput, but both are essentially 42,000 QMpH on the reduced query mix with 8 concurrent clients.

        1.0.x           1.1.x
        41454.04        42036.23
        41790.7         41990.92
        42040.26        42038.94
        41968.43        41886.94
        41600.43        42489.64
        38485.49        37945.26
        41406.8         41701.83
        41367.25        41795.27
        41554.62        42240.74
        41597.11        42088.43
        41783.99        42186.76
        
        bryanthompson bryanthompson added a comment -

        1.0.4 has been released. I am still checking a few things on 1.1.1.

        bryanthompson bryanthompson added a comment -

        I am moving the 1.1.1 release to a different ticket and closing this one. We are basically ready to go on 1.1.1, but the higher priority right now is to get out 1.0.5 for various customers.

        beebs Brad Bebee made changes -
        Field Original Value New Value
        Workflow Trac Import v2 [ 11961 ] Trac Import v3 [ 13592 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v3 [ 13592 ] Trac Import v4 [ 14921 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v4 [ 14921 ] Trac Import v5 [ 16309 ]
        beebs Brad Bebee made changes -
        Labels Issue_patch_20150625
        beebs Brad Bebee made changes -
        Status Closed - Won't Fix [ 6 ] Open [ 1 ]
        beebs Brad Bebee made changes -
        Status Open [ 1 ] Accepted [ 10101 ]
        beebs Brad Bebee made changes -
        Status Accepted [ 10101 ] In Progress [ 3 ]
        beebs Brad Bebee made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        beebs Brad Bebee made changes -
        Status Resolved [ 5 ] In Review [ 10100 ]
        beebs Brad Bebee made changes -
        Resolution Fixed [ 1 ] Done [ 10000 ]
        Status In Review [ 10100 ] Done [ 10000 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v5 [ 16309 ] Trac Import v6 [ 17571 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v6 [ 17571 ] Trac Import v7 [ 18966 ]
        beebs Brad Bebee made changes -
        Workflow Trac Import v7 [ 18966 ] Trac Import v8 [ 20585 ]

          People

          • Assignee:
            bryanthompson bryanthompson
            Reporter:
            bryanthompson bryanthompson
          • Votes:
            0
            Watchers:
            0
