Details

    • Type: Task
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: BLAZEGRAPH_RELEASE_1_5_3
    • Component/s: None
    • Labels:
      None

      Description

      Assess correctness and performance of 1.5.3 release via baseline benchmark.

        Activity

        michaelschmidt michaelschmidt added a comment -

        Summary of results from Sep 19-20 benchmark run:

        • SP2Bench: about 2% slower
          https://docs.google.com/spreadsheets/d/16XjMdI5PRjK7ODOWM_AS4u9Ma5tQBkWXOgZl8diaVYI/edit#gid=218566312
        • Govtrack: about 1% slower
          https://docs.google.com/spreadsheets/d/192kwVDT6p9fIdZQQP3TvNJDsyfoqTLRp2Bbg3e32l0M/edit#gid=553331241
        • LUBM: about 1% faster
          https://docs.google.com/spreadsheets/d/12rbe77GOqnRmi4yFjWE1D1hXnd2jB-WPUV2uVJZEmwE/edit#gid=258811125

        I’d say it’s all more or less statistical variance. For SP2Bench, the slowest query was slightly slower than usual in one of the runs, which might be due to background processes (in the worst case, GC). So for these three benchmarks, there is nothing to worry about.

        Regarding BSBM, the results don’t look bad in general either:
        https://docs.google.com/spreadsheets/d/1bkBtgSuR1BpIo4jEtvbdpzOw5rHXVl4DoUOkaxgisKg/edit#gid=975570205

        • No issues for BSBM EXPLORE: all results are within +/- 0.5% of the previous release
        • BSBM EXPLORE+UPDATE is a bit different:
          a. Results for the different #threads vary a bit more (<=~1% though), which is nothing I would be concerned about
          b. We observed one (and only one) GCOverheadExceeded exception, in the MT16 run (in the previous run with 1.5.3, I also observed one such exception, for MT64)
          c. For the MT64 version (~1% slower), we observe a significant increase in the cumulative collection time (summing up the collection times: from 513240 in 1.5.2 to 929706 in 1.5.3); not sure if this is critical

        I consider b. the only critical blocker for the release.

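
        To put item c. in perspective, the relative change in cumulative collection time can be computed directly from the two totals quoted there. The following is only a minimal sketch: the function and constant names are illustrative, and since the comment does not state the unit of the totals, they are treated as plain numbers.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Return (candidate - baseline) / baseline; positive means an increase."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline


# Cumulative collection times quoted in item c. (unit not stated in the comment).
GC_TOTAL_1_5_2 = 513240
GC_TOTAL_1_5_3 = 929706

change = relative_change(GC_TOTAL_1_5_2, GC_TOTAL_1_5_3)
print(f"Cumulative collection time changed by {change:+.1%}")
```

        With these two totals the increase comes out at roughly 81%, which is far larger than the ~1% throughput difference reported for the same MT64 run.
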
        michaelschmidt michaelschmidt added a comment -

        Reran BSBM, limiting the memory consumption of the Ant processes. None of the problems (a-c) could be reproduced: https://docs.google.com/spreadsheets/d/1bkBtgSuR1BpIo4jEtvbdpzOw5rHXVl4DoUOkaxgisKg/edit#gid=1831606474

        So from this perspective, we're ready to go for a release. Closing this ticket.

        michaelschmidt michaelschmidt added a comment - edited

        Reran the benchmarks with the latest functional changes (logger improvements etc.). Here is a summary of the results (commit point b80a74bc452868b157180cfcd90a70f90fe84654 in master, from Dec. 20):

        • LUBM: https://docs.google.com/spreadsheets/d/12rbe77GOqnRmi4yFjWE1D1hXnd2jB-WPUV2uVJZEmwE/edit#gid=1776796231
          About 3.5% faster now.
        • SP2B: https://docs.google.com/spreadsheets/d/16XjMdI5PRjK7ODOWM_AS4u9Ma5tQBkWXOgZl8diaVYI/edit#gid=1225597339
          There is now a regression of 3.5%, which can be attributed to a single query, though: Q5a, which is a BGP with a FILTER over literals. Not really critical, I would say.
        • Govtrack: https://docs.google.com/spreadsheets/d/192kwVDT6p9fIdZQQP3TvNJDsyfoqTLRp2Bbg3e32l0M/edit#gid=57515436
          We still have an 11% regression here, which can be attributed to two long-running queries (query0021c and query0021d), and possibly an observed overhead in GC. However, it could not be verified that any of the suspects (changes in IV resolution, analytic pipelined hash join, etc.) are responsible for this slow-down. I will try to track this down over the coming days, but again I would not consider these two very special queries a blocker (what makes them unique is that both of them use a manually enabled hash join, which should not occur that often in practice).
        • BSBM: https://docs.google.com/spreadsheets/d/1i-JnEy_W5Pt4AWg87oxg564GYkz3zaxxmIS9H4OKssE/edit#gid=82882020
          Results look OK now. For the ST scenario and EXPLORE MT we are stable (+/- 0.5% variance at most). The same holds for the EXPLORE+UPDATE MT scenario with <64 threads. The only gap we can observe is EXPLORE+UPDATE MT 64, where we are now down to only 2% performance loss (we observed ~4% regression there in previous runs).

          People

          • Assignee:
            michaelschmidt michaelschmidt
            Reporter:
            michaelschmidt michaelschmidt
          • Votes:
            0
            Watchers:
            2
