Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BIGDATA_RELEASE_1_5_0
    • Fix Version/s: None
    • Component/s: Other

      Description

      See BLZG-188 (BlazeGraph release 1.5.0)

      Change log:
      - BLZG-670 Concurrent unisolated operations against multiple KBs on the same Journal
      - BLZG-882 Adding Optional removes solutions
      - BLZG-913 Query solutions are duplicated and increase by adding graph patterns
      - BLZG-1061 Property path operator should output solutions incrementally
      - BLZG-1065 Using a bound variable to refer to a graph
      - BLZG-1085 NPE if remote http server fails to provide a Content-Type header
      - BLZG-1119 problems with UNIONs + complex OPTIONAL groups
      - BLZG-1149 Executable Jar should bundle the BuildInfo class
      - BLZG-1151 SPARQL UPDATE should have nice error messages when namespace does not support named graphs
      - BLZG-1154 NSS startup error: java.lang.IllegalArgumentException: URI is not hierarchical
      - BLZG-1156 Data race in BackgroundGraphResult.run()/close()
      - BLZG-189 GPLv2 license header update with new contact information
      - BLZG-1158 Add hook to override the DefaultOptimizerList
      - BLZG-1159 startHAServices no longer respects environment variables
      - BLZG-1160 Build version in SF GIT master is wrong
      - BLZG-1161 README.md needs updating for Blazegraph transition
      - BLZG-1163 Optimized variable projection into subqueries/subgroups
      - BLZG-1168 OSX vm_stat output has changed
      - BLZG-1171 Concurrent modification problem with group commit
      - BLZG-1172 ClocksNotSynchronizedException (HA, GROUP_COMMIT)
      - BLZG-1173 DELETE-WITH-QUERY and UPDATE-WITH-QUERY (GROUP COMMIT)
      - BLZG-1174 GlobalRowStoreHelper can hold hard reference to GSR index (GROUP COMMIT)
      - BLZG-193 Code review on "instanceof Journal"
      - BLZG-1180 BigdataSailFactory.connect()
      - BLZG-1182 Isolation broken in NSS when groupCommit disabled
      - BLZG-1183 GROUP_COMMIT environment variable
      - BLZG-1185 SPARQL Federated Query uses too many HttpClient objects
      - BLZG-1186 DELETE DATA must not allow blank nodes
      - BLZG-1191 BigdataSailFactory must be moved to the client package
      - BLZG-1197 BuildInfo not in the WAR File

      Pushed to 1.5.2:
      - BLZG-1178 Bad Address: length requested greater than allocated slot (RWStore, GROUP COMMIT, HA-only)
      - BLZG-1193 Integrate filters into the ALP service

      See BLZG-195 BlazeGraph 1.5.2 release.

          Activity

          bryanthompson bryanthompson added a comment -

          Status update for the 1.5.1 release.

          We have an open ticket (BLZG-1178) against the group commit feature for HA. We will try to close this by COB Monday to meet the release schedule. This feature appears to work correctly in the standalone NSS mode and Martyn has a fix in the branch for BLZG-1178 that works for HA1. I have written a number of stress tests at the HA CI layer for this feature and (in the BLZG-1178 branch) those tests work for HA1. Since I needed to capture a refactoring in the HA test suites, I went ahead and brought everything except the edits by Martyn to RWStore.java, FixedAllocator.java, and TestRWStrategy.java back to master.

          Michael has brought in the change set for property path optimizations. This is the last major feature for the 1.5.1 release.

          Brad has a minor ticket open against the BigdataSailFactory.connect() method and will close that ASAP.

          At this point, I expect CI to settle down and we should be making the final updates leading into the QA for the 1.5.1 release.

          bryanthompson bryanthompson added a comment -

          I have written a number of stress tests at the HA CI layer for this feature and (in the BLZG-1178 branch) those tests work for HA1. Since I needed to capture a refactoring in the HA test suites, I went ahead and brought everything except the edits by Martyn to RWStore.java, FixedAllocator.java, and TestRWStrategy.java back to master.

          bradbebee bradbebee added a comment -

          Code is frozen for release.

          bryanthompson bryanthompson added a comment -

          - Pushed fix for BLZG-1186 to private master to be staged.
          - Added release notes to private master to be staged.

          bryanthompson bryanthompson added a comment -

          Release QA plans:


          - @bryanthompson: Long running NSS with BSBM 100M EXPLORE + UPDATE
          - @bryanthompson: Long running HA3 cluster with BSBM 100M EXPLORE + UPDATE
          - @bryanthompson: Load and performance tuning for wikidata RDF dump.
          - @michaelschmidt: Standard benchmarks (govtrack, BSBM 100M, LUBM)

          bradbebee bradbebee added a comment -

          @bryanthompson, @michaelschmidt: BLAZEGRAPH_RELEASE_1_5_1_RC_2 is the current branch for QA. It is set up in CI.

          michaelschmidt michaelschmidt added a comment -

          Summary of Benchmark Results

          Executed baseline test for BSBM, govtrack, LUBM and SP2Bench, comparing the results of BLAZEGRAPH_RELEASE_1_5_1_RC_2 with the 1.4.0 and 1.5.0 releases.

          Overall, the benchmarks suggest that there have been slight improvements, in particular for queries with complex join subgroups (and, going beyond the "official" benchmarks, also for queries with arbitrary length property paths, as confirmed by local tests and users). On the other hand, no severe regressions could be identified.

          Major findings are as follows:

          BSBM

          EXPLORE Single Threaded

          Queries are stable, except for Q7, which experiences ~100% speedup (arguably due to projection pushing and changed evaluation strategy w.r.t. complex OPTIONAL subgroups).

          EXPLORE MT16/MT32

          Slight overall performance gains (~1-2% compared to 1.4.0); speedup for Q7, offset by some regression for selected queries. The regression could not be reproduced when running these queries standalone, so it is probably due to other global effects (e.g. GC running while these queries were being evaluated).

          EXPLORE+UPDATE MT16/MT32

          Results comparable to the EXPLORE MT16/MT32 results; stable overall.

          LUBM

          Quite stable overall across the three releases; slight shift in performance in 1.5.1RC2 for single queries (0-5%); Q10 considerably improved (~10%) due to the OPTIONAL group optimization.

          GOVTRACK

          Stable overall (~0.3% performance gain since 1.4.0); slight shift in performance in 1.5.0 and 1.5.1RC2.

          SP2B

          Executed over 200k triples only (larger datasets to come). Observed significant gains of ~3.5% in 1.5.0 and ~11% again in 1.5.1RC2. Gains in 1.5.0 caused by queries involving complex OPTIONAL groups, so again this can probably be attributed to the projection pushing into complex join subgroups.
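
If the two SP2B gains are read as successive improvements (an assumption: ~3.5% from 1.4.0 to 1.5.0, then a further ~11% from 1.5.0 to 1.5.1RC2), the cumulative gain over 1.4.0 can be estimated by compounding them. The helper name below is hypothetical and the sketch only illustrates the arithmetic; it is not derived from the raw benchmark data.

```python
def compound_gain_pct(*gains_pct):
    """Compound successive percentage gains into one cumulative percentage."""
    factor = 1.0
    for gain in gains_pct:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0


# Assumed reading: 3.5% (1.4.0 -> 1.5.0), then 11% (1.5.0 -> 1.5.1RC2).
cumulative = compound_gain_pct(3.5, 11.0)
print(f"{cumulative:.1f}%")  # about 14.9%
```

Under that reading, 1.5.1RC2 would be roughly 15% ahead of 1.4.0 on the 200k triples dataset.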

          In addition, the benchmark results suggest different performance optimization opportunities, for instance

          • Q2 -> probably suboptimal join order
          • Q5a -> filter pushing strategy definitely suboptimal
          • Q6 -> probably same here, the query is still very slow (although it considerably improved in 1.5.1)

          More optimization opportunities might pop up here once we have the 1M triples result for SP2Bench.

          bryanthompson bryanthompson added a comment -

          We have run the BSBM 100M UPDATE workload out over 1M commit points on an HA3 cluster. Update throughput (w/o group commit) is approximately 2600 QMpH for this workload.

          QMpH:                   2612.88 query mixes per hour (1080068 commits, 122746 HALog files, 1 snapshot @ commitCounter=957329)
          

          We are separately running out the HA3 cluster with group commit per BLZG-1178 against the GROUP_COMMIT_1136 branch. Several issues were fixed, so group commit (for HA) is not in the RC2 branch but will be in a subsequent release (1.5.2) and will be available in the SF master at the time of the 1.5.1 release (just not part of the tagged release). HA with group commit performance is documented at BLZG-1178. It is currently at:

          QMpH:                   2439.84 query mixes per hour 301,483 commits
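
For a rough sense of scale, the relative difference between the two HA3 figures quoted above can be computed directly. This is only a sketch: the function name is made up for illustration, the two runs cover different numbers of commits, and the group commit run was still in progress, so the result is not a controlled comparison.

```python
def relative_change_pct(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


qmph_without_group_commit = 2612.88  # HA3 run without group commit
qmph_with_group_commit = 2439.84     # HA3 run on the GROUP_COMMIT_1136 branch

change = relative_change_pct(qmph_without_group_commit, qmph_with_group_commit)
print(f"{change:.1f}%")  # roughly -6.6%
```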
          

          Note that BSBM UPDATE does not present any opportunities for group commit since there is only a single client thread performing updates. Further, it only addresses a single blazegraph namespace. As a follow-up we can try running concurrent UPDATE clients (providing a potential opportunity for updates to be melded into fewer than one commit point per update). We can also run concurrent UPDATE clients against distinct BSBM 100M namespaces. Both situations should provide a significant increase in the update throughput.
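
The point about a single client thread can be illustrated with a toy model. The sketch below is an idealized assumption, not the Blazegraph group commit implementation: each client has at most one update outstanding and submits its next update only after the previous one is committed, and every commit round melds all updates pending at that moment. The function name is hypothetical.

```python
def count_commit_points(num_clients, updates_per_client):
    """Toy closed-loop model: one commit point per round, melding all pending updates."""
    remaining = [updates_per_client] * num_clients
    commits = 0
    while any(remaining):
        # Every client that still has work contributes one update to this commit.
        remaining = [r - 1 if r > 0 else 0 for r in remaining]
        commits += 1
    return commits


# Single client: one commit point per update, nothing to meld.
print(count_commit_points(1, 1000))  # 1000 commits for 1000 updates
# Four concurrent clients: 1000 updates in total, melded four at a time.
print(count_commit_points(4, 250))   # 250 commits for 1000 updates
```

In this model a single client always yields exactly one commit point per update, which matches the observation that the BSBM UPDATE workload cannot benefit from group commit; real gains with concurrent clients would depend on timing and will be smaller than this ideal.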

          Note: Group commit appears to be fine for non-HA modes.


            People

            • Assignee:
              beebs Brad Bebee
              Reporter:
              bryanthompson bryanthompson
            • Votes:
              0
              Watchers:
              2