  1. Blazegraph (by SYSTAP)
  2. BLZG-2025

Look into geospatial performance regressions for 2.2.0 RC

    Details

    • Type: Task
    • Status: Closed - Won't Fix
    • Priority: Medium
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: BLAZEGRAPH_2_2_0
    • Component/s: Geospatial Query
    • Labels:
      None

      Description

      Latest geospatial benchmarks at https://docs.google.com/spreadsheets/d/1ULC9oGZ1npc8md0CdphaPNBRHM6--eAtQrGDyclWeHs/edit#gid=1019906563 show a ~10% performance regression.

      We need to look at the counters to understand what's different. We can't completely exclude that this is a configuration issue for the benchmark (such as storing the journal on a different disk or the like), but it's more likely that something underneath changed. To verify, I'll execute one of the long-running queries showing regressions with a previous 2.1 release and see whether that makes a difference.

        Activity

        michaelschmidt added a comment:

        I've added the performance counters for a selected query at the bottom of https://docs.google.com/spreadsheets/d/1ULC9oGZ1npc8md0CdphaPNBRHM6--eAtQrGDyclWeHs/edit#gid=1019906563. The only notable differences I could spot are slightly increased GC time and disk read time (for which I do not have a good explanation yet).

        I started a new benchmark run with the 2.1.0 RC to see whether I can reproduce the faster processing. Whether or not it is faster will tell us if this is a configuration issue or an actual regression.

        michaelschmidt added a comment:

        Having rerun the GeoSpatial benchmarks using the 2.1.0 snapshot release (the same one that was 10% faster in our previous baseline run), I was not able to reproduce the performance that we had observed back then; see https://docs.google.com/spreadsheets/d/1ULC9oGZ1npc8md0CdphaPNBRHM6--eAtQrGDyclWeHs/edit#gid=1610715386.

        So the most reasonable explanation is that the configuration was different back then. One possibility is that we ran the baseline experiments in analytics mode; unfortunately, I can't verify that based on the output files, and there's no note in the document either. I just restarted the old experiments in analytics mode to see if that makes a difference.

        In any case, there seems to be no regression in the GeoSpatial benchmark; the issue seems to be caused by differences in the baseline setting.

        michaelschmidt added a comment (edited):

        I have now also rerun the benchmark in analytic mode for 2.1. This didn't close the performance gap (on the contrary, analytic mode is significantly slower for all three scenarios, probably due to chunk sizes that are too small for the long-running queries). See https://docs.google.com/spreadsheets/d/1ULC9oGZ1npc8md0CdphaPNBRHM6--eAtQrGDyclWeHs/edit#gid=716538173 for the results.

        I currently have no good idea which configuration change could have caused this behavior. Happy to have a look at the performance counters together.

        bryanthompson added a comment:

        The source of the regression appears to be some undiscoverable change in the baseline and/or h/w. Closing as not relevant to the release.


          People

          • Assignee: michaelschmidt
          • Reporter: michaelschmidt
          • Votes: 0
          • Watchers: 4
