  1. Blazegraph (by SYSTAP)
  2. BLZG-1341

performance of dumping single graph

    Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: BLAZEGRAPH_RELEASE_1_5_2
    • Component/s: None
    • Labels: None

      Description

      I want to dump the contents of a running Blazegraph instance (in quads mode).

      I initially tried:

      $ curl -X POST http://localhost:2333/bigdata/sparql --data-urlencode 'query=CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }' -H 'Accept:text/plain' > /dev/null
      

      This did not work: after about 2.5 GB of data the server entered GC hell and progress slowed dramatically. This is not surprising, because of the implicit duplicate check.
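
      The cost of a duplicate check can be pictured with a small sketch. This is not Blazegraph's implementation; it is a toy Python illustration with hypothetical names (`dedup_stream`, `seen`), assuming the check has to remember every distinct triple it has already emitted. In quads mode the same triple can be stored in more than one named graph, so a query over all graphs would need some such filter, and its memory grows with the size of the output instead of staying constant.

```python
from typing import Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # (subject, predicate, object, graph)


def dedup_stream(quads: Iterable[Quad], seen: Set[Triple]) -> Iterator[Triple]:
    """Yield each (s, p, o) once, dropping the graph component.

    `seen` holds every distinct triple emitted so far, so it grows
    linearly with the output. That growth is the hypothetical cost
    being modelled here.
    """
    for s, p, o, _graph in quads:
        triple = (s, p, o)
        if triple not in seen:
            seen.add(triple)
            yield triple


# Toy data: the same triple stored in two named graphs.
quads = [
    ("ex:a", "ex:p", "ex:b", "ex:g1"),
    ("ex:a", "ex:p", "ex:b", "ex:g2"),
    ("ex:c", "ex:p", "ex:d", "ex:g1"),
]
seen: Set[Triple] = set()
triples = list(dedup_stream(quads, seen))
print(len(triples), len(seen))
```

      A query restricted to a single graph has no cross-graph duplicates to remove, which is why the report expects it to stream.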

      So I found the name of the single graph with the bulk of the data and tried:

      $ curl -X POST http://localhost:2333/bigdata/sparql --data-urlencode 'query=CONSTRUCT { ?s ?p ?o } WHERE { graph <https://test-similarpatients.syapse.com/graph/diagnostics-inc/abox> { ?s ?p ?o } }' -H 'Accept:text/plain' > /dev/null
      

      This fared better, but still entered GC hell after 4.5 GB of data. This query should simply stream the data, and hence should not be challenging apart from the execution time.
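
      For reference, the client side of such a dump can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the report: the helper names (`construct_query`, `build_request`, `copy_stream`, `dump_graph`) are hypothetical, and the endpoint and `Accept` header are copied from the curl commands above. It only shows that the client can hold a fixed-size buffer while writing the result to disk; it does not change how the server evaluates the query, which is what this issue is about.

```python
import io
from typing import BinaryIO
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://localhost:2333/bigdata/sparql"  # endpoint used in the report


def construct_query(graph_iri: str) -> str:
    """Build the single-graph CONSTRUCT query used in the report."""
    return (
        "CONSTRUCT { ?s ?p ?o } "
        "WHERE { GRAPH <%s> { ?s ?p ?o } }" % graph_iri
    )


def build_request(endpoint: str, query: str) -> Request:
    """POST the query form-encoded, asking for text/plain as in the report."""
    body = urlencode({"query": query}).encode("ascii")
    return Request(endpoint, data=body,
                   headers={"Accept": "text/plain"}, method="POST")


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst in fixed-size chunks; return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        dst.write(chunk)
        total += len(chunk)


def dump_graph(graph_iri: str, out_path: str) -> int:
    """Stream one graph to a local file (requires a running server)."""
    req = build_request(ENDPOINT, construct_query(graph_iri))
    with urlopen(req) as resp, open(out_path, "wb") as out:
        return copy_stream(resp, out)


# Offline check of the copy loop; no server is needed for this part.
sink = io.BytesIO()
copied = copy_stream(io.BytesIO(b"x" * 2500), sink, chunk_size=1024)
```

      Calling `dump_graph(graph_iri, "dump.nt")` with the graph IRI from the report would write the response to a local file instead of /dev/null; the output file name is an arbitrary choice.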

              People

              Assignee: Brad Bebee (beebs)
              Reporter: Jeremy Carroll (jjc)
              Votes: 0
              Watchers: 3

                Dates

                Created:
                Updated:
                Resolved: