  1. Blazegraph (by SYSTAP)
  2. BLZG-697

Manage truth maintenance in SPARQL UPDATE

    Details

    • Type: New Feature
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BIGDATA_RELEASE_1_2_1
    • Fix Version/s: BLAZEGRAPH_2_0_0
    • Component/s: SPARQL UPDATE
    • Labels:
      None

      Description

      Make it easier for people to manage truth maintenance, including disabling it, dropping the computed entailments, recomputing the database-at-once closure efficiently, and re-enabling truth maintenance.

      The following extensions to SPARQL UPDATE are proposed to manage the materialized entailments.

      DROP ENTAILMENTS
      Drop the entailments. This is only required if you have removed some statements from the database.  If you are only adding statements, then just execute "CREATE ENTAILMENTS". 
      
      CREATE ENTAILMENTS
      (Re-)compute the entailments using an efficient "database-at-once" closure operation.  This is much more efficient than incremental truth maintenance if you are loading a large amount of data into the database.  It is not necessary to "DROP ENTAILMENTS" before calling "CREATE ENTAILMENTS" unless you have retracted some assertions.  If you do not "DROP ENTAILMENTS" first, then "CREATE ENTAILMENTS" will have the semantics of "updating" the current entailments.  Entailments which can be re-proven will have no impact and new entailments will be inserted into the KB.  This is significantly more efficient than re-computing the fixed point closure of the entailments from scratch (that is, after a "DROP ENTAILMENTS").
      
      ENABLE ENTAILMENTS
      Enable incremental truth maintenance.  
      
      DISABLE ENTAILMENTS
      Disable incremental truth maintenance.
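
The difference between "updating" the entailments and recomputing them after a retraction can be illustrated with a toy closure. This is only a sketch under stated assumptions: `closure` is a hypothetical helper that applies a single transitivity rule to pairs of names. It is not Blazegraph's rule engine and it says nothing about how the proposed operations would be implemented.

```python
def closure(facts):
    """Fixed point of one transitivity rule over (a, b) pairs (toy model)."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


# Explicit statements and their materialized entailments.
explicit = {("A", "B"), ("B", "C")}
kb = closure(explicit)  # also contains the inferred pair ("A", "C")

# Additions only: closing over the existing KB plus the new statement
# reaches the same fixed point as recomputing from the explicit statements.
added = ("C", "D")
updated = closure(kb | {added})
from_scratch = closure(explicit | {added})

# Retraction: ("A", "C") was inferred from ("B", "C"). If the old entailments
# are not dropped first, the stale inference survives the re-closure.
retracted = ("B", "C")
stale = closure(kb - {retracted})
fresh = closure(explicit - {retracted})
```

In this toy model `updated` and `from_scratch` are equal, while `stale` still holds the pair inferred from the retracted statement and `fresh` does not. That is the reason a "DROP ENTAILMENTS" is only needed when statements have been removed.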
      
      The following pattern illustrates a valid use of this feature when assertions are not retracted. This sequence of operations is ACID against a Journal. Clients will never observe an intermediate state where the full set of entailments is not available.
      
      # mutations before this point are tracked by truth maintenance.
      DISABLE ENTAILMENTS; # disable truth maintenance.
      # mutations do not update entailments.
      LOAD file1;
      LOAD file2;
      INSERT DATA { triples };
      CREATE ENTAILMENTS; # create new entailments using the database-at-once closure.
      ENABLE ENTAILMENTS; # reenable truth maintenance.
      # mutations after this point are tracked by truth maintenance.
      

      The following pattern illustrates a valid use of this feature when some assertions are retracted. This sequence of operations is ACID against a Journal. Clients will never observe an intermediate state where the full set of entailments is not available.

      # mutations before this point are tracked by truth maintenance.
      DISABLE ENTAILMENTS; # disable truth maintenance.
      # mutations do not update entailments.
      DELETE DATA { triples };
      LOAD file1;
      LOAD file2;
      INSERT DATA { triples };
      DROP ENTAILMENTS; # drop existing entailments and proof chains
      CREATE ENTAILMENTS; # create new entailments using the database-at-once closure.
      ENABLE ENTAILMENTS; # reenable truth maintenance.
      # mutations after this point are tracked by truth maintenance.
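
The two patterns above differ only in whether "DROP ENTAILMENTS" is issued before "CREATE ENTAILMENTS". A client could assemble either script with a small helper. The sketch below is an assumption for illustration: `build_bulk_update` is a hypothetical function, and the keywords it emits are the extensions proposed in this issue, not standard SPARQL UPDATE.

```python
def build_bulk_update(mutations, has_retractions):
    """Wrap mutation statements in the proposed truth maintenance pattern.

    mutations: SPARQL UPDATE operations as strings, without the wrapper.
    has_retractions: True if any operation removes statements, in which
    case the old entailments are dropped before the closure is recomputed.
    """
    steps = ["DISABLE ENTAILMENTS"]
    # Normalize each operation so the script has exactly one separator per step.
    steps.extend(m.strip().rstrip(";").rstrip() for m in mutations)
    if has_retractions:
        steps.append("DROP ENTAILMENTS")
    steps.append("CREATE ENTAILMENTS")
    steps.append("ENABLE ENTAILMENTS")
    return ";\n".join(steps) + ";"


# Example input: the file URI and the triple are placeholders.
script = build_bulk_update(
    ["LOAD <file:///tmp/file1.nt>", "INSERT DATA { <urn:s> <urn:p> <urn:o> };"],
    has_retractions=False,
)
```

Sending the whole script as one request keeps the sequence inside a single update, which is what makes the intermediate states invisible to other clients in the description above.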
      

      I am also wondering if there is any reason to "ENABLE ENTAILMENTS" or if that should be automatic when we call CREATE ENTAILMENTS.

      Or ENABLE ENTAILMENTS could do the database-at-once closure and CREATE ENTAILMENTS could specify the set of rules to be maintained.

      CREATE ENTAILMENTS "RDFS Plus"

      CREATE ENTAILMENTS could default to the existing set of rules, but we also have an opportunity to change the rules that are being maintained at this point.

      ----

      Mike and I also discussed some options to support this through the NanoSparqlServer's REST API. The main concept was to add the following URL query parameter to the NanoSparqlServer API for methods that perform mutations.

      ?suppressTruthMaintenance
      

      We would also add a suppressTruthMaintenance method to the RemoteRepository, probably encapsulating it within the AddOp and RemoveOp classes by extracting a common base class. The update() method would be refactored to accept an UpdateOp that extends that common base class and inherits the boolean option. If you are using the REST API and the suppressTruthMaintenance URL query parameter to suppress incremental truth maintenance, then you can issue the "CREATE ENTAILMENTS" UPDATE request afterwards to update the entailments for the KB. If you have also retracted statements, then you would want to issue "DROP ENTAILMENTS; CREATE ENTAILMENTS;" to remove the old entailments before (re-)computing the entailments for the KB.
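
From the client side, the REST variant could look roughly like the sketch below. Everything here is an assumption for illustration: the endpoint URL is a placeholder, the issue only names the `suppressTruthMaintenance` parameter and does not say whether it takes a value (the `true` below is a guess), and both function names are hypothetical. No request is actually sent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

FLAG = "suppressTruthMaintenance"  # parameter name proposed in this issue


def with_suppressed_truth_maintenance(endpoint):
    """Return the endpoint URL with the proposed flag added to its query."""
    parts = urlsplit(endpoint)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key == FLAG for key, _ in query):
        query.append((FLAG, "true"))  # assumed value; the issue shows a bare flag
    return urlunsplit(parts._replace(query=urlencode(query)))


def followup_update(retracted):
    """Pick the update to issue once the suppressed mutations are done."""
    if retracted:
        return "DROP ENTAILMENTS; CREATE ENTAILMENTS;"
    return "CREATE ENTAILMENTS;"


# Placeholder endpoint, not a real NanoSparqlServer address.
url = with_suppressed_truth_maintenance("http://example.org/sparql")
```

The follow-up update is a separate request, so other clients can read the KB between the suppressed mutations and the recomputed closure.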

      I am not yet convinced that it is a good idea to expose this feature through the REST API. Doing so makes it basically certain that the database will be exposed to mutation during a period when truth maintenance is disabled, and that people will be able to read from states of the database that are not coherent in terms of the available entailments. It is much easier to encapsulate a series of changes in a single SPARQL UPDATE script. When that script runs, the entire process will be ACID. So long as the script restores entailments before it finishes, it will be impossible for clients to observe the intermediate database states.

      Note: This issue was forked from BLZG-693 (SPARQL UPDATE "LOAD")

      See also BLZG-918 (Turn on and off incremental truth maintenance and kick off database at once closure)

              People

              Assignee: igorkim
              Reporter: bryanthompson