Blazegraph (by SYSTAP) / BLZG-301

Failure to bulk load leaves partially created store

    Details

    • Type: Bug
    • Status: Closed - Won't Fix
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Bigdata RDF Database
    • Labels:
      None

      Description

      Failure to provide an ontology file during bulk load (file not found) reasonably generates an error.

      java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not load: /opt/bigdata/nas/fredfed/lubm/config/univ-bench.owl
      

      Doing the same thing again ought to produce the same error, but does not:

      java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Required property not found: namespace=U100, property=com.bigdata.relation.class
      

      The triple store that was created should have been destroyed and removed from the global row store, but was not.

        Activity

        bryanthompson added a comment -

        I agree that you should be seeing the same error reported. Could you attach a copy of the job run, the detail log, and the full stack traces to this issue?

        The bulk loader does not presume that it is creating a new triple store each time, and in scale-out its operations are only shard-wise ACID rather than fully isolated transactions. That makes it somewhat dangerous to simply delete the triple store, and somewhat difficult to simply repeat the job if the store had been partially created. However, it is unclear why you would have observed a "partial create" of the triple store.

        Let me suggest that it may make sense to add a job option which allows you to specify that the triple store should be destroyed first if it exists (there is currently a deleteJob option which destroys the job state in zookeeper, but not an option to destroy the KB before beginning the bulk load).
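
        A rough sketch of how such an option might look in the bulk loader (the destroyExistingKB flag is hypothetical and does not exist in the current code; the locate/createTripleStore/destroy calls are the ones already shown elsewhere in this issue):

            tripleStore = (AbstractTripleStore) fed.getResourceLocator().locate(
                    jobState.namespace, ITx.UNISOLATED);

            // Hypothetical job option: tear down any pre-existing KB before loading.
            if (tripleStore != null && jobState.destroyExistingKB) {
                tripleStore.destroy();
                tripleStore = null;
            }

            if (tripleStore == null) {
                tripleStore = createTripleStore();
            }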

        fkoliver added a comment -

        The IllegalStateException resulted from incorrect configuration (not having icu.jar in the test client classpath), though fixing that does not fix the underlying problem.

        I think the fixes are in MappedRDFDataLoadMaster:

            public AbstractTripleStore openTripleStore() throws ConfigurationException {
             ...
             tripleStore = (AbstractTripleStore) fed.getResourceLocator().locate(
                        jobState.namespace, ITx.UNISOLATED);
        
                if (tripleStore == null) {
                    ...
                    tripleStore = createTripleStore();
                    ...
                    try {
                        loadOntology(tripleStore);
                    } catch (Exception ex) {
        /* ADD */       tripleStore.destroy(); // Don't leave badly configured store.
                        throw new RuntimeException("Could not load: "
                                + jobState.ontology, ex);
                    }
        

        and AbstractTripleStore:

            public void destroy() {
                ...
                try {
                    if (lexicon) {
                        ...
        /* ADD */       // Remove the triple store from the global row store.
        /* ADD */       getIndexManager().getGlobalRowStore().delete(
        /* ADD */          RelationSchema.INSTANCE, getNamespace());
                        lexiconRelation = null;
                        valueFactory = null;
                        axioms = null;
                        vocab = null;
                    }
        

        In this case, the destroy only occurs if the triple store could not be fully created (because the ontology file was missing).
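
        As a sanity check on the patched behavior, something along these lines could be run (a sketch only; loadMaster stands in for however the job obtains its MappedRDFDataLoadMaster, and fed/jobState are the same fields used by openTripleStore()):

            // After a failed ontology load the namespace should no longer resolve.
            try {
                loadMaster.openTripleStore(); // fails: the ontology file is missing
            } catch (Exception expected) {
                // With the patched destroy(), the partially created KB is gone, so a
                // lookup by namespace reports "not found" instead of returning an
                // incompletely configured store.
                final AbstractTripleStore kb = (AbstractTripleStore) fed
                        .getResourceLocator().locate(jobState.namespace, ITx.UNISOLATED);
                assert kb == null : "partially created store was left behind";
            }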

        fkoliver added a comment -

        Also, in DefaultResourceLocator an empty set of properties (rather than null) is returned when the store has been deleted, so the "not found" checks must treat an empty Properties object the same as null:

        @@ -239,7 +239,7 @@
                     final Properties properties = locateResource(namespace, timestamp,
                             foundOn);
                     
        -            if (properties == null) {
        +            if (properties == null || properties.isEmpty()) {
         
                         // Not found by this locator.
                         
        @@ -425,7 +425,8 @@
                 final Properties properties = locateResourceOn(indexManager, namespace,
                         timestamp);
         
        -        if (properties != null) {
        +        // Empty properties may refer to deleted resource.
        +        if (properties != null && !properties.isEmpty()) {
         
                     if (INFO) {
        
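
        For context on why both checks are needed, here is a self-contained illustration (plain java.util.Properties, not Blazegraph code): a deleted row can come back as an empty Properties object, which is non-null and would otherwise be treated as a hit.

            import java.util.Properties;

            public class EmptyPropertiesDemo {
                public static void main(String[] args) {
                    // Simulates what locateResource() can hand back after the row for a
                    // namespace has been deleted from the global row store.
                    final Properties afterDelete = new Properties();

                    // Old check: only null meant "not found", so an empty result was
                    // wrongly treated as an existing resource.
                    System.out.println("old check says found: " + (afterDelete != null));

                    // Patched check: null OR empty both mean "not found by this locator".
                    final boolean found = afterDelete != null && !afterDelete.isEmpty();
                    System.out.println("patched check says found: " + found);
                }
            }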
        bryanthompson added a comment -

        Closed. Not relevant to the new architecture.


          People

          • Assignee:
            fkoliver
          • Reporter:
            fkoliver
          • Votes:
            0
          • Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: