Details

      Description

      Dear Bigdata community,

      I have loaded DBpedia 2014 (English) several times without closure, and now I would like bigdata to compute the closure on a journal containing that data. I started by loading every file in a directory in one pass with the bigdata loader and computing the closure afterwards, but this turned out to be a bad idea: the materialization inflated the temporary journal until "no space left on device" stopped the computation, forcing me to discard the journal and restart from scratch. I then switched to loading one file at a time, computing the closure after each file; this kept the temporary journal much smaller but took longer. After a number of files had been loaded and materialized successfully, the computation broke with the exception below. What did I do wrong? How can this problem be solved? In what state is the journal now? Is it corrupt (three days of loading...), or can one still work with it? Thank you in advance for any hint.
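
      For reference, here is roughly how I drive the loader. This is a minimal sketch assuming the stock DataLoader command line and option names; the journal file, jar name, and data directory are placeholders:

        # journal.properties (excerpt; values illustrative)
        com.bigdata.journal.AbstractJournal.file=bigdata.jnl
        com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
        # compute the closure incrementally as each file is loaded
        com.bigdata.rdf.store.DataLoader.closure=Incremental

        java -cp bigdata.jar com.bigdata.rdf.store.DataLoader \
            journal.properties /path/to/DBPEDIA_EN_TTL_LOADING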

      Regards

      PS: An excerpt of bigdata's log while loading the files with closure computation enabled:
      INFO : 3681994 main com.bigdata.rdf.store.DataLoader.loadData3(DataLoader.java:1071): Computing closure.
      loading: 32721296 stmts added in 3680.409 secs, rate= 3341, commitLatency=0ms
      ClosureStats{mutationCount=3155904, elapsed=6345943ms, rate=497}
      INFO : 9793056 main com.bigdata.rdf.store.DataLoader.loadData3(DataLoader.java:1103): file:: 4636225 stmts added in 1560.926 secs, rate= 604, commitLatency=0ms
      ClosureStats{mutationCount=3155904, elapsed=6345943ms, rate=497}; totals:: 32721296 stmts added in 3680.409 secs, rate= 3341, commitLatency=0ms
      ClosureStats{mutationCount=3155904, elapsed=6345943ms, rate=497}; baseURL=file:/Applications/XAMPP/xamppfiles/htdocs/DBPutilities/app/load/../../rep/DBPEDIA_TTL/DBPEDIA_EN_TTL_LOADING/long_abstracts_en.ttl
      Load: 32721296 stmts added in 3680.409 secs, rate= 3341, commitLatency=0ms
      ClosureStats{mutationCount=3155904, elapsed=6345943ms, rate=497}
      ERROR: 9793532 main com.bigdata.Banner$1.uncaughtException(Banner.java:109): Uncaught exception in thread
      java.lang.RuntimeException: Problem with entry at -395923687506706010: lastRootBlock=rootBlock{ rootBlock=0, challisField=6, version=3, nextOffset=2280228284864020, localTime=1421088250219 [January 12, 2015 7:44:10 PM CET|Monday,], firstCommitTime=1420801833442 [January 9, 2015 12:10:33 PM CET|Friday,], lastCommitTime=1421088245783 [January 12, 2015 7:44:05 PM CET|Monday,], commitCounter=6, commitRecordAddr={off=NATIVE:-116778246,len=422}, commitRecordIndexAddr={off=NATIVE:-127538068,len=220}, blockSequence=43424, quorumToken=-1, metaBitsAddr=1297354128884389, metaStartAddr=538192, storeType=RW, uuid=6befce90-b025-4ea9-ae3b-6401710f9aa3, offsetBits=42, checksum=209201624, createTime=1420801832807 [January 9, 2015 12:10:32 PM CET|Friday,], closeTime=0}
      at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3084)
      at com.bigdata.rdf.store.DataLoader.main(DataLoader.java:1521)
      Caused by: java.lang.RuntimeException: Problem with entry at -395923687506706010
      at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:4961)
      at com.bigdata.rwstore.RWStore.checkDeferredFrees(RWStore.java:3533)
      at com.bigdata.journal.RWStrategy.checkDeferredFrees(RWStrategy.java:775)
      at com.bigdata.journal.AbstractJournal$CommitState.writeCommitRecord(AbstractJournal.java:3448)
      at com.bigdata.journal.AbstractJournal$CommitState.access$2(AbstractJournal.java:3431)
      at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4055)
      at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3082)
      ... 1 more
      Caused by: java.lang.RuntimeException: addr=-88929880 : cause=java.lang.IllegalStateException: Bad Address: length requested greater than allocated slot
      at com.bigdata.rwstore.RWStore.getData(RWStore.java:2190)
      at com.bigdata.rwstore.RWStore.getData(RWStore.java:1989)
      at com.bigdata.rwstore.RWStore.getData(RWStore.java:2033)
      at com.bigdata.rwstore.RWStore.getData(RWStore.java:1989)
      at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:4851)
      at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:4948)
      ... 7 more
      Caused by: java.lang.IllegalStateException: Bad Address: length requested greater than allocated slot
      at com.bigdata.rwstore.RWStore.getData(RWStore.java:2082)
      ... 12 more

        Activity

        bryanthompson added a comment -

        What version of bigdata are you using? Please provide the command line used to execute the software and attach the log file.

        Also, please provide a summary of the operations that have been run against this database instance and the software upgrades that have been applied. There were some 1.3.x releases in which isolation for the dictionary indices could be broken, leading to an error reported by the RWStore.

        bryanthompson added a comment -

        1. I am pretty sure that the allocation error issue is resolved.
        2. The correct way to do this is to use database-at-once closure rather than incremental truth maintenance. See http://wiki.blazegraph.com/wiki/index.php/InferenceAndTruthMaintenance for information about the truth maintenance system. The DataLoader utility class is the best way to do database-at-once inference since it lets you manage this explicitly (see the configuration sketch below). While we have looked at a SPARQL UPDATE extension that would support this, one has not yet been implemented. With large data sets, database-at-once inference will be MUCH faster than performing truth maintenance for each data set as it is loaded (one by one). Note that you can use the DataLoader to load into a Journal that is also used by the NSS; however, the Journal can be used by only one process at a time. Therefore, load into the Journal with the DataLoader, and start the NSS only once the load is done and the closure has been computed.
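
        A minimal sketch of that workflow, assuming the stock DataLoader command line and the documented option names; paths, the jar name, and the data directory are placeholders:

          # journal.properties (excerpt; values illustrative)
          # disable incremental truth maintenance in the SAIL
          com.bigdata.rdf.sail.truthMaintenance=false
          # do not materialize inferences while loading; defer closure
          com.bigdata.rdf.store.DataLoader.closure=None

          # load all files first, then compute the closure once over the
          # whole database (-closure); start the NSS only afterwards
          java -cp bigdata.jar com.bigdata.rdf.store.DataLoader -closure \
              journal.properties /path/to/DBPEDIA_EN_TTL_LOADING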


          People

          • Assignee: martyncutcher
          • Reporter: fabioricci
          • Votes: 0
          • Watchers: 2
