Details

    • Type: Task
    • Status: Done
    • Resolution: Done
    • Affects Version/s: BLAZEGRAPH_RELEASE_1_5_1
    • Fix Version/s: None
    • Component/s: Other

      Description

      We have > 588 test failures per http://ci.bigdata.com/job/GIT_DEVELOPMENT/1290/

      I investigated a few of these test failures. For example:

      com.bigdata.rdf.sail.webapp.HashDistinctNamedGraphUpdateTest.test_t_1
      

      I see what appears to be a SAX-related issue (see the stack trace below). I vaguely recall a past problem tied to which XML parser is being used. These tests are known to work correctly under Eclipse.

      If I run just that one test using ant, it works fine:

      ant -DtestName=com.bigdata.rdf.sail.webapp.HashDistinctNamedGraphUpdateTest junit
      ...
          [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.825 sec
      

      Here is the recurring error that is showing up in CI for the GIT_DEVELOPMENT branch.

      JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
      Stacktrace
      
      org.xml.sax.SAXParseException; systemId: http://www.eclipse.org/jetty/configure.dtd; lineNumber: 1; columnNumber: 1; JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
      	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
      	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
      	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
      	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
      	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:325)
      	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1302)
      	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1257)
      	at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:262)
      	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1162)
      	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1050)
      	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:964)
      	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
      	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
      	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
      	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
      	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
      	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
      	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
      	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
      	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:333)
      	at org.eclipse.jetty.xml.XmlParser.parse(XmlParser.java:210)
      	at org.eclipse.jetty.xml.XmlConfiguration.<init>(XmlConfiguration.java:177)
      	at com.bigdata.rdf.sail.webapp.NanoSparqlServer.newInstance(NanoSparqlServer.java:724)
      	at com.bigdata.rdf.sail.webapp.NanoSparqlServer.newInstance(NanoSparqlServer.java:633)
      	at com.bigdata.rdf.sail.webapp.NanoSparqlServer.newInstance(NanoSparqlServer.java:591)
      	at com.bigdata.rdf.sail.webapp.AbstractTestNanoSparqlClient.newFixture(AbstractTestNanoSparqlClient.java:260)
      	at com.bigdata.rdf.sail.webapp.AbstractTestNanoSparqlClient.setUp(AbstractTestNanoSparqlClient.java:281)
      	at com.bigdata.rdf.sail.webapp.AbstractProtocolTest.setUp(AbstractProtocolTest.java:134)
      

        Activity

        bryanthompson bryanthompson added a comment -

        Looks like maybe we are hitting this issue per [1]. Have we changed the JVM version that is running in CI? Anything else that could be causing this?

        From [1] ==>

        Answering my own question here for the ages.
        There's currently an XML expansion limit processing bug in Oracle and OpenJDK's Java that results in a shared counter hitting the default upper bound when parsing multiple XML documents.
        
        https://blogs.oracle.com/joew/entry/jdk_7u45_aws_issue_123
        https://bugs.openjdk.java.net/browse/JDK-8028111
        https://github.com/aws/aws-sdk-java/issues/123
        Although I thought that our version (6b27-1.12.6-1ubuntu0.12.04.4) wasn't affected, running the sample code given in the OpenJDK bug report did indeed verify that we were susceptible to the bug.
        
        To work around the issue, I needed to pass jdk.xml.entityExpansionLimit=0 to the Storm workers. By adding the following to storm.yaml across my cluster, I was able to mitigate this problem.
        

        [1] http://stackoverflow.com/questions/20482331/whats-causing-these-parseerror-exceptions-when-reading-off-an-aws-sqs-queue-in

        bryanthompson bryanthompson added a comment -

        The problem appears to go back to this CI build (the first one with a large jump in the number of errors):

        http://ci.bigdata.com/view/All/job/GIT_DEVELOPMENT/1281/

        This included a refactor of the REST client API.

        So a possible explanation is a memory leak.

        I downloaded the console output from that CI job (see below). It suggests that we have a memory leak: threads are left active after tearDown().

           [junit] ERROR: 5050534      main com.bigdata.rdf.sail.webapp.ProxyTestCase.tearDown(ProxyTestCase.java:211): Threads left active after task: test=test_POST_INSERT_LOAD_FROM_URIs, delegate=com.bigdata.rdf.sail.webapp.\
        TestNanoSparqlServerWithProxyIndexManager, startupCount=556, teardownCount=558, thisThread=main, threads: [main],[Thread-41],[Thread-40],[Thread-43],[Thread-42],[FileWatchdog],[pool-2030-thread-1],[com.bigdata.ha.pipelin\
        e.HAReceiveService@9376682{addrSelf=0.0.0.0/0.0.0.0:56954}],[pool-2031-thread-1],[pool-2042-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@541888711{addrSelf=0.0.0.0/0.0.0.0:58964}],[pool-2043-thread-1],[pool-2057-t\
        hread-1],[com.bigdata.ha.pipeline.HAReceiveService@1455879797{addrSelf=0.0.0.0/0.0.0.0:56191}],[pool-2058-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@807263681{addrSelf=0.0.0.0/0.0.0.0:36434}],[pool-2059-thread-1\
        ],[pool-2060-thread-1],[pool-2077-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@914029943{addrSelf=0.0.0.0/0.0.0.0:47677}],[pool-2078-thread-1],[pool-2107-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@3230143\
        91{addrSelf=0.0.0.0/0.0.0.0:51680}],[pool-2108-thread-1],[pool-2122-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@814130670{addrSelf=0.0.0.0/0.0.0.0:56163}],[pool-2123-thread-1],[com.bigdata.ha.pipeline.HAReceiveSe\
        rvice@26527824{addrSelf=0.0.0.0/0.0.0.0:47470}],[pool-2124-thread-1],[pool-2125-thread-1],[pool-2148-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@1548666630{addrSelf=0.0.0.0/0.0.0.0:32975}],[pool-2149-thread-1],[p\
        ool-2172-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@2136020482{addrSelf=0.0.0.0/0.0.0.0:57230}],[pool-2173-thread-1],[pool-2184-thread-1],[com.bigdata.ha.pipeline.HAReceiveService@165431830{addrSelf=0.0.0.0/0.0.\
        0.0:54585}],[pool-2185-thread-1],[pool-1-thread-684],[pool-1-thread-685],[pool-1-thread-686],[pool-1-thread-687],[pool-1-thread-688],[pool-1-thread-689],[pool-1-thread-690],[pool-1-thread-691],[pool-1-thread-692],[pool-1\
        -thread-693],[pool-1-thread-694],[pool-1-thread-695],[pool-1-thread-696],[pool-1-thread-697],[pool-1-thread-698],[pool-1-thread-699],[pool-1-thread-700],[pool-1-thread-701],[pool-1-thread-702],[pool-1-thread-703],[com.bi\
        gdata.rwstore.RWStore$11],[com.bigdata.journal.WriteExecutorService$MyLockManager1],[com.bigdata.journal.ConcurrencyManager.sampleService1],[com.bigdata.journal.ConcurrencyManager.writeService1],[com.bigdata.journal.Conc\
        urrencyManager.writeService2],[com.bigdata.journal.ConcurrencyManager.writeService3],[com.bigdata.rwstore.RWStore$11],[com.bigdata.journal.WriteExecutorService$MyLockManager1],[com.bigdata.journal.ConcurrencyManager.samp\
        leService1],[com.bigdata.journal.WriteExecutorService$MyLockManager1],[com.bigdata.journal.ConcurrencyManager.sampleService1],[com.bigdata.journal.WriteExecutorService$MyLockManager1],[com.bigdata.journal.ConcurrencyMana\
        ger.sampleService1],[com.bigdata.journal.WriteExecutorService$MyLockManager1],[com.bigdata.journal.ConcurrencyManager.sampleService1],[com.bigdata.journal.ConcurrencyManager.writeService1],[com.bigdata.journal.Concurrenc\
        yManager.writeService2],[com.bigdata.journal.ConcurrencyManager.writeService3],[com.bigdata.journal.ConcurrencyManager.writeService1],[com.bigdata.journal.ConcurrencyManager.writeService2],[com.bigdata.journal.Concurrenc\
        yManager.writeService3],[com.bigdata.journal.ConcurrencyManager.writeService4],[com.bigdata.journal.ConcurrencyManager.writeService5],[com.bigdata.journal.ConcurrencyManager.writeService6],[com.bigdata.journal.Concurrenc\
        yManager.writeService7],[com.bigdata.journal.ConcurrencyManager.writeService8],[com.bigdata.journal.ConcurrencyManager.writeService9],[com.bigdata.journal.ConcurrencyManager.writeService10],[com.bigdata.journal.Concurren\
        cyManager.writeService1],[com.bigdata.journal.Journal.executorService5],[com.bigdata.journal.ConcurrencyManager.writeService2],[com.bigdata.journal.ConcurrencyManager.writeService3],[com.bigdata.journal.ConcurrencyManage\
        r.writeService4],[com.bigdata.journal.ConcurrencyManager.writeService5],[com.bigdata.journal.ConcurrencyManager.writeService6],[com.bigdata.journal.ConcurrencyManager.writeService7],[com.bigdata.journal.ConcurrencyManage\
        r.writeService8],[com.bigdata.journal.ConcurrencyManager.writeService9],[com.bigdata.journal.ConcurrencyManager.writeService10],[com.bigdata.journal.Journal.executorService45],[com.bigdata.journal.Journal.executorService\
        291],[com.bigdata.journal.Journal.executorService322],[com.bigdata.journal.Journal.executorService323],[com.bigdata.journal.Journal.executorService346],[com.bigdata.journal.Journal.executorService358],[com.bigdata.journa\
        l.Journal.executorService369],[class com.bigdata.bop.engine.QueryEngine.engineService1],[class com.bigdata.bop.engine.QueryEngine.engineService1],[HttpClient@413371815-132303],[HttpClient@413371815-132304-selector-Client\
        SelectorManager@6e6bd74d/1],[HttpClient@413371815-132305],[HttpClient@413371815-132306],[HttpClient@413371815-132307],[HttpClient@413371815-132308],[HttpClient@413371815-132309-selector-ClientSelectorManager@6e6bd74d/0],\
        [HttpClient@413371815-132310],[HttpClient@413371815-scheduler],[com.bigdata.journal.ConcurrencyManager.writeService1],[com.bigdata.journal.Journal.executorService2],[com.bigdata.journal.Journal.executorService3],[com.big\
        data.journal.Journal.executorService4],[com.bigdata.journal.Journal.executorService5],[com.bigdata.journal.Journal.executorService6],[com.bigdata.journal.Journal.executorService7],[com.bigdata.journal.ConcurrencyManager.\
        writeService2],[com.bigdata.journal.ConcurrencyManager.writeService3],[com.bigdata.journal.ConcurrencyManager.writeService4],[com.bigdata.journal.ConcurrencyManager.writeService5],[com.bigdata.journal.ConcurrencyManager.\
        writeService6],[com.bigdata.journal.ConcurrencyManager.writeService7],[com.bigdata.journal.ConcurrencyManager.writeService8],[com.bigdata.journal.ConcurrencyManager.writeService9],[com.bigdata.journal.ConcurrencyManager.\
        writeService10],[com.bigdata.journal.Journal.executorService8],[com.bigdata.journal.Journal.executorService9],[com.bigdata.journal.Journal.executorService10],[com.bigdata.journal.Journal.executorService11],[com.bigdata.j\
        ournal.Journal.executorService12],[com.bigdata.journal.Journal.executorService13],[com.bigdata.journal.Journal.executorService14],[class com.bigdata.bop.engine.QueryEngine.engineService1],[com.bigdata.journal.Journal.exe\
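
The "Threads left active after task" message above is emitted by ProxyTestCase.tearDown(), whose implementation is not shown in this ticket. As a rough sketch of how such a check can work in general, the code below snapshots the live threads before a task and reports any that are still alive afterwards. The class name ThreadLeakCheck, its methods, and the thread name are hypothetical and are not taken from the Blazegraph code base.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

public class ThreadLeakCheck {

    /** Snapshot of all threads that are currently live in this JVM. */
    public static Set<Thread> liveThreads() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    /** Threads that are alive now but were not part of the 'before' snapshot. */
    public static Set<Thread> leakedSince(Set<Thread> before) {
        Set<Thread> after = liveThreads();
        after.removeAll(before);
        after.removeIf(t -> !t.isAlive());
        return after;
    }

    public static void main(String[] args) throws Exception {
        Set<Thread> before = liveThreads();

        // Simulate a test that starts a worker and forgets to stop it.
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hypothetical-leaked-worker");
        worker.setDaemon(true);
        worker.start();

        // Report whatever is still running, as a tearDown() check might.
        for (Thread t : leakedSince(before)) {
            System.out.println("Thread left active: " + t.getName());
        }

        // Clean up so the example itself does not leak.
        release.countDown();
        worker.join();
    }
}
```

A check like this only reports symptoms. A growing list of pool and executor threads across tests, as in the log above, points at services that were never shut down.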
        
        bryanthompson bryanthompson added a comment -

        Pushing commit with a possible fix to RemoteRepositoryManager by adding a finalize() method.

           /**
            * {@inheritDoc}
            * <p>
            * Ensure resource is closed.
            *
            * @see AutoCloseable
            * @see <a href="http://trac.bigdata.com/ticket/1207">Memory leak in CI?</a>
            */
           @Override
           protected void finalize() throws Throwable {

              close();

              super.finalize();

           }
        

        Commit ac9face3ddf78087098cb87e0bcbad348ea00590 to master on github.
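
A finalize() method only runs if and when the garbage collector decides to reclaim the object, so it acts as a safety net rather than a guarantee. Since the Javadoc above refers to AutoCloseable, the following sketch shows the deterministic alternative using try-with-resources. HypotheticalClient is a made-up stand-in and does not reflect the actual RemoteRepositoryManager API.

```java
public class DeterministicClose {

    /** Hypothetical stand-in for a client that owns threads or connections. */
    static final class HypotheticalClient implements AutoCloseable {
        private boolean closed = false;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            // Idempotent: calling close() more than once is harmless.
            closed = true;
        }
    }

    /** Uses a client inside try-with-resources and reports whether it was closed. */
    public static boolean closedAfterTry() {
        HypotheticalClient client = new HypotheticalClient();
        try (HypotheticalClient c = client) {
            // ... issue requests through 'c' here ...
        }
        // close() has already run at this point, with no dependence on GC.
        return client.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after try block: " + closedAfterTry());
    }
}
```

In a JUnit 3 style test the same effect comes from calling close() explicitly in tearDown() and clearing the field afterwards.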

        bryanthompson bryanthompson added a comment -

        In local testing I can see that the RemoteRepositoryManager is still being leaked. This was detected using YourKit and attempting to force GC of RemoteRepositoryManager instances by triggering GC events.

        TestBigdataSailRemoteRepository was leaking references. The one-line fix in tearDown() clears the repo reference:

           public void tearDown() throws Exception {
        
              if (cxn != null) {
        
                 cxn.close();
        
                 cxn = null;
        
              }
        
              // See BLZG-202 (Memory leak in CI).
              repo = null;
              
              super.tearDown();
        
           }
        

        Still looking for other leaks.

        bryanthompson bryanthompson added a comment -

        The memory leaks in the NSS test suite are fixed by the following commits:

        96f94783f43fd5a04c617e2b8d1f7918de0e0d4f (TestBigdataSailRemoteRepository.java)
        ac9face3ddf78087098cb87e0bcbad348ea00590 (RemoteRepositoryManager.java)

        This is in CI now.

        Hopefully this was the root cause for the original ticket (JAXP problem).

        bryanthompson bryanthompson added a comment -

        Unfortunately the code fixes above are NOT linked to the original (JAXP) error.

        - Have we installed updates on the Jenkins server?
        - Has the JVM for Jenkins itself changed?
        - Have we changed to a newer version of Jenkins?
        - Have we changed the Jetty dependency version (the problem occurs when reading the jetty.xml configuration file)?
        - Have we changed the JUnit dependency (some of its output is XML)?

        Note that the error is being thrown out of the following code in NanoSparqlServer. This occurs when it reads the jetty.xml file.

                final Server server;
                {
        
                    // Find the effective jetty.xml URL.
                    final URL jettyXmlURL = getEffectiveJettyXmlURL(classLoader,
                            jettyXml);
        
                    // Build the server configuration from that jetty.xml resource.
                    final XmlConfiguration configuration;
                    {
                        // Open jetty.xml resource.
                        final Resource jettyConfig = Resource.newResource(jettyXmlURL);
                        InputStream is = null;
                        try {
                            is = jettyConfig.getInputStream();
                            // Build configuration.
                            configuration = new XmlConfiguration(is); // <== THIS LINE IS THROWING AN EXCEPTION.
                        } finally {
                            if (is != null) {
                                is.close();
                            }
                        }
                    }
        
                    // Configure/apply jetty.resourceBase overrides.
                    configureEffectiveResourceBase(classLoader);
        
                    // Configure the jetty server.
                    server = (Server) configuration.configure();
        
                }
        
        bryanthompson bryanthompson added a comment -

        I have killed the current CI run of the GIT_DEVELOPMENT job.

        I added -Djdk.xml.entityExpansionLimit=0 to the JVM invocation in the job configuration. We may simply be running "too many tests" and hitting the documented bug in JDK 7u45 per https://blogs.oracle.com/joew/entry/jdk_7u45_aws_issue_123

        CI is restarting on that job. We will know in 2 hours.

        If this is the fix, then we need to get that option into each of the jobs -or- switch to a different JVM release.
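
To check whether a given JVM accumulates entity expansions across documents, a probe along the following lines could be run with and without -Djdk.xml.entityExpansionLimit=0. This is only a sketch. The class EntityLimitProbe and its document are illustrative. The failing parse in CI involved the external Jetty configure.dtd, so this internal-DTD document may not trigger the limit on every affected build.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EntityLimitProbe {

    // Small document with an internal DTD and a single entity reference.
    private static final String XML =
        "<?xml version=\"1.0\"?><!DOCTYPE r [<!ENTITY e \"x\">]><r>&e;</r>";

    /** Parses the document n times with fresh parsers; returns the number of successful parses. */
    public static int parseRepeatedly(int n) {
        int ok = 0;
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            for (int i = 0; i < n; i++) {
                SAXParser parser = factory.newSAXParser();
                parser.parse(new InputSource(new StringReader(XML)), new DefaultHandler());
                ok++;
            }
        } catch (Exception e) {
            // On an affected JVM, a JAXP00010001 SAXParseException would be expected here.
            System.out.println("failed after " + ok + " parses: " + e.getMessage());
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("jdk.xml.entityExpansionLimit="
            + System.getProperty("jdk.xml.entityExpansionLimit"));
        // Hypothetical iteration count; raise it to exceed the limit on an affected JVM.
        System.out.println("successful parses: " + parseRepeatedly(1000));
    }
}
```

On a JVM where the counter is per document, every parse should succeed regardless of the iteration count, because each document expands a single entity.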

        bryanthompson bryanthompson added a comment -

        Fixed. Down to 22 well-known errors.

         com.bigdata.rdf.sail.sparql.TestReificationDoneRightParser.test_triple_ref_pattern_all_vars	14 ms	1
         com.bigdata.rdf.sail.sparql.TestReificationDoneRightParser.test_update_insert_data_RDR	14 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope6.rq"	96 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope7.rq"	13 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope8.rq"	14 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-SELECTscope2"	13 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-25.ru"	15 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-31.ru"	15 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-54.ru"	13 ms	1
         com.bigdata.rdf.sail.sparql.TestReificationDoneRightParser.test_triple_ref_pattern_all_vars	14 ms	1
         com.bigdata.rdf.sail.sparql.TestReificationDoneRightParser.test_update_insert_data_RDR	14 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope6.rq"	96 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope7.rq"	13 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-BINDscope8.rq"	14 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-SELECTscope2"	13 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-25.ru"	15 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-31.ru"	15 ms	1
         com.bigdata.rdf.sail.sparql.Bigdata2ASTSPARQL11SyntaxTest."syntax-update-54.ru"	13 ms	1
         com.bigdata.ha.pipeline.TestHASendAndReceive3Nodes.testPipelineChange_smallMessage	9 ms	2
         com.bigdata.ha.pipeline.TestHASendAndReceive3Nodes.testPipelineChange_smallMessage	9 ms	2
         com.bigdata.rdf.sparql.ast.eval.reif.TestReificationDoneRightEval.test_reificationDoneRight_05a	53 ms	3
         com.bigdata.rdf.sparql.ast.eval.reif.TestReificationDoneRightEval.test_reificationDoneRight_05a
        

          People

          • Assignee:
            beebs Brad Bebee
            Reporter:
            bryanthompson bryanthompson