  1. Blazegraph (by SYSTAP)
  2. BLZG-1030

Replace Apache Http Components with jetty http client (was Roll forward the apache http components dependency to 4.3.3)

    Details

      Description

      A customer has observed errors when using the currently bundled version of the Apache http components. At least some of these exceptions appear to be related to issues that were fixed between the currently bundled version and 4.3.3.

      Failed to get namespace list
      java.lang.IndexOutOfBoundsException: endIndex: 1 > length: 0
      	at org.apache.http.util.CharArrayBuffer.substringTrimmed(CharArrayBuffer.java:445) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:249) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:206) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:169) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:198) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:287) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.execchain.ResponseEntityWrapper.streamClosed(ResponseEntityWrapper.java:120) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:227) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174) ~[httpclient-4.3.3.jar:4.3.3]
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:87) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$2.close(RemoteRepository.java:2014) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.inforbix.backend.storage.rdf.bigdata.benchmark.LoadTest$3.run(LoadTest.java:120) ~[benchmark.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_55]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
      	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
      
      Failed to get namespace list
      org.openrdf.query.QueryEvaluationException: java.net.SocketException: Socket closed
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:89) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$2.close(RemoteRepository.java:2014) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.inforbix.backend.storage.rdf.bigdata.benchmark.LoadTest$3.run(LoadTest.java:120) ~[benchmark.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_55]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
      	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
      Caused by: java.net.SocketException: Socket closed
      	at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_55]
      	at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_55]
      	at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_55]
      	at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:136) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:152) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:270) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.AbstractMessageParser.parseHeaders(AbstractMessageParser.java:192) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.AbstractMessageParser.parseHeaders(AbstractMessageParser.java:145) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.parseTrailerHeaders(ChunkedInputStream.java:264) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:214) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:169) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:198) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:287) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.execchain.ResponseEntityWrapper.streamClosed(ResponseEntityWrapper.java:120) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:227) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174) ~[httpclient-4.3.3.jar:4.3.3]
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:87) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	... 9 common frames omitted
      
      Failed to get namespace list
      org.openrdf.query.QueryEvaluationException: org.apache.http.MalformedChunkCodingException: Unexpected content at the end of chunk
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:89) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$2.close(RemoteRepository.java:2014) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.inforbix.backend.storage.rdf.bigdata.benchmark.LoadTest$3.run(LoadTest.java:120) ~[benchmark.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_55]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
      	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
      Caused by: org.apache.http.MalformedChunkCodingException: Unexpected content at the end of chunk
      	at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:233) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:206) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:169) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:198) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:287) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.execchain.ResponseEntityWrapper.streamClosed(ResponseEntityWrapper.java:120) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:227) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174) ~[httpclient-4.3.3.jar:4.3.3]
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:87) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	... 9 common frames omitted
      
      Failed to get namespace list
      org.openrdf.query.QueryEvaluationException: org.apache.http.TruncatedChunkException: Truncated chunk ( expected size: 0; actual size: 0)
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:89) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.bigdata.rdf.sail.webapp.client.RemoteRepository$2.close(RemoteRepository.java:2014) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	at com.inforbix.backend.storage.rdf.bigdata.benchmark.LoadTest$3.run(LoadTest.java:120) ~[benchmark.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_55]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_55]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
      	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
      Caused by: org.apache.http.TruncatedChunkException: Truncated chunk ( expected size: 0; actual size: 0)
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:183) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:198) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:287) ~[httpcore-4.3.2.jar:4.3.2]
      	at org.apache.http.impl.execchain.ResponseEntityWrapper.streamClosed(ResponseEntityWrapper.java:120) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:227) ~[httpclient-4.3.3.jar:4.3.3]
      	at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174) ~[httpclient-4.3.3.jar:4.3.3]
      	at com.bigdata.rdf.sail.webapp.client.BackgroundGraphResult.close(BackgroundGraphResult.java:87) ~[bigdata-1.3.0.8345.20140516.jar:na]
      	... 9 common frames omitted
      

      See http://www.apache.org/dist/httpcomponents/httpclient/RELEASE_NOTES-4.3.x.txt

      See BLZG-1007 (HA LBS Gateway errors under heavy load) which was closed as a duplicate of this ticket.

      Notes for merge of branch:
      - There are new and distinct test classes for jetty. We need to make sure that any new tests written to the pre-existing http layer test suites have been captured in the jetty variations, or they will be lost when the non-jetty variations are removed. This will require diffing the apache test classes against the jetty test classes to find the differences and making sure that they are captured.
      - See http://wiki.bigdata.com/wiki/index.php/JettyHttpClient for notes on the jetty client integration.
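The diffing step in the merge notes above could be partly automated. The following is a minimal sketch, assuming the test methods follow the `test*` naming convention and that a ported test keeps the same method name; `ApacheSuiteExample` and `JettySuiteExample` are hypothetical stand-ins, not classes from the code base.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

/** Lists test methods present in one test class but missing from another. */
final class TestMethodDiff {

    /** Collects the names of public methods that start with "test". */
    static Set<String> testMethodNames(Class<?> testClass) {
        Set<String> names = new TreeSet<>();
        for (Method m : testClass.getMethods()) {
            if (m.getName().startsWith("test")) {
                names.add(m.getName());
            }
        }
        return names;
    }

    /** Returns test names found in original that have no same-named method in port. */
    static Set<String> missingFrom(Class<?> original, Class<?> port) {
        Set<String> missing = testMethodNames(original);
        missing.removeAll(testMethodNames(port));
        return missing;
    }
}

// Hypothetical stand-in for a pre-existing (apache-based) test class.
class ApacheSuiteExample {
    public void testQuery() {}
    public void testUpdate() {}
}

// Hypothetical stand-in for the jetty variation of the same test class.
class JettySuiteExample {
    public void testQuery() {}
}
```

Calling `TestMethodDiff.missingFrom(ApacheSuiteExample.class, JettySuiteExample.class)` reports `testUpdate` as not yet ported. This only compares method names, so a test whose body changed in one variation would still need a textual diff.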

          Activity

          bryanthompson added a comment -

          Commit in sync to martyn as part of code review.

          1. Create a system property to override the IHttpClientFactory.
          2. Define AutoCloseHttpClient (rename) and simplify code. Update JettyRemoteRepository to be aware of this.
          3. Clean up code references that point directly to the JettyHttpClient, especially those that set autoClose:=true (there are two such call paths). In general, I think that the tests should be using a default factory pattern rather than explicitly creating a JettyHttpClient.
          4. We need to look at whether the webapp.client package is still self-contained: it used to be that we could build a jar from it. Not a 100% concern, but still.
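Item 1 above could look roughly like the following. This is a sketch under stated assumptions, not the project's implementation: `HttpClientFactoryExample` stands in for IHttpClientFactory, whose actual methods are not shown in this ticket, and the property name `example.httpClientFactory` is invented for illustration.

```java
/** Hypothetical stand-in for the factory interface named in the comment. */
interface HttpClientFactoryExample {
    AutoCloseable newInstance();
}

/** Default factory used when no override is configured. */
final class DefaultFactoryExample implements HttpClientFactoryExample {
    @Override
    public AutoCloseable newInstance() {
        return () -> { /* nothing to release in this sketch */ };
    }
}

final class HttpClientFactoryLocator {

    /** Hypothetical property name; the real option name is not given in this ticket. */
    static final String FACTORY_PROPERTY = "example.httpClientFactory";

    /** Returns the factory named by the system property, or the default if it is unset. */
    static HttpClientFactoryExample resolve() {
        String className = System.getProperty(FACTORY_PROPERTY);
        if (className == null || className.trim().isEmpty()) {
            return new DefaultFactoryExample();
        }
        try {
            // Load the named class and require that it implements the factory interface.
            return Class.forName(className.trim())
                    .asSubclass(HttpClientFactoryExample.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load factory: " + className, e);
        }
    }
}
```

With a lookup of this shape, tests could call `HttpClientFactoryLocator.resolve().newInstance()` instead of constructing a specific client class, which is the default factory pattern suggested in item 3.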

          bryanthompson added a comment -

          - Improved the factory mechanisms for the HttpClient.
          - Added options for configuring the default HttpClient behavior, including the SSL keystore path and whether it supports redirects.
          - Supported autoclose of the HttpClient by the RemoteRepositoryManager.
          - Renamed JettyRemoteRepositoryXXX => RemoteRepositoryXXX to keep the configuration option names the same as before the jetty refactor.
          - Tested blueprints, webapp, remote GOM layer, and some HA tests locally.

          Committed revision a7289d7575a435c3760e8d808e39236235e23ece.

          Note: The "ant bigdata-client" client task is pulling in a lot of things outside of the webapp client package. This needs to be chased down.
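The autoclose support mentioned in the list above can be pictured as an ownership flag on the manager. The sketch below is one plausible reading, not the committed code: `ManagerExample` and `CountingClientExample` are hypothetical names, and the real RemoteRepositoryManager constructor is not shown in this ticket.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical manager that closes its client only when autoClose is set. */
final class ManagerExample implements AutoCloseable {
    private final AutoCloseable client;
    private final boolean autoClose;

    ManagerExample(AutoCloseable client, boolean autoClose) {
        this.client = client;
        this.autoClose = autoClose;
    }

    @Override
    public void close() {
        if (!autoClose) {
            return; // the caller keeps ownership of the client
        }
        try {
            client.close();
        } catch (Exception e) {
            throw new IllegalStateException("Failed to close client", e);
        }
    }
}

/** Hypothetical client that records how often it was closed. */
final class CountingClientExample implements AutoCloseable {
    final AtomicInteger closeCalls = new AtomicInteger();

    @Override
    public void close() {
        closeCalls.incrementAndGet();
    }
}
```

A caller that shares one client across several managers would pass `autoClose = false` and close the client itself, while a caller that creates a client for a single manager could pass `true` and close only the manager.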

          bryanthompson added a comment -

          Addressing issues with bloat in the bigdata-client jar:

          - Moved the BigdataSailNSSWrapper into the webapp directory. This class accesses the BigdataSail, NanoSparqlServer, etc. These are not client classes and cannot be referenced from within the webapp.client package. This reduced the jar size from 5.4 MB to 94 KB.
          - Martyn is going to review the JettyResponseListener. It appears to have broken semantics for getInputStream(boolean).
          - We also need to review the wiki page: http://wiki.bigdata.com/wiki/index.php/JettyHttpClient#The_Jetty_HttpClient
          - We should schedule a review with Brad on the security and delegation aspects.

          Commit f59514f05e347bf1ac505d8514aad8f3bb24f850

          bryanthompson added a comment -

          Work on error handling in the jetty listener, proper decoding of the MIME Type and charset, and error handling in the RemoteRepository and the classes that do background graph and tuple result processing. Commit to JETTY_CLIENT_BRANCH2 since not all tests were working after this refactor. Martyn is continuing to work on these points.

          We should develop a stress test that also allows us to examine the error handling behavior (when the client has an error while processing the response and needs to abort the http connection).

          bryanthompson added a comment -

          Martyn fixed the refactor last week (default content encoding and also a re-ordering of the code leading to a deadlock in RemoteRepository.tupleResults()) and submitted to CI... last run was 36 errors, similar to master runs.

          Next steps:
          - done. code review of recent commit leading to sign off on the entire refactor (made changes to tupleResults() error handling, javadoc cleanup of JettyResponseListener).
          - done. merge from master. recheck the NSS test suite to ensure that openrdf 2.7 test suite changes were correctly brought forward.
          - done. performance runs (HA run of BSBM Explore). I have verified no regression on BSBM 100M with this change using the Explore query mixture and the HA3 mode.
          - done. pull request and merge back to master.


            People

            • Assignee: martyncutcher
            • Reporter: bryanthompson