  1. Blazegraph (by SYSTAP)
  2. BLZG-337

ResourceService should use NIO for file and buffer transfers

    Details

      Description

      The ResourceService was originally developed to send index segment files from one node to another when an index partition was moved. As part of the QUADS_QUERY_BRANCH it has been extended so that services may publish ByteBuffer objects containing intermediate results for distributed query processing. Both files and ByteBuffers are identified by UUIDs. Resource transfers currently use the InputStream and OutputStream interfaces for both files and ByteBuffers.

      Java provides mechanisms for transferring data directly from one channel to another (for example, FileChannel.transferTo()). Those mechanisms may be used to transfer the content of a file directly over a socket without copying the data through buffers in the JVM heap. This can result in a significant increase in the throughput of the transfer, with a corresponding reduction in resource demand (CPU, RAM, GC).
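A minimal sketch of the channel-to-channel file transfer described above, using the standard `FileChannel.transferTo()` call. The class and method names (`FileTransferSketch`, `sendFile`) are hypothetical and are not part of the ResourceService; the sketch also assumes the target channel is in blocking mode.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not the actual ResourceService code. */
public final class FileTransferSketch {

    private FileTransferSketch() {
    }

    /**
     * Sends the whole file to the target channel (e.g. a SocketChannel)
     * and returns the number of bytes transferred.
     *
     * Assumes a blocking target: on a non-blocking channel transferTo()
     * may return 0 and this loop would spin.
     */
    public static long sendFile(Path file, WritableByteChannel target)
            throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            final long size = source.size();
            long position = 0;
            // transferTo() may move fewer bytes than requested, so loop
            // until the whole file has been handed to the target.
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }
}
```

Whether the bytes actually bypass user space depends on the operating system and on the type of the target channel; the API only permits that optimization, it does not guarantee it.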

      Likewise, Java provides mechanisms for directly transferring the contents of a ByteBuffer onto a socket channel -- in fact, this was the original purpose of the Java NIO facility.
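A minimal sketch of writing a ByteBuffer onto a channel, as described above. `BufferTransferSketch` and `sendBuffer` are hypothetical names used only for illustration; the target is assumed to be a blocking channel such as a `SocketChannel` in its default mode.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Hypothetical helper, not the actual ResourceService code. */
public final class BufferTransferSketch {

    private BufferTransferSketch() {
    }

    /**
     * Writes the remaining bytes of the buffer to the target channel and
     * returns the number of bytes written. The caller's buffer keeps its
     * position and limit because the write goes through a read-only view
     * that shares the same content.
     */
    public static int sendBuffer(ByteBuffer data, WritableByteChannel target)
            throws IOException {
        final ByteBuffer view = data.asReadOnlyBuffer();
        int written = 0;
        // A single write() may be partial, so loop until the view is drained.
        while (view.hasRemaining()) {
            written += target.write(view);
        }
        return written;
    }
}
```

Writing through a view means the same published buffer could be sent to several readers concurrently without the transfers disturbing each other's position, which is one possible reason to prefer it over writing the buffer itself.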

      The resource service should be modified to use NIO transfers for both files and ByteBuffers. This will improve the performance of the distributed query facility. It will also improve the stability of the distributed query facility by removing the high thread counts associated with large numbers of concurrent transfers among nodes.

      @see https://sourceforge.net/apps/trac/bigdata/ticket/486 (NIO solution set interchange on the cluster)

        Activity

        bryanthompson added a comment -

        Disabling the NIOChunkMessage code paths. I think that this code path might be causing a problem with larger chunk sizes, so I want to test with it disabled. I am going to leave it disabled until it has been more thoroughly validated.

        Committed revision r6025.

        bryanthompson added a comment -

        I have verified that there is a problem with the NIO chunk message / resource service / direct allocator code by disabling it and running LUBM Q9 with chunkCapacity=1000. Without the "NIO" code path in FederationChunkHandler, the query runs fine. With it, bigdata16 gets red hot.


        - This code should have its own test suite and stress test (maybe based on the TestMessage class that I started?)

        - The problem could also be a deadlock arising through exhaustion of the direct buffers (but I have no evidence for this).

        - The problem might be that the NIO messages are queued in bufferReady() and perhaps even being processed long after the LIMIT has cancelled the query.

        - Look at the profiler capture for bigdata16. This was with NIO enabled.
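One way to make the direct-buffer exhaustion hypothesis above testable is to bound the wait for a buffer, so that an exhausted pool shows up as a timeout rather than a silent hang. The following is only a sketch under that assumption: `DirectBufferPoolSketch` is a hypothetical class and does not reflect how the actual direct allocator works.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical fixed-size pool of direct buffers with a timed acquire. */
public final class DirectBufferPoolSketch {

    private final BlockingQueue<ByteBuffer> free;

    public DirectBufferPoolSketch(int bufferCount, int bufferCapacity) {
        free = new ArrayBlockingQueue<>(bufferCount);
        for (int i = 0; i < bufferCount; i++) {
            free.add(ByteBuffer.allocateDirect(bufferCapacity));
        }
    }

    /**
     * Returns a cleared buffer, or throws TimeoutException if none becomes
     * available in time. An unbounded wait here is where a deadlock could
     * hide if every buffer is held by a task that is itself waiting.
     */
    public ByteBuffer acquire(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        final ByteBuffer buffer = free.poll(timeout, unit);
        if (buffer == null) {
            throw new TimeoutException("direct buffer pool exhausted");
        }
        buffer.clear();
        return buffer;
    }

    /** Returns a buffer to the pool; callers should do this in a finally block. */
    public void release(ByteBuffer buffer) {
        if (!free.offer(buffer)) {
            throw new IllegalStateException("released more buffers than the pool holds");
        }
    }

    /** Number of buffers currently available, useful in a stress test. */
    public int available() {
        return free.size();
    }
}
```

A stress test could run many concurrent acquire/release cycles against a small pool and assert that `available()` returns to the initial count once all tasks finish.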

        bryanthompson added a comment -

        Closed. Not relevant to the new architecture.


          People

          • Assignee:
            Unassigned
            Reporter:
            bryanthompson
