Martyn and I have discussed this deadlock in some depth. A few observations:
- The basic issue is that the QuorumWatcher can call into the QuorumClient via either the QuorumMember interfaces (e.g., QuorumPipelineImpl) or the QuorumListener interface. All of those calls are made synchronously, so the called methods cannot reliably request locks without the potential of creating a deadlock.
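To make the hazard concrete, here is a minimal, self-contained illustration of the pattern described above. The names (pipelineLock, onPipelineChange) are hypothetical stand-ins, not the actual API. In the real deadlock the listener would call lock() and block forever; the demo uses tryLock with a timeout so it reports the contention instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a watcher thread delivers an event synchronously
// while another thread holds the pipeline lock. If the listener callback
// then requests that same lock, the two threads can deadlock.
public class SyncCallbackHazard {

    static final ReentrantLock pipelineLock = new ReentrantLock();

    /** Listener invoked synchronously from the watcher thread. */
    static boolean onPipelineChange() throws InterruptedException {
        // In the real deadlock this would be an unconditional lock().
        if (pipelineLock.tryLock(100, TimeUnit.MILLISECONDS)) {
            pipelineLock.unlock();
            return true;
        }
        return false; // lock is held by the write-replication thread
    }

    /** Returns false: the callback could not obtain the lock. */
    public static boolean demo() throws Exception {
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            pipelineLock.lock(); // write replication holds the lock...
            try {
                held.countDown();
                done.await();    // ...while (indirectly) waiting on the watcher
            } catch (InterruptedException ignored) {
            } finally {
                pipelineLock.unlock();
            }
        });
        writer.start();
        held.await();
        boolean acquired = onPipelineChange(); // synchronous callback blocks
        done.countDown();
        writer.join();
        return acquired;
    }
}
```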
- ZKQuorumImpl could tunnel and obtain the QuorumPipelineImpl lock before applying the changes from the zk event. However, this is pretty ugly.
- Another approach is to have the pipelineAdd(), pipelineRemove(), and pipelineChange() methods use non-blocking techniques. E.g., those methods could drop a message onto a queue. The write replication methods on that class would then drain the queue, applying the queued messages when they take the QuorumPipelineImpl lock and again before they release it. This behavior could be encapsulated in a lock() and unlock() pattern. This would decouple the pipeline change notification messages from the write replication messages while still maintaining the critical sections imposed by the QuorumPipelineImpl lock today.
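A minimal sketch of that lock()/unlock() pattern, assuming the change notifications can be modeled as queued Runnables (PipelineChangeQueue, postEvent, and drainQueue are illustrative names, not the actual API):

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: watcher callbacks never block on the pipeline lock; they only
// enqueue. Queued changes are applied on lock acquisition and release.
public class PipelineChangeQueue {

    private final ReentrantLock lock = new ReentrantLock();
    private final ConcurrentLinkedQueue<Runnable> events =
            new ConcurrentLinkedQueue<>();

    /** Called from the watcher thread; non-blocking. */
    public void postEvent(Runnable event) {
        events.add(event);
    }

    /** Take the pipeline lock, applying any queued changes first. */
    public void lock() {
        lock.lock();
        drainQueue();
    }

    /** Apply queued changes again, then release the pipeline lock. */
    public void unlock() {
        try {
            drainQueue();
        } finally {
            lock.unlock();
        }
    }

    private void drainQueue() {
        Runnable e;
        while ((e = events.poll()) != null) {
            e.run(); // applied while holding the lock
        }
    }
}
```

The write replication methods would bracket their critical sections with lock()/unlock() on this object instead of the raw lock, so pipeline change events are always applied inside the same critical section they would have needed anyway.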
- Martyn points out that we could use a ReadWriteLock in AbstractQuorum, taking only the read lock when testing the Quorum state and the write lock when modifying it. This could potentially increase concurrency and seems reasonable. We would have to review the public API, since a read lock cannot be upgraded into a write lock; the write lock would therefore have to be taken at the top level, or we could create another kind of lock-ordering problem.
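The discipline Martyn is proposing might look like the following (QuorumState, token, and the method names are illustrative, not AbstractQuorum's actual fields):

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: shared read lock for testing the quorum state, exclusive
// write lock for changing it.
public class QuorumState {

    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private long token = -1L; // -1 := no quorum

    /** Testing the state needs only the shared read lock. */
    public boolean isQuorumMet() {
        rwLock.readLock().lock();
        try {
            return token != -1L;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    /**
     * Modifying the state takes the exclusive write lock. Because a
     * read lock cannot be upgraded, callers must NOT already hold the
     * read lock here -- the write lock is taken at the top level.
     */
    public void setToken(long newToken) {
        rwLock.writeLock().lock();
        try {
            token = newToken;
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}
```

Note that ReentrantReadWriteLock does support downgrading (acquire read while holding write, then release write), which is another reason the write lock must be the one taken at the outermost level.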
- We observed a potential issue where the last follower in the pipeline would hold the QuorumPipelineImpl lock while receiving the payload from the upstream service. We have pushed the logic into a static ReceiveTask (similar to the SendBufferTask and the ReceiveAndReplicateTask). The lock is now held only while setting up that task; the task itself runs without the lock.
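The shape of that fix can be sketched as follows. ReceiveExample and the ReceiveTask body here are simplified stand-ins for the actual classes; the point is only that the lock brackets task setup, not the transfer itself:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: the pipeline lock guards only the setup of the receive task;
// the (possibly long-running) payload transfer runs without it.
public class ReceiveExample {

    private final ReentrantLock pipelineLock = new ReentrantLock();
    private final ExecutorService executor = Executors.newCachedThreadPool();

    /** Stand-in for the static ReceiveTask; returns bytes received. */
    static class ReceiveTask implements Callable<Integer> {
        private final byte[] payload;
        ReceiveTask(byte[] payload) { this.payload = payload; }
        @Override
        public Integer call() {
            // Receives the payload from the upstream service. Runs
            // WITHOUT the pipeline lock, so watcher events that need
            // the lock are not blocked behind the transfer.
            return payload.length;
        }
    }

    public Future<Integer> receive(byte[] payload) {
        pipelineLock.lock(); // held only while setting up the task
        try {
            return executor.submit(new ReceiveTask(payload));
        } finally {
            pipelineLock.unlock();
        }
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```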