There is a potential race condition in LatchedExecutor, which I suspect is the cause of the issue.
T1 holds the semaphore permit.
T2 has called execute(task).
T1: queue.poll() – returns null
T2: semaphore.tryAcquire() – fails since T1 still holds the permit
If execution interleaves in that order, the newly submitted work goes unnoticed: T1 has already seen an empty queue, and T2 could not acquire the permit to run the task itself, so the LatchedExecutor fails to execute the pending task.
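
To make the interleaving concrete, here is a minimal single-threaded replay of the steps above. It is only a sketch, not the actual LatchedExecutor source: the class LatchRaceSketch and the method replay() are hypothetical, and it assumes that execute(task) enqueues the task before calling tryAcquire(), and that the thread holding the permit releases it once poll() returns null.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for the executor's internal state; not the real class.
public class LatchRaceSketch {
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final Semaphore semaphore = new Semaphore(1);

    /**
     * Replays the problematic interleaving on a single thread and returns
     * the number of tasks left in the queue with nobody draining it.
     */
    public int replay() {
        // T1 holds the only permit (it is the current drainer).
        if (!semaphore.tryAcquire()) {
            throw new IllegalStateException("permit should be free at start");
        }

        // T1: queue.poll() returns null because nothing is queued yet.
        Runnable next = queue.poll();

        // T2: execute(task) enqueues the task (assumed to happen before tryAcquire).
        queue.add(() -> { });

        // T2: semaphore.tryAcquire() fails because T1 still holds the permit.
        boolean t2Acquired = semaphore.tryAcquire();

        // T1: having seen an empty queue, releases the permit and returns (assumption).
        if (next == null) {
            semaphore.release();
        }

        // T2 did not get the permit, so it returns without running anything.
        return t2Acquired ? 0 : queue.size();
    }

    public static void main(String[] args) {
        int stranded = new LatchRaceSketch().replay();
        System.out.println("stranded tasks: " + stranded); // prints "stranded tasks: 1"
    }
}
```

In this replay the permit is free again at the end, yet one task is still queued and no thread is left to notice it until some later execute() call happens to arrive.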
The following code reproduces the issue: