Type: New Feature
Affects Version/s: BIGDATA_RELEASE_1_2_2
Fix Version/s: None
The HAJournal startup procedure should be changed to be faster and more robust.
The current procedure has some (non-fatal) weaknesses:
- Reading the existing HALog files on startup and building the in-memory index of those HALog files is a time-consuming, sequential process. This process currently blocks the NSS startup since it occurs within the HAJournal constructor. That means that the /status page is not available during startup.
1. If an HALog file is corrupt (bad root blocks, checksum errors, disk rot, etc.), then the HAJournal will refuse to start. Instead, it should automatically re-replicate the bad or missing HALog files from the quorum leader.
2. The startup procedure is willing to allow the last HALog file to be bad, in which case it will silently delete that HALog file. This covers the case where an open transaction was never committed on that service, such that the closing root block was not written on the HALog file.
3. done: There is also one case in which the current startup procedure would be incorrect, forcing operator intervention in a case that should otherwise be handled in the same manner as (1) above. The silent removal of the last HALog (when bad) would be an error if a sudden kill or power failure occurred after writing the root block on the journal and before writing the root block on the HALog for the same commit point. In this case, the only correct actions are to either (a) re-replicate the missing HALog from the quorum leader; or (b) roll back to the alternative root block on the service, which will force the service to RESYNC, and RESYNC will then obtain the missing HALog from the quorum leader. (Thus, both options amount to the same thing, and I would prefer the former since it also handles bad HALog files for other commit points.)
4. During the HALog scan, the Jetty endpoint is not up. This means that the HA load balancer is not yet operating on that service. The Jetty endpoint is not started until we have the Journal object, since the servlet container initialization needs that IIndexManager reference. There is also logic in BigdataRDFServletContextListener that is responsible for the default KB creation. This needs to be untangled to allow the NSS to come up as early as possible, so that the HA load balancer can at least proxy requests to other services before this service is fully online.
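One way to address the blocking scan and item (1) together is sketched below. This is an illustration only, not the HAJournal code: the class `HALogScanSketch`, the `ScanResult` type, and the `isValid` predicate are all hypothetical names, and HALog files are represented as plain strings. The scan still runs sequentially, but it runs off the constructor thread and records bad files for later re-replication rather than refusing to start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names exist in the bigdata code base. */
final class HALogScanSketch {

    /** Outcome of a scan: files that validated and files to re-request. */
    static final class ScanResult {
        final List<String> good = new ArrayList<>();
        final List<String> needsReplication = new ArrayList<>();
    }

    /**
     * Scans the files in order. A bad file never aborts the scan; it is
     * recorded so it can be requested from the quorum leader later.
     */
    static ScanResult scan(List<String> halogFiles, Predicate<String> isValid) {
        ScanResult result = new ScanResult();
        for (String file : halogFiles) {
            boolean ok;
            try {
                ok = isValid.test(file);
            } catch (RuntimeException e) {
                // Assumption: a validation failure is treated like a bad file.
                ok = false;
            }
            (ok ? result.good : result.needsReplication).add(file);
        }
        return result;
    }

    /**
     * Starts the scan on a background thread so the caller (for example a
     * constructor) returns immediately and the web endpoint can start.
     */
    static CompletableFuture<ScanResult> scanAsync(
            List<String> halogFiles, Predicate<String> isValid) {
        return CompletableFuture.supplyAsync(() -> scan(halogFiles, isValid));
    }
}
```

With this shape, a status page could report scan progress by inspecting the future, and code that needs the HALog index would wait on it only at the point where the index is actually required.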
The incorrect behavior for (3) can be fixed by constraining the code to only perform that procedure if the HALog file is for a commit point that is not captured within the current root block on the journal. This makes the startup procedure correct, but still fails to handle bad/missing HALog files for historical transactions automatically and still causes high latency in the startup procedure when there are a lot of HALog files in the file system.
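The constrained rule can be written as a small decision function. The sketch below is a hypothetical illustration, not the actual fix: `LastHALogPolicy`, `Action`, and the parameters are invented names. It assumes each HALog is identified by the commit counter of the commit point it would close, and that the journal's current root block exposes its own commit counter.

```java
/** Hypothetical sketch of the constrained rule for the last HALog file. */
final class LastHALogPolicy {

    enum Action { KEEP, DELETE, REPLICATE_FROM_LEADER }

    /**
     * @param halogIsValid         whether the last HALog passed validation
     * @param halogCommitCounter   commit counter the HALog would close with (assumed)
     * @param journalCommitCounter commit counter in the journal's current root block
     */
    static Action forLastHALog(boolean halogIsValid,
                               long halogCommitCounter,
                               long journalCommitCounter) {
        if (halogIsValid) {
            return Action.KEEP;
        }
        if (halogCommitCounter > journalCommitCounter) {
            // The journal never committed this write set, so the HALog
            // belongs to an abandoned transaction and may be discarded.
            return Action.DELETE;
        }
        // The journal already captures this commit point. Deleting the
        // HALog would lose history, so it must come from the leader.
        return Action.REPLICATE_FROM_LEADER;
    }
}
```

For example, with the journal at commit counter 5, a bad HALog for commit point 6 yields `DELETE`, while a bad HALog for commit point 5 yields `REPLICATE_FROM_LEADER`, which is the power-failure case described in (3).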
Note: If there are HALog files for commit points beyond the most recent commit point on the journal, then those HALog files will be applied to roll forward the journal. This is done by HAJournalServer in its RESTORE state.
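As a rough illustration of the roll-forward selection, the sketch below picks which HALog commit points would be applied, given the journal's current commit counter. The names `RollForwardSketch` and `commitPointsToApply` are hypothetical. Stopping at the first gap in the sequence is an assumption made here for illustration (write sets cannot be applied out of order); it is not taken from the HAJournalServer implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch; not the HAJournalServer RESTORE logic. */
final class RollForwardSketch {

    /**
     * Returns the commit counters to apply, in order, starting at the
     * commit point immediately after the journal's current one.
     * Assumption: the plan stops at the first missing commit point.
     */
    static List<Long> commitPointsToApply(long journalCommitCounter,
                                          Iterable<Long> halogCommitCounters) {
        TreeSet<Long> available = new TreeSet<>();
        for (Long counter : halogCommitCounters) {
            available.add(counter);
        }
        List<Long> plan = new ArrayList<>();
        long next = journalCommitCounter + 1;
        while (available.contains(next)) {
            plan.add(next);
            next++;
        }
        return plan;
    }
}
```

HALog files at or before the journal's commit counter are ignored by this selection, since the journal already reflects those commit points.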
Note: The HALog progress is not displayed on the /status page during RESTORE either, e.g., when rolling forward from a snapshot through a large number of commit points before entering RESYNC and then RUNMET.
See BLZG-1027 (Support auto-replication of bad or missing HALogs during startup).