  1. Blazegraph (by SYSTAP)
  2. BLZG-307

MetadataService & DataService do not restart if configured for the same node as the TransactionService

    Details

    • Type: Bug
    • Status: Closed - Won't Fix
    • Resolution: Incomplete
    • Affects Version/s: trunk
    • Fix Version/s: None
    • Component/s: Bigdata Federation
    • Labels:
      None

      Description

      In each of the provided configuration files (bigdataCluster16.config,
      bigdataCluster.config, and bigdataStandalone.config), the
      MetadataService and the DataService are each configured with an
      IServiceConstraint array that includes a TXRunningConstraint
      element. As a result, the ServicesManagerService will not start
      either service unless a TransactionService can be discovered.

      When a bigdata federation is restarted, the runOnce() method of
      RestartPersistentServices is called, which attempts to restart
      each of the previously started services. To do this, the method
      first retrieves the configuration of each service to be
      restarted from zookeeper (line 160). It then loops through the
      retrieved list of configurations, calling the restartIfNotRunning()
      method of MonitorCreatePhysicalServiceLocksTask for each
      physicalServiceZNode of a given configuration.

      If the service configuration for the MetadataService and/or the
      DataService appears in the retrieved list before the service
      configuration for the TransactionService, and the
      TransactionService is configured to be restarted on the same node
      as the MetadataService and/or the DataService, then the
      MetadataService and/or the DataService will not be restarted.
      When MonitorCreatePhysicalServiceLocksTask.restartIfNotRunning()
      is called with the MetadataService or DataService configuration
      (before being called with the TransactionService configuration),
      the call to serviceConfig.canStartService() at line 742 returns
      false because that call, among other things, tests whether a
      TransactionService has been discovered. Since the
      TransactionService is to be started on the same node, but only
      after the MetadataService and DataService have been dealt with,
      it cannot be discovered at that point, and so the MetadataService
      and/or the DataService are not restarted.
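The ordering problem described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the classes ServiceConfig and RestartOrderSketch are hypothetical stand-ins, not the actual bigdata classes, and a service is treated as "discovered" the moment it is started, which simplifies the real discovery process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RestartOrderSketch {

    /** Hypothetical stand-in for a service configuration read from zookeeper. */
    static final class ServiceConfig {
        final String name;
        final boolean requiresTx; // models a TXRunningConstraint on this service

        ServiceConfig(String name, boolean requiresTx) {
            this.name = name;
            this.requiresTx = requiresTx;
        }

        /** Simplified constraint check: only the transaction service constraint is modelled. */
        boolean canStartService(Set<String> discovered) {
            return !requiresTx || discovered.contains("TransactionService");
        }
    }

    static ServiceConfig cfg(String name, boolean requiresTx) {
        return new ServiceConfig(name, requiresTx);
    }

    /** Single pass over the configurations in list order, as in a one-shot restart. */
    static List<String> restartOnce(List<ServiceConfig> configs) {
        Set<String> discovered = new HashSet<>();
        List<String> started = new ArrayList<>();
        for (ServiceConfig c : configs) {
            if (c.canStartService(discovered)) {
                // Assumption: a started service is immediately discoverable.
                discovered.add(c.name);
                started.add(c.name);
            }
            // Otherwise the service is skipped and never revisited in this pass.
        }
        return started;
    }

    public static void main(String[] args) {
        // TransactionService listed last: the constrained services are skipped.
        System.out.println(restartOnce(Arrays.asList(
                cfg("MetadataService", true),
                cfg("DataService", true),
                cfg("TransactionService", false))));
        // prints [TransactionService]

        // TransactionService listed first: all three services start.
        System.out.println(restartOnce(Arrays.asList(
                cfg("TransactionService", false),
                cfg("MetadataService", true),
                cfg("DataService", true))));
        // prints [TransactionService, MetadataService, DataService]
    }
}
```

The outcome depends only on the position of the TransactionService configuration in the list, because a skipped service is not retried within the same pass.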

      Note that when the services are started from a clean install
      (no zookeeper state from a previous start), the MetadataService
      and DataService are started successfully. This is because a
      clean install takes a different path for starting the services
      than a federation restart does, one in which zookeeper is not
      consulted.

        Activity

        bryanthompson added a comment -

        This is the same as http://sourceforge.net/apps/trac/bigdata/ticket/94.

        bryanthompson added a comment -

        Committed revision 3418 implements a workaround for the problem reported by
        https://sourceforge.net/apps/trac/bigdata/ticket/111 and
        https://sourceforge.net/apps/trac/bigdata/ticket/94. The underlying problem
        is described in https://sourceforge.net/apps/trac/bigdata/ticket/111 and is
        a problem in the rules-based service startup logic. The workaround removes
        the TXRunningConstraint from the configuration files and modifies the
        StoreManager to wait for the transaction service to be discovered before
        proceeding with its startup. The changes made by this workaround are
        appropriate and close out https://sourceforge.net/apps/trac/bigdata/ticket/94,
        but I am going to leave https://sourceforge.net/apps/trac/bigdata/ticket/111
        open since it documents the underlying problem, which has not been resolved.

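The "wait for the transaction service to be discovered" idea in the workaround can be sketched as a bounded polling loop. This is a hypothetical illustration, not the code from revision 3418: the class AwaitTxServiceSketch, the awaitDiscovery method, and the Supplier-based lookup are all assumptions introduced here, and the real StoreManager may wait in a different way.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class AwaitTxServiceSketch {

    /**
     * Polls the (hypothetical) lookup until it returns a non-null service
     * or the timeout elapses.
     */
    static <T> T awaitDiscovery(Supplier<T> lookup, long timeout, TimeUnit unit,
            long pollMillis) throws InterruptedException, TimeoutException {
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            T service = lookup.get();
            if (service != null) {
                return service; // discovered: startup may proceed
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("transaction service not discovered");
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated lookup: the service becomes visible on the third poll.
        AtomicInteger polls = new AtomicInteger();
        Supplier<String> lookup =
                () -> polls.incrementAndGet() >= 3 ? "TransactionService" : null;

        String tx = awaitDiscovery(lookup, 5, TimeUnit.SECONDS, 10);
        System.out.println("discovered " + tx + " after " + polls.get() + " polls");
    }
}
```

Moving the wait into the dependent service's own startup, instead of gating the start on a constraint, means the order in which configurations are processed no longer decides whether the service is started at all.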
        bryanthompson added a comment -

        Closed. Not relevant to the new architecture.

          People

          • Assignee: Unassigned
          • Reporter: btmurphy