PP-685: Provide a coherent interface for managing configuration and other data


#1

This message is to inform the community that we will commence discussion of PP-685, which covers providing a coherent interface for managing configuration and other data. A design document does not yet exist for this functionality, but it has been mentioned in a separate thread.


PP-288: Asynchronous logging option for the daemons
#2

I will attempt to summarize some developer discussions that have taken place outside of the forum regarding the future direction of configuration data in PBS Pro. As @billnitzberg mentioned in this thread, the intent is to maintain the vast majority of configuration data using qmgr. That doesn’t mean that pbs.conf will be going away completely, but it does mean that much of the data currently held in pbs.conf, mom_priv/config, and sched_priv/sched_config would migrate to a central store maintained by the PBS Pro server.

Credit to @subhasisb and others for their input, much of which has been cut-and-pasted in this message.

It has been proposed that PBS Pro standardize on a single interface to manage nearly all of its configuration data. The qmgr interface has emerged as the best candidate given its existing support for a variety of configuration objects pertaining to the server, scheduler, queues, and nodes. There are several benefits to this approach:

  1. A single interface and syntax for managing configuration parameters across the various PBS Pro objects (e.g. server, nodes, queues, etc.) gives a consistent design that is easier to learn, understand, and support. This is the primary use case. When configuration is distributed across files, the format and syntax of those files often differ significantly from each other, which makes learning and maintenance more difficult. A single interface and a common object-oriented syntax make things uniform.

  2. Architecturally, this provides a single place where shared validation routines can be executed on updates, improving the overall organization, modularity, and maintainability of the code.

  3. A change via qmgr is immediately actionable (i.e., PBS Pro immediately knows about the configuration change and can act on it), whereas with a configuration file the daemons usually have to poll for changes or be notified by a signal. Such mechanisms do exist today, but their implementations vary between daemons. Using qmgr makes this uniform. Note that many configuration parameters do not change frequently (or at least not within the lifetime of an object) and would not benefit from this aspect of using qmgr.

  4. When a configuration parameter change is initiated via qmgr, it can be validated immediately and feedback can be provided to the user/caller. This is not so straightforward with a file-based approach. The daemon must be signaled to reload the configuration, and it can usually only emit an error message to its log file (indicating a configuration error), which then has to be parsed to determine the cause. With qmgr, the caller knows immediately whether the change succeeded and, if it did not, receives the error code.

  5. The configuration is centrally managed and can be distributed automatically by PBS to wherever it is needed, for example to a specific host (when the configuration belongs to a host) or to a specific scheduler (when there are multiple schedulers). This removes the need to make individual changes on a particular host or to use an external tool to propagate configuration changes to a large number of objects (say, hosts).

  6. Since the configuration is centrally stored, replicating it or backing it up becomes much easier. This allows for future architectures involving multi-master designs, where configuration is read from a central repository instead of being copied as files to each replicated daemon. Kubernetes, for example, takes this approach by keeping its configuration in a central etcd-backed store.

  7. qmgr is easily scriptable, so tooling is simple.

  8. Any configuration changes made via qmgr may be logged immediately, which could help when diagnosing problems.
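
To make points 2, 4, and 8 concrete, here is a minimal Python sketch of a central store that validates each update in one place, returns the result to the caller immediately, and records accepted changes. It is purely illustrative and is not PBS Pro code: the `ConfigStore` class, the validator functions, and the attribute names used in the example are assumptions made for this post.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical validator type: returns an error string, or None if the value is valid.
Validator = Callable[[str], Optional[str]]


def is_bool(value: str) -> Optional[str]:
    """Accept only True/False (case-insensitive)."""
    return None if value.lower() in ("true", "false") else "expected True or False"


def positive_int(value: str) -> Optional[str]:
    """Accept only integers greater than zero."""
    try:
        ok = int(value) > 0
    except ValueError:
        ok = False
    return None if ok else "expected a positive integer"


@dataclass
class ConfigStore:
    """Hypothetical central store keyed by (object type, object name, attribute)."""
    validators: Dict[Tuple[str, str], Validator]
    values: Dict[Tuple[str, str, str], str] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def set(self, obj_type: str, obj_name: str, attr: str, value: str) -> Tuple[bool, str]:
        """Validate and apply one change; the caller gets the outcome right away."""
        validator = self.validators.get((obj_type, attr))
        if validator is None:
            return False, f"unknown attribute {attr!r} for {obj_type}"
        error = validator(value)
        if error is not None:
            return False, f"invalid value for {attr}: {error}"
        self.values[(obj_type, obj_name, attr)] = value
        # Only accepted changes are recorded, so the log reflects the live configuration.
        self.audit_log.append(f"set {obj_type} {obj_name} {attr} = {value}")
        return True, "ok"


if __name__ == "__main__":
    # Attribute names here are illustrative only.
    store = ConfigStore(validators={
        ("server", "scheduling"): is_bool,
        ("queue", "max_running"): positive_int,
    })
    print(store.set("server", "default", "scheduling", "True"))
    print(store.set("queue", "workq", "max_running", "-3"))
    print(store.audit_log)
```

The contrast with a file-based approach is that the rejected update never reaches the stored configuration, and the caller does not have to search a daemon log to find out why.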

Additional notes:

  • Not all configuration data would be involved, but rather the vast majority. MoMs still need to know where to “phone home”, so a minimal pbs.conf is still required (e.g. PBS_SERVER=foo).
  • PBS Pro used to be able to produce a qmgr listing of all configuration data that could later be imported back into PBS via qmgr. It would be convenient to run a qmgr “list all” and then feed the output into a fresh install to recreate the previous configuration. Today, we have to list many different things (e.g. hooks, sched, server, etc.) to capture all configuration data, and certain values, such as default qsub arguments, are not quoted properly. A complete, replayable listing would be very valuable when the schema changes, since we could recreate everything in an unpopulated DB with the new schema.
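
As a rough illustration of the quoting problem in the last bullet, the Python sketch below renders attribute values as `set` directives that could be fed back into qmgr. The helper names `quote_qmgr_value` and `render_set_directives` are hypothetical, and the quoting rule (quote anything that is empty or contains whitespace, commas, `#`, or quote characters) is an assumption that should be checked against the actual qmgr parser before anyone relies on it.

```python
from typing import Dict, List


def quote_qmgr_value(value: str) -> str:
    """Quote a value for replay (assumed rule, not taken from the qmgr source)."""
    needs_quotes = value == "" or any(
        ch.isspace() or ch in ",#\"'" for ch in value
    )
    if not needs_quotes:
        return value
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Mixing both quote styles is left unhandled in this sketch.
    raise ValueError("value contains both quote styles")


def render_set_directives(obj_type: str, obj_name: str, attrs: Dict[str, str]) -> List[str]:
    """Render one 'set' directive per attribute, in insertion order."""
    # The server object is addressed without a name, hence the strip().
    target = f"{obj_type} {obj_name}".strip()
    return [
        f"set {target} {name} = {quote_qmgr_value(value)}"
        for name, value in attrs.items()
    ]


if __name__ == "__main__":
    for line in render_set_directives(
        "server", "", {"default_qsub_arguments": "-l select=1 -m n"}
    ):
        print(line)
```

If a “list all” emitted every object this way, the output could in principle be replayed unchanged against a fresh install.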

Community feedback is both welcome and encouraged!