
11.1 LoadLeveler Overview

The entire collection of machines available for LoadLeveler scheduling is called a "pool". Note that this is NOT the same as the processor or node pools discussed earlier. The LoadLeveler "pool" is the group of nodes that LoadLeveler manages. On Cornell's SP2, these are the nodes in the subpools called "batch". (A subpool is a smaller division of a larger node pool.) On some smaller machines, such as the 16-node SP2 at ORNL, the LoadLeveler pool includes every node on the machine.

Every machine in the pool has one or more LoadLeveler daemons running on it.

The LoadLeveler pool has one Central Manager (CM) machine, whose principal function is to coordinate LoadLeveler-related activities on all machines in the pool. The CM maintains status information on all machines and jobs, and decides where jobs should run. If the Central Manager machine goes down, job information is not lost. Jobs executing on other machines continue to run, and jobs waiting to run start once the CM is restarted. New jobs may still be submitted from other machines while the CM is down; they are dispatched when the CM is restarted. Normally, users do not even need to know about the Central Manager.
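
The failure behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not LoadLeveler code: the class ToyPool, its methods, and the node and job names are all hypothetical, and each machine is assumed to run at most one job at a time. It shows only that running jobs survive a Central Manager outage, that submissions are still accepted, and that waiting jobs are dispatched after the restart.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Toy model of a LoadLeveler pool; every name here is hypothetical."""
    machines: list
    cm_up: bool = True
    running: dict = field(default_factory=dict)   # job name -> machine name
    waiting: list = field(default_factory=list)   # jobs not yet dispatched

    def submit(self, job):
        # Submission is accepted even while the Central Manager is down.
        self.waiting.append(job)
        self.dispatch()

    def dispatch(self):
        # Only the Central Manager decides where waiting jobs run.
        if not self.cm_up:
            return
        busy = set(self.running.values())
        free = [m for m in self.machines if m not in busy]
        while self.waiting and free:
            self.running[self.waiting.pop(0)] = free.pop(0)

    def cm_fail(self):
        # Running jobs are untouched and the waiting queue is kept.
        self.cm_up = False

    def cm_restart(self):
        self.cm_up = True
        self.dispatch()


pool = ToyPool(machines=["node1", "node2"])
pool.submit("jobA")    # dispatched to node1
pool.cm_fail()
pool.submit("jobB")    # accepted, but held in the waiting queue
print(pool.running, pool.waiting)
pool.cm_restart()      # jobB is dispatched now
print(pool.running, pool.waiting)
```

In this model, jobA stays on node1 throughout the outage, while jobB waits until cm_restart() is called.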