Batch, Cluster Processing

Introduction

While TORQUE is flexible enough to handle scheduling a conference room, it is primarily used in batch systems. A batch system is a collection of computers and other resources (networks, storage systems, license servers, and so forth) that operates under the notion that the whole is greater than the sum of the parts. Some batch systems consist of just a handful of machines running single-processor jobs, minimally managed by the users themselves. Others have many thousands of machines executing users' jobs simultaneously while tracking software licenses and access to hardware and storage systems.

Pooling resources in a batch system typically reduces technical administration of resources while offering a uniform view to users. Once configured properly, batch systems abstract away many of the details involved with running and managing jobs, allowing higher resource utilization. For example, users typically only need to specify the minimal constraints of a job and do not need to know the names of the individual machines on which it runs. With this uniform, abstracted view, batch systems can execute many thousands of jobs simultaneously.

What is a Resource Manager?

A resource manager such as TORQUE coordinates the components that make up a batch system:

  1. Master Node - A batch system will have a master node where pbs_server runs. Depending on the needs of the systems, a master node may be dedicated to this task, or it may fulfill the roles of other components as well.
  2. Submit/Interactive Nodes - Submit or interactive nodes provide an entry point to the system for users to manage their workload. From these nodes, users can submit and track their jobs. Additionally, some sites have one or more nodes reserved for interactive use, such as testing and troubleshooting environment problems. These nodes have the client commands (such as qsub and qhold) installed.
  3. Compute Nodes - Compute nodes are the workhorses of the system. Their role is to execute submitted jobs. On each compute node, pbs_mom runs to start, kill, and manage submitted jobs, communicating with pbs_server on the master node. Depending on the needs of the system, a compute node may double as the master node (or fill other roles as well).
  4. Resources - Some systems are organized for the express purpose of managing a collection of resources beyond compute nodes. Resources can include high-speed networks, storage systems, license managers, and so forth. Availability of these resources is limited and needs to be managed intelligently to promote fairness and increased utilization.
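
On a small system, a single machine can fill all of these roles at once. A minimal single-host sketch is shown below; the hostname "node01" and processor count are illustrative, not taken from this document.

```shell
# Minimal single-node TORQUE layout: the same host runs pbs_server
# (master node), pbs_mom (compute node), and accepts qsub (submit node).
# Hostname "node01" is an example value.

# Tell pbs_server about its one compute node:
echo "node01 np=4" >> /var/spool/torque/server_priv/nodes

# Start the server and MOM daemons on the same machine:
pbs_server
pbs_mom

# From the same host, acting as a submit node:
echo "hostname" | qsub
```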

Master Node Install -- with make

  1. Download the TORQUE distribution file from http://clusterresources.com/downloads/torque
  2. Extract and build the distribution on the machine that will act as the “TORQUE server” - the machine that will monitor and control all compute nodes by running the pbs_server daemon.
    1. tar -xzvf torqueXXX.tar.gz
    2. cd torqueXXX
    3. ./configure
    4. make
    5. make install
  3. To configure TORQUE on each compute node, run make packages. Copy the resulting self-extracting packages to the other machines and install them with the following...
    > cp torque-package-mom-linux-i686.sh /shared/storage
    > cp torque-package-clients-linux-i686.sh /shared/storage
    > dsh /shared/storage/torque-package-mom-linux-i686.sh --install
    > dsh /shared/storage/torque-package-clients-linux-i686.sh --install
    
  4. Copy the pbs_mom, pbs_server, and pbs_sched startup scripts from $BUILDROOT/contrib/init.d/ to /etc/init.d/.
  5. Symlink the startup/shutdown scripts to /etc/rc.d/rc3.d/
    sudo ln -s /etc/init.d/pbs_mom /etc/rc.d/rc3.d/S97pbs_mom
    sudo ln -s /etc/init.d/pbs_server /etc/rc.d/rc3.d/S98pbs_server
    sudo ln -s /etc/init.d/pbs_sched /etc/rc.d/rc3.d/S99pbs_sched
    sudo ln -s /etc/init.d/pbs_sched /etc/rc.d/rc3.d/K13pbs_sched
    sudo ln -s /etc/init.d/pbs_server /etc/rc.d/rc3.d/K14pbs_server
    sudo ln -s /etc/init.d/pbs_mom /etc/rc.d/rc3.d/K15pbs_mom
    

Master Node Install -- with RPMs

This is the CCR preferred method, as it avoids root setuid problems. All steps must be run while *logged in* as root.

  1. Download the TORQUE distribution file from http://clusterresources.com/downloads/torque
  2. Ensure the kernel-devel and rpm-build packages are installed.
  3. Copy the tarball to /usr/src/redhat/SOURCES and untar it.
  4. Configure torque with sudo ./configure --prefix=/usr --exec-prefix=/usr --libdir=/usr/lib64 --with-server-home=/var/spool/pbs --with-default-server=magic.buffalo.edu.
  5. Build the RPMs with sudo rpmbuild -ba torque.spec.
  6. Create the torque-addons.spec file in /usr/src/redhat/SPECS/. Copy the following files (tweaking as needed) into /usr/src/redhat/SOURCES.
    1. mom_config
    2. nodefile
    3. pbs.sh
    4. pbs.csh
  7. Build the Add-on RPM with /usr/bin/rpmbuild -ba torque-addons.spec.
  8. Install all RPMs located in /usr/src/redhat/RPMS/x86_64 on the head node.
  9. Be sure to turn off the pbs_mom and pbs_sched services with /sbin/chkconfig if you are going to be running maui (and no workers) on the head node.
  10. Install the torque-mom-2.3.5-1cri, torque-2.3.5-1cri, torque-addons-mom-1.00-20.CSE.u2, torque-client-2.3.5-1cri, and torque-devel-2.3.5-1cri RPMs on each intended node.
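
The head-node portion of the steps above can be sketched as a single shell sequence. This is a consolidated sketch, not a verbatim procedure: the tarball name, install command, and chkconfig calls are assumptions based on the package versions listed in step 10.

```shell
# Sketch of the RPM-based master node install (run as root).
# Paths assume the RHEL-style /usr/src/redhat build tree used above.
yum install -y kernel-devel rpm-build

# Stage and unpack the source (filename is illustrative):
cp torque-2.3.5.tar.gz /usr/src/redhat/SOURCES/
cd /usr/src/redhat/SOURCES && tar -xzf torque-2.3.5.tar.gz
cd torque-2.3.5

# Configure and build the RPMs:
./configure --prefix=/usr --exec-prefix=/usr --libdir=/usr/lib64 \
            --with-server-home=/var/spool/pbs \
            --with-default-server=magic.buffalo.edu
rpmbuild -ba torque.spec

# Install the resulting RPMs on the head node:
rpm -ivh /usr/src/redhat/RPMS/x86_64/torque-*.rpm

# If maui will schedule and the head node runs no worker jobs:
/sbin/chkconfig pbs_mom off
/sbin/chkconfig pbs_sched off
```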

Compute Node Install

  1. Update /etc/hosts (as per the CADI Compute Cluster Administration Documentation) and /var/spool/torque/server_priv/nodes files on acl-primary to reflect the addition of a new host.
  2. Run the shell script /var/local/adm/distributed_scripts/torque/install_torque_client. This will...
    1. Install the self-extracting packages from /shared-space/torque/.
    2. Copy the startup script into /etc/init.d.
    3. Make symbolic links (S99mom and K15mom, for example) in the desired runlevels.
    4. Enable TORQUE as a Service
    5. Copy the configuration file to /var/spool/torque/mom_priv/config.
    6. Start the pbs_mom service
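
The actions performed by install_torque_client can be sketched as follows. The package and config filenames below are illustrative; the actual names come from the script and from /shared-space/torque/.

```shell
# Sketch of what install_torque_client does on a compute node (run as root).
# Filenames are examples, not verbatim from the script.

# 1. Install the self-extracting packages:
/shared-space/torque/torque-package-mom-linux-i686.sh --install
/shared-space/torque/torque-package-clients-linux-i686.sh --install

# 2-4. Install the init script, link it into the runlevels, enable it:
cp /shared-space/torque/pbs_mom /etc/init.d/
ln -s /etc/init.d/pbs_mom /etc/rc.d/rc3.d/S99mom
ln -s /etc/init.d/pbs_mom /etc/rc.d/rc0.d/K15mom
/sbin/chkconfig pbs_mom on

# 5. Install the MOM configuration:
cp /shared-space/torque/mom_config /var/spool/torque/mom_priv/config

# 6. Start the service:
/sbin/service pbs_mom start
```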


Queue Setup

Run the following commands as root from acl-primary, substituting values as necessary.

  1. sudo qmgr -c "create queue cuda queue_type = Execution"
  2. sudo qmgr -c "set queue cuda resources_max.ncpus = 6"
  3. sudo qmgr -c "set queue cuda resources_max.nodect = 3"
  4. sudo qmgr -c "set queue cuda resources_default.nodes = 3"
  5. sudo qmgr -c "set queue cuda enabled = True"
  6. sudo qmgr -c "set queue cuda started = True"
  7. sudo qmgr -c "set queue cuda resources_default.neednodes=cuda"
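
After running the commands above, the queue definition can be verified with qmgr (exact output format varies by TORQUE version):

```shell
# Print the full configuration of the cuda queue:
qmgr -c "list queue cuda"

# Or print the queue settings in a form qmgr can re-load:
qmgr -c "print queue cuda"
```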


Usage
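
A minimal job script targeting the cuda queue defined above might look like the following. The script name, job name, and resource requests are illustrative; adjust them to your workload.

```shell
#!/bin/sh
# example.pbs -- minimal sketch of a job script for the cuda queue.
# All values here are examples.
#PBS -q cuda
#PBS -l nodes=3
#PBS -N example-job
#PBS -j oe

# Jobs start in the home directory; change to the submission directory:
cd $PBS_O_WORKDIR
echo "Running on: $(hostname)"
```

Submit the script with qsub example.pbs and monitor it with qstat.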



Testing

# verify all queues are properly configured
> qstat -q
# view additional server configuration
> qmgr -c 'p s'
# verify all nodes are correctly reporting
> pbsnodes -a
# submit a basic job - DO NOT RUN AS ROOT
> su - testuser
> echo "sleep 30" | qsub
# verify jobs display
> qstat

References