This file is from sasn100.sandia.gov and is
/usr/local/FAQ/NQS_general_info
=======================================================================
Network Queueing System (NQS) Usage Notes 12/05/2002
-----------------------------------------------------------------------
Janus supports the use of NQS for submitting batch jobs. The system
divides the nodes into an NQS batch partition and an interactive
partition. The interactive partition has 140 nodes and is available
at all times (please refer to sasn100:/usr/local/FAQ/janus_good_citizen
for the usage policy on interactive nodes).
Jobs are submitted to the NQS system using the qsub command. At
present, you must be logged in to janus when you execute the qsub
command. You can check on jobs by using the qstat command.
Information on the NQS commands qsub, qstat, and qdel can be obtained
by viewing the man pages for each command.
The unclassified system (janus) is configured with multiple sets of
three queues. Each user may submit jobs to only one set, which is
determined when the user is assigned a user-id on janus:

  Sandia users                     "snl", "snl.day", "snl.big"
  LANL users                       "lanl", "lanl.day", "lanl.big"
  LLNL users                       "llnl", "llnl.day", "llnl.big"
  Users at other DOE sites         "doe", "doe.day", "doe.big"
  University users with the
    ASAP program                   "edu", "edu.day", "edu.big"
  Projects not directly funded
    by the ASCI program            "naq", "naq.day", "naq.big"

When the middle section is present (big mode), all queues run at all
times, regardless of the day or time. When the middle section is not
present (small mode), the .day queues still run at all times, but the
.big queues do not run at all. Queues with no qualifier in the name
(the normal queues) are disabled from 9:45AM MT to 3:00PM MT Monday
through Thursday in small mode. This policy is intended to facilitate
code development efforts using the .day queues in small mode. Jobs
may be submitted to any queue at any time, though jobs submitted to a
queue that is not running will not start until the queue next becomes
active. The batch scheduling algorithm in effect for small mode can
be viewed pictorially at
http://www.sandia.gov/ASCI/Red/usage/ASCIRed_batch_sched_b.pdf
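
The queue rules above can be sketched as a small shell function. This
is only an illustration of the policy as written here, not part of NQS:
the function name queue_is_running and its arguments are hypothetical,
the express and priority queues are not modeled, and the treatment of
the exact boundary minutes (9:45AM counts as disabled, 3:00PM as
enabled) is an assumption.

```shell
#!/bin/sh
# queue_is_running MODE QUEUE DOW HHMM
#   MODE   "big" or "small"
#   QUEUE  queue name, e.g. snl, snl.day, snl.big
#   DOW    day of week, 1 = Monday ... 7 = Sunday (as from `date +%u`)
#   HHMM   Mountain Time as four digits, e.g. 0945 or 1500
# Prints "yes" or "no".
queue_is_running() {
    mode=$1 queue=$2 dow=$3
    # Prefix a 1 and subtract 10000 so that 0945 is not read as octal.
    hhmm=$(( 1$4 - 10000 ))
    if [ "$mode" = big ]; then
        echo yes
        return
    fi
    case $queue in
        *.day) echo yes ;;      # .day queues always run
        *.big) echo no ;;       # .big queues never run in small mode
        *)
            # Normal queues: disabled 9:45AM to 3:00PM, Monday to Thursday.
            if [ "$dow" -le 4 ] && [ "$hhmm" -ge 945 ] && [ "$hhmm" -lt 1500 ]; then
                echo no
            else
                echo yes
            fi
            ;;
    esac
}
```

For example, queue_is_running small snl 2 1000 prints "no" (Tuesday
at 10:00AM), while the same call for snl.day prints "yes".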
A special queue called "express" exists for emergency situations.
Access to this queue is granted only for the duration of the emergency,
to a maximum of one week, and only by special request to
"janus-managers@sandia.gov". The request must be made by a manager
or site point of contact. Jobs in this queue are given a very high
initial priority and therefore run as soon as the requested number of
nodes is available.
A special set of queues called "priority" queues exists for completing
a set of jobs for priority projects. These queues are called
"priority" and "priority.big" and operate with the same limits
as the equivalent queues associated with a normal usage
group [snl,llnl,lanl,doe,naq,edu]. However, jobs in each of the priority
queue types start with a higher queueing priority than jobs in a
"normal" equivalent queue. Access to these queues is based on
established ASCI/ASAP project priorities, and will be granted only
by special request to "janus-managers@sandia.gov" for the duration
needed to complete the priority jobs to a maximum of three months.
The request must be made by a manager or site point of contact.
The "janus-managers" will regularly re-visit whether the users with
current access to the "priority" queues should continue to be granted
that access. Until completion of the priority jobs, users will be
granted access to these queues instead of their normal queues, and
shall notify "janus-help" upon completion of the work so they may be
switched back to their normal queues.
Each user is allowed a maximum of one job at a time in the .day
queues, and two jobs at a time (running or not) in the other queues
(see the summary of queue limits in the table below). NQS enforces
the limit and rejects any qsub request that would exceed it.
More details about usage rules, including rules governing the
interactive nodes, may be found in the FAQ file janus_good_citizen.
The classified system runs with fewer rules and limits. The .big
queues only run in big mode, but the queues with no qualifier run
at all times. There are no .day queues on the classified system.
The priority and express queues operate the same as on the
unclassified side.
An NQS job consists of a short script, which contains commands to
launch your job. The script may also contain general OSF commands,
which run on the service partition. For example, a simple application
is usually started on 16 nodes using this yod command:

    yod -sz 16 hello.world

The NQS script ("doit") to launch the same application is as follows:

    #! /bin/sh
    date
    cd cougar-src/hello
    yod hello.world

NQS starts all jobs in your home directory, so you may need to issue
a cd command from within your NQS script to change to the proper
directory. This script may be submitted to NQS to run on 16 nodes
using this command:

    qsub -q snl -lP 16 -lT 300 doit

Instead of including the specific directory from which to run your job
in your script, you may use the environment variable QSUB_WORKDIR, which
stores the directory from which the job was submitted to NQS. Using the
following "doit" script,

    #! /bin/sh
    date
    cd "$QSUB_WORKDIR"
    yod hello.world

this job can be run with the commands

    cd cougar-src/hello
    qsub -q snl -lP 16 -lT 300 doit

The job will then run from cougar-src/hello. Once the job has been
submitted to NQS, you may change directories and continue with your work.

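
A script that relies on QSUB_WORKDIR will fail at the cd step if it is
ever run by hand, because the variable is then unset. The sketch below
shows one way to guard against that. The helper name job_dir is
hypothetical, and falling back to the current directory is an
assumption about what is wanted for a hand-run script.

```shell
#!/bin/sh
# job_dir: print the directory the job should run in.
# Uses QSUB_WORKDIR when it is set (as it is under NQS); otherwise falls
# back to the current directory so the script can also be run by hand.
job_dir() {
    printf '%s\n' "${QSUB_WORKDIR:-$PWD}"
}

# Typical use inside a batch script such as "doit":
#   cd "$(job_dir)" || exit 1
#   yod hello.world
```

The "|| exit 1" stops the script before yod is launched if the
directory cannot be entered.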
The system will require you to use the -q, -lP and -lT options to qsub
when submitting your job. The -lP switch will allow you to specify
the number of nodes you need. The -lT switch will set the amount of
time you would like your job to run. The yod command will determine
the correct number of nodes from NQS so that the "-sz" option is superfluous,
unless you wish to run on an odd number of nodes. For various reasons
related to the design of the system, NQS only allocates even numbers
of nodes; if you specify -lP 7, for example, it will actually allocate
8 nodes, and without a -sz flag your yod would in fact run on 8 nodes
instead of the 7 you expected.
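
The even-node rule can be illustrated with two small helpers. They are
a sketch for checking your own requests, not NQS code, and the function
names are hypothetical.

```shell
#!/bin/sh
# nqs_allocated_nodes N
# Number of nodes NQS allocates for "-lP N": N rounded up to an even number.
nqs_allocated_nodes() {
    echo $(( ($1 + 1) / 2 * 2 ))
}

# yod_size_flag N
# Prints "-sz N" when N is odd, which is the case where yod would
# otherwise run on the extra node; prints nothing when N is even.
yod_size_flag() {
    if [ $(( $1 % 2 )) -eq 1 ]; then
        echo "-sz $1"
    fi
}
```

For a 7-node run, nqs_allocated_nodes 7 prints 8 and yod_size_flag 7
prints "-sz 7", so the batch script would use "yod -sz 7 hello.world"
while the job is submitted with "-lP 7".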
The date command is an example of a general OSF command included in the
batch script.
Some additional qsub examples are

    qsub -q snl -lP 576 -lT 2:00:00 doit      run on 576 nodes for 2 hours max
    qsub -q snl.day -lP 22 -lT 300 doit       run on 22 nodes for 300 seconds
    qsub -q snl -lP 1024 -lT 2:00:00 doit     run on 1024 nodes for 2 hours max

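
The examples use two forms of the -lT value: plain seconds (300) and
H:MM:SS (2:00:00). When comparing requests against queue limits it can
help to convert both to seconds. The helper below is a sketch that
handles only these two forms; the name lt_to_seconds is hypothetical,
and any other form qsub may accept is not covered.

```shell
#!/bin/sh
# lt_to_seconds VALUE
# Convert a -lT value given as seconds or as H:MM:SS into seconds.
lt_to_seconds() {
    case $1 in
        *:*:*)
            h=${1%%:*}          # text before the first colon
            rest=${1#*:}        # text after the first colon
            m=${rest%%:*}
            s=${rest#*:}
            # Drop one leading zero so "08" is not parsed as octal;
            # a field that becomes empty ("0" or "00" -> "" or "0") counts as 0.
            h=${h#0}; m=${m#0}; s=${s#0}
            echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
            ;;
        *)
            echo "$1"
            ;;
    esac
}
```

With this, lt_to_seconds 2:00:00 prints 7200 and lt_to_seconds 300
prints 300.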
Jobs that are queued (but not yet executing) are preserved across
system crashes and will run when the system returns to service.
Output from the script appears in two files: one for standard output
and one for standard error. For the example script above, the files
are named "doit.oXX" and "doit.eXX", where XX is the job number
returned by NQS. These files appear in the directory from which the
qsub command was executed.
NQS QUEUES:
NQS is currently configured so that in big mode jobs in any queue will
start once sufficient priority is gained and the requested number of
nodes is available. In small mode, the normal queues (those with no
qualifier after the name) are disabled from 9:45AM to 3:00PM MT, Monday
through Thursday, and the .big queues are disabled at all times. Further,
jobs submitted to the normal queues will not start unless the specified
time limit will guarantee that the job will complete no later than
11:05AM MT, Monday through Thursday. This means that jobs that request
a time limit greater than 20 hours can only start between 3:00PM MT on
Thursday and 3:00PM MT on Sunday.
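
The 11:05AM rule can be worked through with a little shell arithmetic.
The sketch below covers only the normal queues in small mode and only
the time wall itself; it ignores priority, free nodes, and the
9:45AM to 3:00PM disabled window. The function names are hypothetical,
and treating a job that ends exactly at 11:05AM as acceptable is an
assumption. The qwall utility described later in these notes reports
the values actually configured on the system.

```shell
#!/bin/sh
# seconds_to_wall DOW HHMM
#   DOW   day of week, 1 = Monday ... 7 = Sunday (as from `date +%u`)
#   HHMM  Mountain Time as four digits, e.g. 1500
# Prints the seconds from that moment until the next 11:05AM wall,
# where walls occur Monday through Thursday.
seconds_to_wall() {
    dow=$1
    hhmm=$(( 1$2 - 10000 ))                     # avoid octal parsing of 0905
    now=$(( hhmm / 100 * 60 + hhmm % 100 ))     # minutes since midnight
    wall=$(( 11 * 60 + 5 ))                     # 11:05AM in minutes
    days=0
    # Step forward one day at a time until a Monday-Thursday wall lies ahead.
    while :; do
        d=$(( (dow - 1 + days) % 7 + 1 ))
        if [ "$d" -le 4 ] && { [ "$days" -gt 0 ] || [ "$now" -lt "$wall" ]; }; then
            break
        fi
        days=$(( days + 1 ))
    done
    echo $(( (days * 1440 + wall - now) * 60 ))
}

# job_can_start DOW HHMM LIMIT_SECONDS
# Succeeds if a job with that time limit would finish by the next wall.
job_can_start() {
    [ "$3" -le "$(seconds_to_wall "$1" "$2")" ]
}
```

At 3:00PM on a Monday, seconds_to_wall 1 1500 prints 72300 (20 hours
and 5 minutes), which is why a job requesting much more than 20 hours
cannot start then. At 3:00PM on a Thursday the next wall is Monday
morning, so a 24-hour job fits.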
When either "express" or ".big" jobs are queued for long periods of
time, the systems staff may manually start the highest priority
job by preventing smaller jobs from starting.
Here is a summary of the available queues:

                  Maximum                   Maximum Jobs   Initial
Queue Name        Nodes     Maximum Time    per User       Priority
--------------    -------   ------------    ------------   --------
snl.day           1028       2.0 hours      1              10
lanl.day          1028       2.0 hours      1              10
llnl.day          1028       2.0 hours      1              10
doe.day           1028       2.0 hours      1              10
edu.day           1028       2.0 hours      1              10
naq.day           1028       2.0 hours      1              10
snl               1028      24.0 hours      2*             10
lanl              1028      24.0 hours      2*             10
llnl              1028      24.0 hours      2*             10
doe               1028      24.0 hours      2*             10
edu               1028      24.0 hours      2*             10
naq               1028      24.0 hours      2*             10
priority          1028      24.0 hours      2*             20
snl.big**         3204      24.0 hours      1              15
lanl.big**        3204      24.0 hours      1              15
llnl.big**        3204      24.0 hours      1              15
doe.big**         3204      24.0 hours      1              15
edu.big**         3204      24.0 hours      1              15
naq.big**         3204      24.0 hours      1              15
priority.big**    3204      24.0 hours      2*             25
express           3204      unlimited       2*             60
byhand            indicates a manually started job         63

* Users are permitted to have at most two jobs in the queue at once:
two queued, one queued and one running, or two running.
**In order to submit to the *.big queues, you must request at
least 1029 nodes or the submission will be rejected.
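
Before submitting, a request can be checked against the node and time
limits in the table. The function below is a sketch that encodes the
table as written here; the name check_request is hypothetical, time is
taken in seconds, and only the 1029-node minimum for the *.big queues
is documented above as causing a rejection at submission, so what NQS
does with the other violations is not implied.

```shell
#!/bin/sh
# check_request QUEUE NODES SECONDS
# Prints "ok" if the request fits the limits in the queue table,
# otherwise a line starting with "no:" that names the limit.
check_request() {
    queue=$1 nodes=$2 secs=$3
    case $queue in
        *.day)   min=1    max=1028 maxsecs=7200  ;;   # 2.0 hours
        *.big)   min=1029 max=3204 maxsecs=86400 ;;   # 24.0 hours
        express) min=1    max=3204 maxsecs=0     ;;   # 0 means unlimited
        *)       min=1    max=1028 maxsecs=86400 ;;   # normal and priority
    esac
    if [ "$nodes" -lt "$min" ]; then
        echo "no: $queue needs at least $min nodes"
    elif [ "$nodes" -gt "$max" ]; then
        echo "no: $queue allows at most $max nodes"
    elif [ "$maxsecs" -gt 0 ] && [ "$secs" -gt "$maxsecs" ]; then
        echo "no: $queue allows at most $maxsecs seconds"
    else
        echo ok
    fi
}
```

For example, check_request snl.big 1000 7200 reports that snl.big
needs at least 1029 nodes, while check_request snl 576 7200 prints
"ok".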
NQS TIDBITS:
Job priority increases by a value of 0.05 per hour in the queue. When
free nodes exist, NQS will attempt to run the queued job with the
highest priority. If insufficient nodes are free to run the highest
priority job, NQS will attempt to start the next highest priority job,
continuing in this fashion until the highest priority job that can
start is found.
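
The aging and selection rules above can be sketched as follows. This is
an illustration of the description, not the NQS scheduler: the function
names and the input format are hypothetical, priorities are kept in
hundredths because shell arithmetic is integer-only, the preference for
finite time limits is not modeled, and the order among jobs of equal
priority is left unspecified.

```shell
#!/bin/sh
# priority_x100 INITIAL HOURS
# Priority after HOURS in the queue, in hundredths:
# 0.05 per hour is 5 hundredths per hour.
priority_x100() {
    echo $(( $1 * 100 + 5 * $2 ))
}

# pick_next_job FREE_NODES
# Reads lines of "name priority_x100 nodes" on standard input and prints
# the name of the highest-priority job that fits in FREE_NODES.
# Prints nothing if no job fits.
pick_next_job() {
    sort -k2,2nr | awk -v free="$1" '$3 <= free { print $1; exit }'
}
```

A job in a normal queue (initial priority 10) that has waited 48 hours
has priority 12.40, so priority_x100 10 48 prints 1240; it would need
100 hours to reach the initial priority of 15 given to a .big job.
Given the three lines "a 1500 2000", "b 1240 512", and "c 1100 64" with
1000 nodes free, pick_next_job 1000 prints "b", because job "a" has the
highest priority but does not fit.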
Jobs with finite time limits are run before jobs with unlimited time
limits. You will get better turnaround time on your jobs by providing
accurate run time estimates using the -lT parameter to qsub.
Specifying the actual time limit needed rather than the maximum available
may allow the job to start as the system nears a time wall.
A local user utility, /usr/community/bin/qwall, is available to
assist users in obtaining information on the maximum time that an
NQS job may request and still potentially start. When configured, the
NQS time walls factor into whether an NQS request is started (along
with the request's priority and available node resources). The NQS
walls permit user requests to start if the request will finish in the
window of time remaining until the next NQS time wall. The qwall
utility provides the user with information regarding the time remaining
until the next configured NQS time wall.
Jobs should accurately specify the number of nodes needed with the
-lP parameter. Requesting more than you need will prevent the job
from running when it might have been able to, and will unnecessarily
tie up nodes that others might use when the job does start. Users
should seldom request the maximum number of possible nodes as a few
nodes may be down for maintenance at any time.
The -v option to qstat shows the number of nodes associated with each
request. Its output requires a wide screen (132 columns) to display
cleanly.
Thank you for your patience while we converge on an NQS solution that
performs well for our workload. As usual, submit problems and
suggestions to janus-help@sandia.gov.
--------------------------------------------------------------------------
Last Updated: August 19, 2003 by Gerry Quinlan
==========================================================================