NAME
yod - Allocate a SUNMOS mesh partition, load and run a SUNMOS user program.
SYNOPSIS
yod [-D level] [-comm size] [-heap size] [-stack size] [-base node] [-proc mode] [-allocation mode] [-help] [-load] [-retry] [[-size n] [-sz n] [-exitonfault] filename [command line arguments] | -F loadfile]
DESCRIPTION
Yod allocates the mesh, loads and executes the program, and provides file I/O, including stdin, stdout, and stderr. Rudimentary job control is also provided (Ctrl-C). DO NOT issue a kill -9 to yod, because this may hang the Paragon.
An alternate numbering scheme labels the rows, starting at the top, from a to p, and the columns, starting at the left, from 0 to n (n depends on the machine size).
Valid arguments to -base include: 0, a0, p5, 422, etc.
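The row/column form of a -base argument can be decoded as in the following minimal sketch. The function name parse_row_col and the zero-based indices it returns are assumptions made for illustration; plain node numbers such as 0 or 422 are not handled here, since their mapping depends on the machine size.

```python
def parse_row_col(label):
    """Split a label such as 'a0' or 'p5' into zero-based (row, column).

    Hypothetical helper: rows a..p map to indices 0..15 (top to bottom),
    and the digits give the column counted from the left.
    """
    if len(label) < 2:
        raise ValueError(f"not a row/column label: {label!r}")
    row_letter, col_digits = label[0], label[1:]
    if row_letter not in "abcdefghijklmnop" or not col_digits.isdigit():
        raise ValueError(f"not a row/column label: {label!r}")
    return ord(row_letter) - ord("a"), int(col_digits)
```

Under these assumptions, a0 denotes the top left node and p5 denotes column 5 of the last labeled row.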
The allocator always starts in strict mode, then tries lax, and finally any. The -allocation mode option prevents the allocator from trying lower levels than mode. If no partition of the required size (and shape, if size was specified with a width and height) is available, yod aborts the load.
The -allocation option also accepts the argument rnd or random which causes random nodes to be allocated. This option is meant for debug purposes and should not be used during prime time, since it may prevent allocation of rectangular areas for other jobs.
MESH ALLOCATION AND NODE MAPPING
The mesh allocator is controlled with the -size, -base, and -allocation flags. If the -size option is used with a height and width, then the allocator restricts itself to the given aspect ratio. In all other cases the allocator tries to find a contiguous region in the mesh that is "as square as possible".
If a base is specified, only partitions with their top left corner at that base are tried.
The -allocation flag determines the restrictions imposed on the allocator. Strict allocation means that the partition has to be contiguous, i.e. not spanning nodes with other applications running on them, or nodes not running SUNMOS. If allocation is lax, then the allocator may skip rows and/or columns of nodes running other applications or OSF. The weakest allocation mode, any, allows the allocator to use any collection of nodes, assuming there are enough free nodes available.
In all cases the allocator starts out in strict mode and tries to satisfy the request. If this fails and lax mode is allowed, then the allocator continues its search for a suitable partition in that mode. If this also fails, then the any mode is used as a last resort. The weakest mode used by the allocator can be specified with the -allocation option. The default is any, unless a height and width have been given to the -size flag, in which case the default is lax.
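The fallback order described above can be sketched as follows. This is an illustration of the documented behavior only, not yod's actual implementation; the names MODES, default_weakest_mode, and modes_to_try are hypothetical.

```python
# Allocation modes, ordered from most to least restrictive.
MODES = ["strict", "lax", "any"]

def default_weakest_mode(size_has_height_and_width):
    """Default weakest mode: lax if -size gave a height and width, else any."""
    return "lax" if size_has_height_and_width else "any"

def modes_to_try(weakest):
    """Return the modes the allocator attempts, in order, down to `weakest`."""
    return MODES[: MODES.index(weakest) + 1]
```

For example, with a weakest mode of lax the allocator would try strict and then lax, and abort the load if neither yields a partition.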
The base node (specified by -base option, or chosen by the allocator), receives the logical node ID 0. Node numbering continues to the right, then to the next row, until the bottom right node of the partition is assigned the logical ID n - 1, where n is the number of nodes in the partition.
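For a contiguous rectangular partition, the numbering rule above amounts to row-major order relative to the base node. The sketch below assumes zero-based physical (row, column) coordinates and a strict-mode partition without skipped rows or columns; the function name logical_id is hypothetical.

```python
def logical_id(row, col, base_row, base_col, width):
    """Logical node ID of the node at (row, col) in a rectangular partition.

    Assumes the partition's top left corner is (base_row, base_col) and
    that it is `width` nodes wide, with no skipped rows or columns.
    """
    return (row - base_row) * width + (col - base_col)
```

In a partition 4 nodes wide and 2 nodes high, the base node gets ID 0 and the bottom right node gets ID 7, which is n - 1 for n = 8.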
For heterogeneous loads, the global mesh is allocated as described above, according to the global mesh size specification indicated on the first line of the loadfile. The submeshes specified in the loadfile are then logically allocated within the global mesh. Application 1 gets logical nodes 0 through h1*w1 - 1, application 2 gets nodes h1*w1 through h1*w1 + h2*w2 - 1, and so on. The width, height, and base offset of each submesh relative to the global mesh are determined according to the submesh size string.
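The logical node ranges of a heterogeneous load follow from a running sum of the submesh sizes. The sketch below takes a list of (height, width) pairs, one per application; the function name submesh_ranges and the input format are assumptions made for illustration and do not reflect the loadfile syntax.

```python
def submesh_ranges(shapes):
    """Return (first, last) logical node IDs for each application.

    `shapes` is a list of (height, width) pairs in load order.
    """
    ranges = []
    start = 0
    for height, width in shapes:
        count = height * width
        ranges.append((start, start + count - 1))
        start += count
    return ranges
```

With a 2 by 3 submesh followed by a 1 by 4 submesh, application 1 would get logical nodes 0 through 5 and application 2 nodes 6 through 9.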
PROGRAM LOAD AND EXECUTION
Yod loads the COFF image of the user program into memory, calculates the sizes of the text, data, and bss segments, and determines the start address of the executable. The command line arguments following the filename are collected, as well as the current environment variables.
This information is then sent to the base node of the allocated partition. The SUNMOS kernel on the base node partitions the physical memory into the requested sections and fans the information out to the other nodes assigned to this application. If SUNMOS determines that there is not enough memory to run the user program, an error message is sent to yod. The load is then aborted and yod terminates.
After a successful initialization, the program text and data are loaded onto all nodes in the application and the bss segment is initialized. Then yod sends the start signal and the application starts running. The first thing it does is request the command line arguments (argv) and the environment (env) from yod. The first three file descriptors are opened (stdin, stdout, and stderr), and the user's main() is called.
In a heterogeneous load, a program's parameters, text, data, arguments, and environment are sent to the base node of its logical submesh. The fanout takes place in the submesh only.
NQS AND MACS INTERFACE
If NQS is set up properly to manage SUNMOS partitions, it is possible to queue and execute SUNMOS jobs under NQS. In that case, NQS will set the environment variable NX_DFLT_PART to the partition it has created. A script containing the yod command and its arguments will be executed by NQS. Yod will perform node allocation within the partition assigned by NQS, and run the SUNMOS job.
If the NX_DFLT_PART environment variable is not set, yod behaves as if NQS and MACS were not present (that is, the same as it always has). This mode will probably be removed in later releases and replaced by the following mode as the default.
If the environment variable NX_DFLT_PART is set to .sunmos.interactive and the partition exists, yod will allocate nodes from that partition and allow MACS to keep track of the number of nodes allocated and the start and end times. Note: SUNMOS nodes not in the .sunmos.interactive partition are considered to be in use by NQS. Therefore, to run interactive SUNMOS jobs you have to create the .sunmos.interactive partition or unset NX_DFLT_PART.
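The three cases above can be summarized as a small decision function. This is a sketch of the documented behavior, not yod's code: the function name partition_mode and the returned labels are hypothetical, and the case where .sunmos.interactive is named but does not exist is not covered because the text does not specify it.

```python
def partition_mode(env):
    """Classify how yod would treat NX_DFLT_PART, given an environment dict."""
    part = env.get("NX_DFLT_PART")
    if part is None:
        # Variable not set: behave as if NQS and MACS were not present.
        return "legacy"
    if part == ".sunmos.interactive":
        # Interactive partition, with MACS accounting.
        return "interactive"
    # Any other value: a partition created by NQS for a queued job.
    return "nqs"
```

Passing os.environ as the argument would classify the current shell's setting.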
SERVICES
While the user program is running, yod performs file I/O operations on behalf of the application. Any I/O operation of the application is translated by the C library into a message to yod to execute the requested command on the service node. Therefore, all file systems (local and remote) accessible on the service node are available to all nodes of the application.
If a kill signal (Ctrl-C) is sent to yod, then the application is terminated and the partition freed, before yod exits. After all nodes in the application exit, yod will also exit.
AUTHORS
Rolf Riesen, Sandia National Laboratories. Stephen Wheat, Sandia National Laboratories. Gabi Istrail, Sandia National Laboratories. (Heterogeneous load capability.)
BUGS
If yod terminates abnormally, or is killed with a signal it cannot catch (e.g. kill -9), then the mesh partition remains allocated.