==========================================================================
Frequently Asked Questions about Virtual Node mode
--------------------------------------------------------------------------   
  (see also other FAQ files in /usr/local/FAQ on sasn100)
--------------------------------------------------------------------------   
    Overview questions:
    -------------------
    1) What is the minimum I need to know to use virtual node mode?
    2) What is virtual node mode? 
    3) How do I invoke virtual node mode on Janus?
    4) What are Sandia's performance goals for virtual node mode on janus?
       What kind of performance improvements have been observed during
       testing?
    5) In virtual node mode, is the Cougar OS residing on one or both 
       processors?
    6) In virtual node mode, are there one or two copies of my executable
       residing on each physical node?
    7) Will virtual node mode exercise any system hardware that was not 
       previously used?
    8) What are the known problems with virtual node mode?
    9) What impact on janus system stability do we expect from virtual
       node mode?
   10) How will Intel and Sandia respond if virtual node mode adversely
       impacts system stability?
   11) Who do I contact if I have problems with virtual node mode?
   12) Where do I get help?

    Detail questions:
    -----------------------
   13) What languages are supported on virtual node mode?  How and where do
       I compile my code for virtual node mode?
   14) How do I kill my virtual node job?
   15) Can I use debugging tools on a virtual node job on janus?
   16) Is showmesh affected by virtual node mode?
   17) How do I determine the processor numbers of my virtual processes?
   18) What signals are available from virtual node mode?
   19) Can I use the profiler on a virtual node job?
   20) How is software latency affected by virtual node mode?
   21) How is memory allocated for virtual node mode?
   22) What modes are now available to use the second processor?
   23) Are there any problems using common blocks when using the second
       processor for computation?
   24) Are there any tools to help determine performance problems when
       using the second processor for computation?
   25) Can I still use the "COP" interface to use the second processor for 
       computation?
   26) What should I do to avoid the currently known problems with
       the current release of the OS?
   27) Is there any way to see how much memory my application is using
       on the Cougar nodes?
   28) Will the "Good Citizen" rules for Janus be affected by virtual node
       mode?
   29) How can I monitor stack usage during a virtual node run?
   30) What impact does virtual node mode have on submitting NQS jobs?
   31) What about using -share mode or checkpoint/restart in virtual
       node mode?
   32) Can I run a heterogeneous application in virtual node mode?


--------------------------------------------------------------------------

    1)  What is the minimum I need to know to use virtual node mode?

    To use virtual node mode, there are three basics:
    - The two virtual nodes on a physical node share its memory, so
      only half as much (128MB each) is available.
    - Add -p 3 to the yod command.  Any size (or sz) parameter now
      specifies the number of virtual nodes to use.
    - When using NQS, for the -lp parameter, specify 1/2 the number
      of virtual nodes to be used.  NQS allocates physical nodes.
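
    The sizing arithmetic above can be sketched in Python.  This is an
    illustration only, not a supported tool: the function name is
    hypothetical, and rounding an odd virtual node count up to a whole
    physical node is an assumption based on the 2N-1 case in item 17.

```python
def physical_nodes_for(virtual_nodes):
    """Physical nodes to request from NQS for a "yod -p 3" run.

    Two virtual nodes share one physical node.  Rounding an odd count
    up is an assumption, not something stated by the NQS documentation.
    """
    if virtual_nodes < 1:
        raise ValueError("need at least one virtual node")
    return (virtual_nodes + 1) // 2


if __name__ == "__main__":
    for v in (4, 47, 48):
        print(v, "virtual nodes ->", physical_nodes_for(v), "physical nodes")
```

    For example, a run on 48 virtual nodes would request 24 physical
    nodes from NQS under this rule.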

    2) What is virtual node mode?

    Virtual node mode (-proc 3 on the yod command line) enables a user to      
    run applications on both processors of a physical node without any special 
    modifications to the application.                                          
    That is, from the user perspective both processes on the physical node     
    will look identical to a process running -proc 0 or -proc 1, except        
    that each process uses half of the available physical memory.

    3) How do I invoke virtual node mode on Janus?

    "yod -proc 3 ..." or "yod -p 3 ..." will invoke virtual node mode.


    4) What are Sandia's performance goals for virtual node mode on janus?
       What kind of performance improvements have been observed during
       testing?

    Sandia's goal is to achieve at least a 30% increase in throughput 
    on janus by means of virtual node mode.  This conservative figure
    was selected due to the potential impact of memory bus conflicts on
    the processor boards.  Early results of a CTH benchmark performed by
    Ben Cole during a Sunday afternoon eval slot showed an 85% speedup
    in using both nodes on each board.  An Xpatch benchmark by Bob 
    Benner, which was expected to have significant potential for memory
    bus conflicts, had a 100% speedup.

    Please send your own results and comparisons to
    rebenne@cs.sandia.gov and bhcole@sandia.gov.

    5) In virtual node mode, is the Cougar OS residing on one or both 
       processors?

    A single copy of Cougar runs on the node and serves the processes
    running on both processors of the node.

    7) Will virtual node mode exercise any system hardware that was not 
       previously used?

    No!


    8) What are the known problems with virtual node mode?

    There are no known bugs with virtual node mode within the Cougar
    OS.  There are issues with the profiler and debugger, which are 
    discussed below.  There is also a known problem concerning 
    scalability within TOS to handle twice as many processors as before,
    particularly for I/O intensive tasks.


    9) What impact on janus system stability do we expect from virtual
       node mode?

    System stability with the preexisting processor modes, especially -proc
    0 and 1, should be enhanced because of a number of bugs in their 
    implementations that were discovered and fixed in the course of 
    implementing, debugging, and testing virtual node mode.

    The most recent results from Intel's weekend stress tests of janus
    with virtual nodes are encouraging.  
    We observed some problems with I/O scalability due to the increased
    number of nodes.  This problem is being worked on.


   10) How will Intel and Sandia respond if virtual node mode adversely
       impacts system stability?

    In case system stability degrades for any reason, the Intel on-site
    personnel can disallow virtual nodes by creating a file called
    /cougar/proc_3_disabled. A reboot is not required. If that file exists,
    then yod will not permit a virtual node program to run.  An
    error message is printed and the job exits.  NQS scripts can check
    for the existence of this file.  This feature has never been required
    since virtual node mode was installed in early 1999.
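
    A job script could make the check described above before launching
    yod.  The following Python sketch is one way to do it; only the
    flag file path comes from this FAQ, and the function name is
    hypothetical.

```python
import os

# Flag file named in item 10; its presence means -proc 3 is disabled.
DISABLE_FLAG = "/cougar/proc_3_disabled"


def virtual_node_mode_allowed(flag_path=DISABLE_FLAG):
    """Return False if the site has disabled virtual node mode."""
    return not os.path.exists(flag_path)


if __name__ == "__main__":
    if virtual_node_mode_allowed():
        print("virtual node mode is enabled")
    else:
        print("virtual node mode is disabled; fall back to another -proc mode")
```

    The path is a parameter so that the check can be exercised on a
    machine where /cougar does not exist.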

   11) Who do I contact if I have problems with virtual node mode?

    As always, contact janus-help@sandia.gov.  This list will include 
    virtual node OS developers.


   12) Where do I get help?

    For usage problems of the janus computer itself, whether they concern
    virtual node mode or not, please send e-mail to
         janus-help@sandia.gov
    This e-mail address is for assistance with running your janus jobs only.
    Please direct questions regarding dedicated mode requests or observations
    regarding the NQS setup to
         janus-managers@sandia.gov


   13) What languages are supported on virtual node mode?  How and where do
       I compile my code for virtual node mode?

    There is no change in the binary executable files for virtual node mode,
    and hence no change in the compilers.  All current languages are supported
    and no special compiler options are needed.

   14) How do I kill my virtual node job?

    There is no change from the present methods of using kill -2 or kill -9.


   15) Can I use debugging tools on a virtual node job on janus?

    Both "debug" and "xdebug" should work for all processor modes.

   16) Is showmesh affected by virtual node mode?

    No.


   17) How do I determine the processor numbers of my virtual processes?

    If you are using 2N or 2N-1 processes in virtual node mode, then 
    virtual process X+N is on the same physical node as process X, where 
    0 <= X < N.  For example, a simulation with 24 physical nodes and 48
    virtual processes has processes 0 and 24 on physical node 0, processes
    1 and 25 on physical node 1, etc.
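
    The placement rule above can be written as a short Python sketch.
    The function names are hypothetical; the rule itself (process X+N
    shares a physical node with process X) is the one stated in this
    item.

```python
def physical_node_of(rank, num_procs):
    """Physical node index for a virtual process under "yod -proc 3"."""
    if not 0 <= rank < num_procs:
        raise ValueError("rank out of range")
    n = (num_procs + 1) // 2  # N physical nodes for 2N or 2N-1 processes
    return rank % n           # X and X+N both map to node X


def partner_of(rank, num_procs):
    """Other process on the same physical node, or None if there is none."""
    if not 0 <= rank < num_procs:
        raise ValueError("rank out of range")
    n = (num_procs + 1) // 2
    other = rank + n if rank < n else rank - n
    return other if other < num_procs else None


if __name__ == "__main__":
    # The 48-process example from the text: 0 and 24 share physical node 0.
    print(physical_node_of(0, 48), physical_node_of(24, 48))
    print(partner_of(1, 48))
```

    With an odd process count (2N-1), the last physical node holds a
    single process, so partner_of returns None for that process.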


   18) What signals are available from virtual node mode?

    The same as for processor modes 0, 1, 2:

    Signal      Default             Description
    -------     -----------------   -------------------------------------
    SIGFPE      core dump           Floating Point Exception
    SIGKILL     terminate process   Kill
    SIGSEGV     core dump           Segmentation Violation
    SIGALRM     terminate process   Alarm clock
    SIGTERM     terminate process   Software termination signal from kill
    SIGUSR1     terminate process   User defined signal 1


   19) Can I use the profiler on a virtual node job?

    YES.

   20) How is software latency affected by virtual node mode?

    The best case software latency for zero length messages in virtual 
    node (p3) mode is 20 microseconds, compared to 14 microseconds in p1 mode.


   21) How are heap and stack space allocated for virtual node mode?

    Each virtual node has its own protected address space.  The heap,
    stack, communication space, and other memory regions are allocated
    the same as before, except that the total physical memory of the
    node is divided between two processes.

   22) What modes are now available to use the second processor?

    The second processor may be used in one of four modes:

    o  Ignore it (the "heater" mode).  Use the "-proc 0" option with yod.
       This is the default mode.

    o  Use the first processor as a communication co-processor.  Use the
       "-proc 1" option with yod.  This option migrates the user application
       to the second processor. 

    o  Use the second processor to run an additional application thread.
       Use the "-proc 2" option with yod.  Using this mode may require
       additional work, either by linking in special math libraries, or
       tuning your application with OpenMP directives.  To use dual 
       processor math libraries with -proc 2, link with -mp -lcsmath, 
       whereas single processor math libraries require linking with only 
       -lcsmath.  

    o  Use the second processor to run an additional application process.
       Use the "-proc 3" option with yod.  Using this mode makes the
       second processor look identical to the first, from the perspective
       of a user application.  To use the math libraries in this case 
       requires linking just with -lcsmath.



   23) Are there any problems using common blocks when using virtual nodes?
       
    No.
    
    
   24) Are there any tools to help determine performance problems when
       using the second processor for computation?

    Not at this point.  See FAQ item 19 above concerning the status of the
    profiler for virtual node jobs.


   25) Can I still use the "COP" interface to use the second processor for 
       computation?

    Yes.  You can continue to use COP with -proc 2 mode.
    However, the COP interface will not work with either -proc 3 mode or
    OpenMP.  Typically, a job will hang at the first COP call.


   26) What should I do to avoid the currently known problems with
       the current release of the OS?

    Warnings about the current OS are in the file on sasn100:
        /usr/local/FAQ/janus-warn


   27) Is there any way to see how much memory my application is using
       on the Cougar nodes?

    There is an unsupported system call heap_info() which will return
    this information.  At present we have not tested this call in 
    virtual node mode.


   28) Will the "Good Citizen" rules for Janus be affected by virtual node
       mode?

    Not at this time.


   29) How can I monitor stack usage during a virtual node run?

    A utility is available in janus:/usr/community/stackmon that enables 
    you to print out how close you are to the end of your stack space 
    from within an application.  The source for this utility, along with 
    a sample driver program, is provided.  This utility as written will 
    provide correct results in virtual node mode (although it does not 
    work for -proc 2 mode).


   30) What impact does virtual node mode have on submitting NQS jobs?

    The default NQS behavior, if you do not specify a size on the yod 
    line and do not specify a -lP on the qsub line, is to give you all
    available nodes.  For example, on a system with 30 physical nodes
    you would get 30 nodes by default.  If you use "-p 3" on your yod
    line you will get 60 virtual nodes.

    If you do not specify a size or specify "-sz all" on the yod line
    and specify "-lp 2" on the qsub line, you will get 2 nodes.  If you
    use "-p 3" on your yod line you will get 4 virtual nodes in this 
    case (twice the number of physical nodes specified on the qsub line).


   31) What about using -share mode or checkpoint/restart in virtual
       node mode?

    Neither of these features is supported in virtual node mode.


   32) Can I run a heterogeneous application in virtual node mode?

    Heterogeneous applications are those in which different program binaries
    reside on different sets of processors and interact in a parallel
    application.  An example might be an engineering app that runs on 220
    processors and has an associated postprocessing package that runs on a
    separate set of 12 processors, receives data from the engineering app
    in real time, and processes it.

    Yes, you can run heterogeneous applications in virtual node mode, with 
    the restriction that all of the executables specified in your loadfile
    must be running in virtual node mode - you cannot mix -proc modes in 
    the loadfile.  Other restrictions on heterogeneous applications have
    been relaxed significantly beginning in Cougar v. 3.0.  In your 
    loadfile you can now choose to specify -sz on each command line either 
    as an h x w x 4 mesh with offsets from an overall mesh specified on the 
    first line of the loadfile, or you can give numerical values to the 
    overall and individual sizes.

    Some examples:

    yod -proc 3 -F loadfile

    (a) a loadfile with mesh sizes and offsets:

         4x2x4
         yod -sz 1x2x4:0,0 hello1
         yod -sz 3x2x4:0,1 hello2

    (b) a loadfile with numerical sizes:

         6
         yod -sz 2 hello1
         yod -sz 4 hello2

    In the latter case, each program must have a number of nodes that is
    even, except for the last program.  For example, the following loadfile
    is good,

    (c) a good loadfile for an odd number of virtual processes

         17
         yod -sz 12 hello1
         yod -sz 5 hello2

    but the next one will fail,

    (d) a bad loadfile for an odd number of virtual processes

         17
         yod -sz 11 hello1
         yod -sz 6 hello2

    This latter loadfile would require two different executables to be
    loaded onto one of the physical nodes - this is not yet supported.
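
    The even-size rule for numerical loadfiles can be checked before
    submitting a job.  The Python sketch below is a hypothetical helper,
    not part of yod.  It also assumes that the individual sizes must add
    up to the overall size on the first line, as they do in examples
    (b), (c), and (d); the FAQ does not state that requirement.

```python
def check_numeric_loadfile(total, sizes):
    """Return a list of problems found in a numerical-size loadfile.

    total -- overall size from the first line of the loadfile
    sizes -- the -sz value of each program, in loadfile order
    """
    problems = []
    # Assumption: program sizes sum to the overall size.
    if sum(sizes) != total:
        problems.append("sizes sum to %d, expected %d" % (sum(sizes), total))
    # Every program except the last needs an even size; otherwise two
    # executables would have to share one physical node.
    for index, size in enumerate(sizes[:-1], start=1):
        if size % 2 != 0:
            problems.append("program %d has odd size %d" % (index, size))
    return problems


if __name__ == "__main__":
    print(check_numeric_loadfile(17, [12, 5]))   # example (c)
    print(check_numeric_loadfile(17, [11, 6]))   # example (d)
```

    An empty list means the loadfile passes both checks; example (d) is
    reported because its first program has an odd size.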

    Submit questions about heterogeneous programs and loadfiles to
    janus-help@sandia.gov, where Bob Benner and others will respond to
    them.



--------------------------------------------------------------------------
Last updated 12 March 2002 by Gerry Quinlan
Disclaimer added 29 June 2001
--------------------------------------------------------------------------


Acknowledgement and Disclaimer