Environment Variables

GridMPI/YAMPI has many configurable settings, all of which are set through environment variables. Some variables are set internally by the mpirun command. A few variables are mandatory, notably MPIROOT and either IMPI_AUTH_NONE or IMPI_AUTH_KEY.

($Date: 2009/03/29 15:06:03 $)


List of Environment Variables

Debugging and Information
General Behavior
IMPI Behavior
P2P-layer Control
Special Features
Compiler Drivers
Internally Used
MPIRUN Control

Required and Frequently Needed

Specifies a GridMPI/YAMPI installation directory. Commands such as mpicc and mpirun search under the directory for necessary include, library, and binary files.
Specifies not to use authentication in the IMPI server. The value is ignored. The IMPI specification defines two weak methods of authentication: IMPI_AUTH_NONE and IMPI_AUTH_KEY. One of these is required, and the IMPI server and mpirun must be run with matching settings.
Specifies a remote shell command. Typical choices are "rsh" and "ssh". It may include very simple options to the rsh or ssh command. SSH is the default.
Specifies the name of a host configuration file for YAMPI. The default is "mpi_conf" in the current directory. In its simplest form, a host file contains one host name per line.
Specifies the buffer size of sockets (SNDBUF/RCVBUF) in bytes, as a decimal number. setsockopt is called with the specified value. (1) Leaving the variable unset uses the default of 64KB. (2) Specifying a positive number sets that value. (3) An empty string or "0" disables the call to setsockopt, so the OS default is used.
Specifies a buffer size of sockets (SNDBUF/RCVBUF) used in IMPI connections, similar to _YAMPI_SOCBUF. If IMPI_SOCBUF is not specified, _YAMPI_SOCBUF is also used for IMPI connections.
Specifies a port range to listen to for intra-cluster connection. Port numbers from start to end (inclusive) are tried.
Specifies a port range to listen to for IMPI connections. Port numbers from start to end (inclusive) are tried.
Specifies a port range to listen to for the IMPI server. Port numbers from start to end (inclusive) are tried.
Specifies the network interface by its network address. It is used when nodes have multiple network interfaces and the preferred connection is on an interface other than the primary one. It is specified in network address format, like "", and is matched after masking by the netmask.
Specifies the version of IP. Default is 4.
Specifies a remote shell command to execute the IMPI Relay. Typical choices are "rsh" and "ssh". It may include very simple options to the rsh or ssh command. To simply run the IMPI Relay on localhost instead of using a remote shell, set this variable to '#'. SSH is the default.
Specifies an option of collective operations in GridMPI/YAMPI. YAMPI may define some algorithms for collective operations, and this value chooses one. (It is very implementation specific. The meaning of the value is not described here. See the source code.)
Specifies the switching point at which MPI_Send uses the rendezvous protocol. If the message size (after packing) is equal to or larger than this value, MPI_Send switches to MPI_Rsend. Setting this to zero disables switching, and MPI_Send always uses the eager protocol. Zero is the default (this may change in the future).
Specifies to enable threads regardless of whether MPI_Init_thread is used. It allows MPI-1 programs to run with threads enabled. In particular, the current implementation of GridMPI/YAMPI uses threads internally to handle requests of one-sided communication and MPI-IO, which requires setting this variable.
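
The three cases described above for _YAMPI_SOCBUF, and the fallback from IMPI_SOCBUF to _YAMPI_SOCBUF, can be sketched as follows. This is only an illustration of the documented rules, not the GridMPI/YAMPI implementation: the function names are hypothetical, rejecting negative numbers is an assumption, and treating an empty or "0" IMPI_SOCBUF the same way as _YAMPI_SOCBUF is also an assumption (the text only says the two are "similar").

```python
DEFAULT_SOCBUF = 64 * 1024  # documented default of 64KB


def resolve_socbuf(value):
    """Return the size to pass to setsockopt, or None to skip the call.

    value is the raw environment string, or None when the variable is unset.
    """
    if value is None:
        return DEFAULT_SOCBUF  # (1) unset: use the 64KB default
    if value == "" or value == "0":
        return None            # (3) empty or "0": keep the OS default
    size = int(value, 10)      # (2) decimal number of bytes
    if size <= 0:
        # Assumption: the documentation only covers positive numbers.
        raise ValueError("buffer size must be a positive decimal number")
    return size


def resolve_impi_socbuf(env):
    """Use IMPI_SOCBUF for IMPI connections, else fall back to _YAMPI_SOCBUF."""
    if "IMPI_SOCBUF" in env:
        return resolve_socbuf(env["IMPI_SOCBUF"])
    return resolve_socbuf(env.get("_YAMPI_SOCBUF"))
```

For a real process environment, one would pass os.environ.get("_YAMPI_SOCBUF") to resolve_socbuf, or os.environ itself to resolve_impi_socbuf.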

Debugging and Information

Specifies to print some information verbosely at startup and runtime. It prints information normally at startup. It accepts a colon-separated list of keywords:
ALL     All of below
LL      Low level; combined with others to make more verbose
TCP     TCP p2p-layer
MEM     MEM p2p-layer
PM      PM (SCore) p2p-layer
MX      Myrinet MX p2p-layer
IB      OpenIB InfiniBand p2p-layer
IMPI    IMPI p2p-layer
BLDZ    Bulldozing for checkpointing
RMA     One-sided (Remote Memory Access)
DPM     Spawning (Dynamic Process Management)
SOCK    Socket operations for TCP and IMPI
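
As an illustration of how a colon-separated keyword list such as TCP:IMPI might be interpreted, here is a minimal sketch. It is not the GridMPI/YAMPI parser: the function name is hypothetical, and the handling of case, whitespace, and unknown keywords is an assumption. ALL is expanded to every other keyword in the list above, following "All of below".

```python
# Keywords taken from the list above; ALL is handled separately.
KEYWORDS = ("LL", "TCP", "MEM", "PM", "MX", "IB", "IMPI",
            "BLDZ", "RMA", "DPM", "SOCK")


def parse_verbose(value):
    """Return the set of keywords enabled by a colon-separated list."""
    enabled = set()
    for word in value.split(":"):
        word = word.strip()
        if not word:
            continue  # tolerate empty items such as "TCP::IMPI"
        if word == "ALL":
            enabled.update(KEYWORDS)
        elif word in KEYWORDS:
            enabled.add(word)
        else:
            # Assumption: unknown keywords are rejected, not ignored.
            raise ValueError("unknown keyword: " + word)
    return enabled
```

For example, parse_verbose("LL:SOCK") enables low-level output together with the socket operations keyword.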

_YAMPI_DEBUG=number (mpirun sets this)
Specifies to print trace information. It is mainly for implementors but may occasionally help ordinary users. It is a bit mask, so specify 255 for the highest verbosity.
Specifies to dump cores at MPI_Abort. The default is 1. Set this to zero when dumping cores on abort is undesirable, for example because MPI_Abort is called as part of normal operation. Note that behavior at faulting conditions (such as SIGSEGV) is controlled by _YAMPI_BACKTRACE.

Specifies to print a backtrace at errors (at signals of synchronous faults). The value controls the way of printing. 0: Does nothing; 1: Prints a backtrace (does not overwrite previous signal settings at MPI_Init); 2: Prints a backtrace (overwrites previous signal settings at MPI_Init); 3: Prints a backtrace using GDB (overwrites previous signal settings at MPI_Init); and 4: Resets signal handlers to default (overwrites previous signal settings at MPI_Init).

It uses BACKTRACE(3) in GNU glibc, PRINTSTACK(3C) in Solaris, or GDB if so specified. It may affect the signal settings of SIGABRT, SIGILL, SIGBUS, SIGFPE, SIGSEGV, SIGPIPE, and SIGSTKFLT.

Specifies a file name which contains the command list to GDB, when _YAMPI_BACKTRACE=3.
Specifies the behavior when a socket is closed. 0: Does not abort; 1: Aborts silently (default); and 2: Aborts and signals an error. It is effective for the TCP/IP transport (including the IMPI protocol).

Specifies to print warnings. The default is 1. It can be used to suppress verbose warnings.

Specifies to dump a full core on AIX. AIX dumps only the stack by default, and does not include the data. Setting this to 1 makes a core include all data.

Specifies to print trace information of one-sided communication, such as locking, epoch intervals, and datatype operations.

Specifies to print trace information of MPI-IO. The specification requires errors to be non-fatal, and the information obtained through error numbers is very minimal. Setting this makes it easier to diagnose erroneous operations.

Specifies to trigger SIGSEGV at abortion, to properly dump cores in broken machines. (A Surigiken Inc. special)

General Behavior

Specifies to use the standard packing for long double. The default is NOT standard conforming: It packs long double to sizeof(long double) instead of standard-specified 16 bytes, even in the external32 format. This breaks data exchange in a heterogeneous environment. For conforming packing, set this variable to any value. The default is chosen to make often-occurring badly-coded programs work in a homogeneous environment.
Specifies to disable flushing of stdout/stderr. This controls setlinebuf(3C) on stdout/stderr. The default is line-by-line flushing. When _YAMPI_NOFLUSH is set, flushing is disabled, which makes dumping a large amount of data via stdout faster.
Specifies to disable using UNIX domain sockets. _YAMPI_DISABLE_UNIX is a synonym.

IMPI Behavior

GridMPI has control of the parameters defined in the IMPI specification through environment variables. The precise meaning of the parameters can be found in the IMPI specification.

The values are unsigned 32-bit values (except IMPI_AUTH_KEY), but specifying values that are too large will exceed resource limits. The implementation does not enforce a limit on the values.

Specifies NOT to use any authentication in connecting to the IMPI server. The value is ignored. (See the IMPI specification).
Specifies to use the simple KEY authentication protocol in connecting to the IMPI server. The key value is given as a 64-bit decimal integer. It must match between the IMPI server invocation and the MPI process invocation. (See the IMPI specification).
IMPI_C_DATALEN=value (default 65536 bytes)
Specifies the maximum data bytes (payload size) in a packet. The smallest value from the clients is negotiated. (See the IMPI specification).
IMPI_COLL_XSIZE=value (default 1024 bytes)
Specifies a cross-over point of collective algorithms by message size. This specifies the least byte size to use long algorithms. All clients must specify the same value, or an error is signalled. (See the IMPI specification).
IMPI_COLL_MAXLINEAR=value (default 4)
Specifies a cross-over point of collective algorithms by the number of hosts (in GridMPI-0.2, the number of hosts equals the total number of procs). This specifies the least number of hosts to use "non-linear (hypercube)" algorithms. All clients must specify the same value, or an error is signalled. (See the IMPI specification).
IMPI_H_ACKMARK=value (default 10 packets)
Specifies a flow control parameter. An ack is returned every ack-mark packets. The smallest value from the clients is negotiated. (See the IMPI specification with errata).
IMPI_H_HIWATER=value (default 20 packets)
Specifies a flow control parameter. It is an upper limit of received packets without sending an ack. The smallest value from the clients is negotiated. (See the IMPI specification with errata).

The following constraint is maintained: (1 ≤ IMPI_H_ACKMARK ≤ IMPI_H_HIWATER).
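
The negotiation rules described in this section can be sketched as follows. This is an illustration under stated assumptions, not the GridMPI implementation: the function names are hypothetical, each client is modelled as a plain dictionary of environment strings, and raising an error when the ACKMARK/HIWATER constraint is violated is an assumption (the text says only that the constraint is maintained).

```python
# Defaults as documented above.
DEFAULTS = {
    "IMPI_C_DATALEN": 65536,
    "IMPI_COLL_XSIZE": 1024,
    "IMPI_COLL_MAXLINEAR": 4,
    "IMPI_H_ACKMARK": 10,
    "IMPI_H_HIWATER": 20,
}
# The smallest value from the clients is negotiated for these.
NEGOTIATED_MIN = ("IMPI_C_DATALEN", "IMPI_H_ACKMARK", "IMPI_H_HIWATER")
# All clients must specify the same value for these.
MUST_MATCH = ("IMPI_COLL_XSIZE", "IMPI_COLL_MAXLINEAR")


def client_params(env):
    """Read one client's parameters, applying the documented defaults."""
    params = {}
    for name, default in DEFAULTS.items():
        value = int(env.get(name, default))
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(name + " is not an unsigned 32-bit value")
        params[name] = value
    return params


def negotiate(clients):
    """Combine the parameters of all clients into the values in effect."""
    result = {}
    for name in NEGOTIATED_MIN:
        result[name] = min(c[name] for c in clients)
    for name in MUST_MATCH:
        values = {c[name] for c in clients}
        if len(values) != 1:
            raise ValueError(name + " differs between clients")
        result[name] = values.pop()
    # Assumption: a violated constraint is reported as an error.
    if not 1 <= result["IMPI_H_ACKMARK"] <= result["IMPI_H_HIWATER"]:
        raise ValueError("need 1 <= IMPI_H_ACKMARK <= IMPI_H_HIWATER")
    return result
```

With one client using the defaults and another setting IMPI_H_ACKMARK=5, the negotiated ack-mark is 5 while IMPI_H_HIWATER stays at 20.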

Unspecifiable. The value of TAGUB of YAMPI (0x7fffffff) is used.

Compiler Drivers

These override the default set at configuration time.

MP_PREFIX=directory path

Specifies a C compiler for mpicc.
Specifies a C++ compiler for mpic++.
Specifies a F77 compiler for mpif77.
Specifies a F90 compiler for mpif90.










Special Features

Specifies tuples of an interface device such as eth0, a class-id such as 1:1, and the maximum transmission rate. interface and classid are mandatory; rate is optional. rate should be set to the bottleneck bandwidth between clusters. If it is not specified, it is set to the physical bandwidth of the interface. See the tc(8) man page about class-id. (Supported on Linux only.)

Platform Specifics


Setting this to 1 makes a send request block until it is complete. It is on by default. It is effective for the MX transport.
Setting this to 1 sets the MX error handler to MX_ERRORS_RETURN, so that errors are handled in YAMPI. The default error handler, MX_ERRORS_ARE_FATAL, prints some messages and terminates the MPI process. It is 0 by default. It is effective for the MX transport.

Internally Used

Environment variables are used to pass information from commands like mpirun to MPI processes. These are listed for informational purposes and are not intended for direct use. These variables may change or be removed in future releases.

_YAMPI_NPROC=nprocs (mpirun sets this)

_YAMPI_MYRANK=rank (mpirun sets this)

_YAMPI_PHOST=host (mpirun sets this)

_YAMPI_PPORT=port (mpirun sets this)

_YAMPI_PHASH=number (mpirun sets this)

_YAMPI_ARCH=hostname (mpirun sets this)
Specifies a transport. The default is 0 (YAMPI/TCP).
_YAMPI_PROGNAME=name (mpirun sets this)

IMPI_SERVER_PORT=port (mpirun sets this)

IMPI_CLIENT_ID=id (mpirun sets this)


CKPT_DIR=dir (gridmpirun sets this)
Checkpoint files are stored in the directory $CKPT_DIR. (= ".")
CKPT_FILE=name (gridmpirun and mpirun set this)

CKPT_RESTART=0/1 (gridmpirun sets this)

MPIRUN Control

There are many environment variables to control mpirun behavior. This section lists only a small portion of them. Run mpifork -show-optons for all options.

Specifies a command to call from the mpirun script. mpirun uses $MPIROOT/bin/mpifork by default. Otherwise, it uses one in the path or one in the same directory as mpirun.
Specifies to prefix mpifork with the (local) value of $MPIROOT/bin. The default is 0, meaning not to prefix. It is useful when the home directory is shared (e.g., by NFS), but setting the path in the rc-file fails.