This document describes a step-by-step installation and test procedure for GridMPI.
Version 1.1 is a minor bug fix release.
The following table lists the recommended options for the configurer. The configurer will find the default compilers in most cases, so specifying compilers is optional. Note that when specifying the --with-binmode=32/64 option, perform the configure; make; make install procedure twice, calling make distclean between the two runs (see the sketch after the table). Also, do not mix the --with-binmode=no and --with-binmode=32/64 options, because a later configuration overrides the previously installed one.
- Linux/IA32, GCC (1):
    ./configure
- Linux/IA32, Intel (1):
    CC=icc CXX=icpc F77=ifort F90=ifort ./configure
- Linux/IA32 (x86_64) (Opteron/EM64T), GCC (1):
    ./configure --with-binmode=32
    ./configure --with-binmode=64
- Linux/IA32 (x86_64) (EM64T), Intel (1)(2):
    CC=icc CXX=icpc F77=ifort F90=ifort ./configure
- Linux/IA32 (x86_64) (Opteron/EM64T), PGI (1)(2)(3):
    CC=pgcc CXX=pgCC F77=pgf77 F90=pgf90 ./configure
- Linux/IA32 (x86_64) (Opteron/EM64T), Pathscale (1):
    CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90 ./configure --with-binmode=32
    CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90 ./configure --with-binmode=64
- Linux/IA64, GCC:
    ./configure
- IBM AIX/Power, IBM XL Compilers:
    ./configure --with-vendormpi=ibmmpi --with-binmode=32
    ./configure --with-vendormpi=ibmmpi --with-binmode=64
- Hitachi SR11K (AIX/Power), IBM XL and Hitachi F90 (4):
    ./configure --with-vendormpi=ibmmpi --with-binmode=32
    ./configure --with-vendormpi=ibmmpi --with-binmode=64
- Fujitsu Solaris8/SPARC64V, Fujitsu:
    CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=32
    CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=64
- Solaris10/SPARC, Sun (SUN Studio11):
    ./configure --with-binmode=32
    ./configure --with-binmode=64
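A minimal sketch of the dual-binmode procedure, assuming an x86_64 machine with GCC (adjust the configure options to your platform row above):

$ ./configure --with-binmode=32
$ make
$ make install
$ make distclean        # required before reconfiguring
$ ./configure --with-binmode=64
$ make
$ make install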
GridMPI is fully tested on RedHat 9 for IA32 machines with GNU GCC. It has been verified to compile and run a simple test on Fedora Core 3 and 4 on IA32, SuSE SLES 8 on x86_64, and SuSE SLES 8 on IA64. It has also been partially tested with the Intel compilers on IA32.
NOTE: GCC MUST BE UPDATED TO 4.0.1 OR LATER ON FEDORA CORE 4.
GridMPI needs the following non-standard commands to compile.
- makedepend
makedepend is in the "xorg-x11-devel" RPM package in RedHat and Fedora Core.
All hosts in the clusters must be reachable via global IP addresses. [FAQ]
NOTE: The SCore/PMv2 support (a fast communication library of the SCore cluster system) is experimental in GridMPI-1.1. It has been only lightly tested in a cluster environment, and it needs further tuning.
Set $MPIROOT to the installation directory, and add $MPIROOT/bin to the PATH.
Commands and libraries are installed in, and searched for in, $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib. Note that the configurer does NOT understand the shell's "~" notation. Add the settings to ".profile", ".cshrc", etc.
# Example assumes /opt/gridmpi as MPIROOT
(For sh/bash)
$ MPIROOT=/opt/gridmpi; export MPIROOT
$ PATH="$MPIROOT/bin:$PATH"; export PATH
(For csh/tcsh)
% setenv MPIROOT /opt/gridmpi
% set path=($MPIROOT/bin $path)
Unpack the source in an appropriate directory.
In the following, files are expanded under the $HOME directory. The source expands into the gridmpi-1.1 directory.
$ cd $HOME
$ tar zxvf gridmpi-1.1.tar.gz
The contents are:
LICENSE: The NAREGI License
README: README file
checkpoint: checkpointing source
configure: configuration script
yampii: YAMPII (PC cluster MPI) source
src: GridMPI source
libpsp: PSPacer (Precise Software Pacer) support code
Simply do the following:
$ cd $HOME/gridmpi-1.1    ...(1)
$ ./configure             ...(2)
$ make                    ...(3)
$ make install            ...(4)
(1) Move to the source directory.
(2) Invoke the configurer. No options are needed for a PC cluster setting.
(3) Make.
(4) Install.
Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.
See the FAQ to use a C compiler other than the default one. [FAQ]
Check the files in the installation directories.
(1) In $MPIROOT/bin,
mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun, impi-server, mpifork, nsd, canquit, detach, gnamesv
(2) In $MPIROOT/include,
mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h
(3) In $MPIROOT/lib,
libmpi.a
(4) Check that the commands are in the path.
$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun
Compile pi.c in the src/test/basic directory.
(1) Compile a test program.
$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc pi.c
(1) Create a configuration file.
Content of mpi_conf:
localhost
localhost
(2) Run an application (a.out) as a cluster MPI.
$ mpirun -np 2 ./a.out
In this case, GridMPI does not use wide-area communication; it is a cluster configuration using YAMPII.
When the cluster environment does not support rsh (remote shell), startup fails, because MPI processes in a cluster are started via rsh by default. Set the environment variable _YAMPI_RSH to use ssh instead.
(For sh/bash)
$ _YAMPI_RSH="ssh -x"; export _YAMPI_RSH
(For csh/tcsh)
% setenv _YAMPI_RSH "ssh -x"
ssh should be set up to work without a password. Refer to the FAQ on using ssh-agent. [FAQ]
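A quick way to verify password-less ssh before running (a hypothetical check; localhost is an example host name):

$ ssh -x localhost true && echo "ssh OK"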
(1) Create configuration files. Here, two localhost entries in mpi_conf1, and two localhost entries in mpi_conf2.
Content of mpi_conf1:
localhost
localhost
Content of mpi_conf2:
localhost
localhost
(2) Run an application (a.out).
$ export IMPI_AUTH_NONE=0                                                ...(1)
$ impi-server -server 2 &                                                ...(2)
$ mpirun -client 0 addr:port -np 2 -c mpi_conf1 ./a.out < /dev/null &    ...(3)
$ mpirun -client 1 addr:port -np 2 -c mpi_conf2 ./a.out                  ...(4)
(1) Set the IMPI_AUTH_NONE environment variable. It selects the authentication method of the impi-server. The value can be anything, because it is ignored.
(2) Start the impi-server. impi-server is a process that makes contact between MPI processes and exchanges information among them. impi-server must be started for each run, because it exits at the end of an execution of an MPI program. The -server argument specifies the number of MPI jobs (invocations of the mpirun command). impi-server prints its IP address/port pair to stdout.
(3,4) Start the MPI jobs with mpirun. The -client argument specifies the MPI job ID and the IP address/port pair of the impi-server. Job IDs run from 0 to the number of jobs minus one and distinguish the mpirun invocations. The -c option specifies the list of nodes. This starts an MPI program with NPROCS=4 (2+2). A scripted version of these steps is sketched below.
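The steps above can be scripted. A minimal sketch, assuming impi-server prints its address/port pair as a single addr:port line on stdout (substitute the actual output if the format differs):

$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 > impi.addr &    # record the printed addr:port
$ sleep 1                                # give the server time to start
$ ADDR=`cat impi.addr`
$ mpirun -client 0 $ADDR -np 2 -c mpi_conf1 ./a.out < /dev/null &
$ mpirun -client 1 $ADDR -np 2 -c mpi_conf2 ./a.out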
GridMPI can use a vendor supplied MPI as an underlying communication layer as a "Vendor MPI". It is necessary to specify options to the configurer to use Vendor MPI. GridMPI supports IBM-MPI (on IBM P-Series and Hitachi SR11000) as a Vendor MPI.
GridMPI needs the following (non-standard) commands to compile.
- gmake (GNU make)
- makedepend
- cc_r and xlc_r
- IBM-MPI library (assumed to be in /usr/lpp/ppe.poe/lib)
GridMPI uses xlc_r to compile the source code. MPI applications can be compiled with cc_r, xlf_r, and Hitachi f90.
When the IBM-MPI library is not installed in the directory /usr/lpp/ppe.poe/lib, it is necessary to specify its location with MP_PREFIX (this is needed both at installation time and at run time). The MP_PREFIX environment variable is defined by IBM-MPI; a sketch of setting it follows.
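A minimal sketch of setting MP_PREFIX, assuming a hypothetical alternative install location:

(For sh/bash)
$ MP_PREFIX=/opt/ibmhpc/ppe.poe; export MP_PREFIX
(For csh/tcsh)
% setenv MP_PREFIX /opt/ibmhpc/ppe.poe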
See Installation on PC Clusters. [jump]
The procedure differs slightly from the PC Clusters case: --with-vendormpi is specified to the configurer, and gmake is used to compile.
$ cd $HOME/gridmpi-1.1                                     ...(1)
$ ./configure --with-vendormpi=ibmmpi --with-binmode=32    ...(2)
$ gmake                                                    ...(3)
$ gmake install                                            ...(4)
$ gmake distclean
$ ./configure --with-vendormpi=ibmmpi --with-binmode=64    ...(2)
$ gmake                                                    ...(3)
$ gmake install                                            ...(4)
(1) Move to the source directory.
(2) Invoke the configurer.
The --with-vendormpi=ibmmpi option specifies the use of Vendor MPI.
The --with-binmode=32/64 option specifies the binary mode. Use --with-binmode=no to use the compiler's default mode (or when the compiler does not support options to control the mode). To use both modes, use --with-binmode=32/64: the configure-make-install procedure must be performed twice, once for 32bit mode and once for 64bit mode, and -q32/-q64 must be passed to mpicc when compiling applications. Do not forget gmake distclean between the two runs of configure.
(3) Make with gmake.
(4) Install.
Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.
Check the configuration output. The first configuration block is from GridMPI; the second is from YAMPII (GridMPI's configurer calls YAMPII's configurer internally). Check that --with-vendormpi is ibmmpi, and that --with-gridmpi is yes in the YAMPII part.
Configuration (output from configure)
Configuration
MPIROOT                     /opt/gridmpi
--enable-debug              no
--enable-threads            no
--with-binmode              32/64
--with-score                no
--with-vendormpi            ibmmpi
--with-libckpt              no
--with-libpsp               no
--enable-dlload             yes

Configuration
MPIROOT                     /opt/gridmpi
--enable-debug              no
--enable-unix               yes
--enable-onesided           no
--enable-threads            no
--with-binmode              32/64
--with-score                no
--with-gridmpi              yes
--with-vendormpi            ibmmpi
--with-libckpt              no
--with-libckpt-includedir   no
--with-libckpt-libdir       no
--enable-dlload             yes
NOTE: gmake distclean is necessary to clean all configuration state when compiling with a different configuration. Also note that it removes all Makefiles.
Check the files in the installation directories.
(1) In $MPIROOT/bin,
mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun, impi-server, mpifork, nsd, canquit, detach, gnamesv
(2) In $MPIROOT/include,
mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h
(3) In $MPIROOT/lib,
libmpi32.a (--with-binmode=32 case)
libmpi64.a (--with-binmode=64 case)
(4) Check that the commands are in the path.
$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun
Compile pi.c in the src/test/basic directory.
(1) Compile a test program.
$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc -q32 -O3 pi.c    (for 32bit binary)
$ mpicc -q64 -O3 pi.c    (for 64bit binary)
In the IBM-MPI environment, IBM POE (Parallel Operating Environment) is used to start MPI processes. By default, POE takes the nodes from a file named host.list. When using POE with LoadLeveler, a batch command file llfile is also needed.
(1) Create a configuration file.
Content of host.list:
node00
node01
Content of llfile:
#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue
(2) Run an application (a.out) as a cluster MPI.
$ mpirun -np 2 ./a.out -llfile llfile
NOTE: In an environment without LoadLeveler, the -llfile llfile specification is not necessary.
mpirun internally calls the poe command to start MPI processes under IBM POE. In doing so, the -c argument is translated into the -hostfile argument of the poe command, as illustrated below.
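As an illustration only (the exact poe invocation is internal to mpirun; -procs is the standard POE process-count option and is an assumption here):

$ mpirun -np 2 -c host.list ./a.out -llfile llfile
    is passed to POE roughly as
$ poe ./a.out -procs 2 -hostfile host.list -llfile llfile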
(1) Create configuration files. Here, two node00 entries in host1.list, and two node01 entries in host2.list.
Content of host1.list:
node00
node00
Content of host2.list:
node01
node01
Content of llfile:
#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue
(2) Run an application (a.out).
$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 &
$ mpirun -client 0 addr:port -np 2 -c host1.list ./a.out -llfile llfile &
$ mpirun -client 1 addr:port -np 2 -c host2.list ./a.out -llfile llfile
NOTE: In an environment without LoadLeveler, the -llfile llfile specification is not necessary.
See Installation on PC Clusters for descriptions. [jump]
GridMPI supports Fujitsu MPI and the Fujitsu compilers on Solaris8 (Fujitsu PrimePower Series).
GridMPI needs the following (non-standard) commands to compile.
- Fujitsu c99/f90
- Fujitsu MPI (Parallelnavi)
- gmake (GNU make)
- makedepend (in /usr/openwin/bin)
The configurer assumes the Fujitsu compilers are installed in directory /opt/FSUNf90, and the Fujitsu MPI in /opt/FJSVmpi2 and /opt/FSUNaprun. GridMPI uses Fujitsu c99 to compile the source code.
# Example assumes /opt/gridmpi as MPIROOT
(For sh/bash)
$ MPIROOT=/opt/gridmpi; export MPIROOT
$ PATH="$MPIROOT/bin:/opt/FSUNf90/bin:/opt/FSUNaprun/bin:/usr/ccs/bin:$PATH"; export PATH
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/FSUNf90/lib:/opt/FJSVmpi2/lib:\
/opt/FSUNaprun/lib"; export LD_LIBRARY_PATH
$ LD_LIBRARY_PATH_64="$LD_LIBRARY_PATH_64:/opt/FSUNf90/lib/sparcv9:\
/opt/FJSVmpi2/lib/sparcv9:/opt/FSUNaprun/lib/sparcv9:\
/usr/ucblib/sparcv9:/usr/lib/sparcv9"; export LD_LIBRARY_PATH_64
(For csh/tcsh)
% setenv MPIROOT /opt/gridmpi
% set path=($MPIROOT/bin /opt/FSUNf90/bin /opt/FSUNaprun/bin /usr/ccs/bin $path)
% setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:/opt/FSUNf90/lib:/opt/FJSVmpi2/lib:\
/opt/FSUNaprun/lib"
% setenv LD_LIBRARY_PATH_64 "${LD_LIBRARY_PATH_64}:/opt/FSUNf90/lib/sparcv9:\
/opt/FJSVmpi2/lib/sparcv9:/opt/FSUNaprun/lib/sparcv9:\
/usr/ucblib/sparcv9:/usr/lib/sparcv9"
See Installation on PC Clusters. [jump]
The procedure differs slightly from the PC Clusters case: --with-vendormpi is specified to the configurer, and gmake is used to compile.
$ cd $HOME/gridmpi-1.1                                                                   ...(1)
$ CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=32    ...(2)
$ gmake                                                                                  ...(3)
$ gmake install                                                                          ...(4)
$ gmake distclean
$ CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=64    ...(2)
$ gmake                                                                                  ...(3)
$ gmake install                                                                          ...(4)
(1) Move to the source directory.
(2) Invoke the configurer.
The --with-vendormpi=fjmpi option specifies the use of Vendor MPI.
The --with-binmode=no/32/64 option specifies the binary mode. Use --with-binmode=no to use the compiler's default mode (or when the compiler does not support options to control the mode). To use both modes, use --with-binmode=32/64: the configure-make-install procedure must be performed twice, once for 32bit mode and once for 64bit mode, and -q32/-q64 must be passed to mpicc when compiling applications. Do not forget gmake distclean between the two runs of configure.
(3) Make with gmake.
(4) Install.
Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.
Check the configuration output. The first configuration block is from GridMPI; the second is from YAMPII (GridMPI's configurer calls YAMPII's configurer internally).
Check that --with-vendormpi is fjmpi, and --with-gridmpi is yes in the YAMPII part.
Configuration (output from configure)
Configuration
MPIROOT                     /opt/gridmpi
--enable-debug              no
--enable-threads            no
--with-binmode              32/64
--with-score                no
--with-vendormpi            fjmpi
--with-libckpt              no
--with-libpsp               no
--enable-dlload             yes

Configuration
MPIROOT                     /opt/gridmpi
--enable-debug              no
--enable-unix               yes
--enable-onesided           no
--enable-threads            no
--with-binmode              32/64
--with-score                no
--with-gridmpi              yes
--with-vendormpi            fjmpi
--with-libckpt              no
--with-libckpt-includedir   no
--with-libckpt-libdir       no
--enable-dlload             yes
NOTE: gmake distclean is necessary to clean all configuration state when compiling with a different configuration. Also note that it removes all Makefiles.
Check the files in the installation directories.
(1) In $MPIROOT/bin,
mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun, impi-server, mpifork, nsd, canquit, detach, gnamesv
(2) In $MPIROOT/include,
mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h
(3) In $MPIROOT/lib,
libmpi32.so, libmpi_frt32.a, libmpi_gmpi32.so (--with-binmode=32 case)
libmpi64.so, libmpi_frt64.a, libmpi_gmpi64.so (--with-binmode=64 case)
(4) Check that the commands are in the path.
$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun
Compile pi.c in the src/test/basic directory.
(1) Compile a test program.
$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc -q32 -Kfast pi.c         (for 32bit binary)
$ mpicc -q64 -Kfast -KV9 pi.c    (for 64bit binary)
In the Fujitsu MPI environment, the MPI runtime /opt/FJSVmpi2/bin/mpiexec is used to start MPI processes. Options to mpirun are translated and passed to the Fujitsu MPI mpiexec: -np becomes -n, and -c becomes -nl. The contents of the host configuration file specified by -c are translated into the nodelist format of -nl; the configuration file should be a list of host names (one host per line; comments are not allowed). An illustrative mapping is sketched below.
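As an illustration only (the derived nodelist depends on the cluster's node numbering, so it is left elided):

$ mpirun -np 2 -c hosts.list ./a.out
    is passed to Fujitsu MPI roughly as
$ /opt/FJSVmpi2/bin/mpiexec -n 2 -nl <nodelist derived from hosts.list> ./a.out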
(1) Run an application (a.out) as a cluster MPI.
A configuration file is not needed; Fujitsu MPI automatically configures the available nodes.
$ mpirun -np 2 ./a.out
mpirun accepts the node-list option -nl when configured with Fujitsu MPI. For example, if only node zero should be used in a run, pass a list of zeros to the -nl option (one more zero than the number of MPI processes is needed, because one is assigned to the daemon process).
$ mpirun -np 4 -nl 0,0,0,0,0
mpirun also accepts the -c option, which is intended for use with the NODELIST of PBS Pro, as sketched below.
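A minimal sketch under PBS Pro, assuming the node list is available through the standard $PBS_NODEFILE variable (an assumption; the variable provided by your site may differ):

$ mpirun -np 4 -c $PBS_NODEFILE ./a.out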
(1) Run an application (a.out).
$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 &
$ mpirun -client 0 addr:port -np 2 ./a.out &
$ mpirun -client 1 addr:port -np 2 ./a.out
GridMPI is not fully tested on Solaris, but it should work. It can be installed and used in the same way as on PC clusters, except:
GridMPI is not fully tested on FreeBSD, but it should work. It can be installed and used in the same way as on PC clusters.
When running a single-cluster MPI, processes are started by mpirun (without the -client option). In this case GridMPI is just a cluster MPI, YAMPII, and the YAMPII protocol is used for communication.
When running a multiple-cluster MPI, processes are started by mpirun -client n addr:port for each MPI job. The multiply invoked MPI jobs join by connecting to an impi-server process.
The impi-server is a process that exchanges information about the processes (e.g., their IP address/port pairs) among the multiple MPI invocations. After exchanging the information it does nothing until it joins the processes in MPI_Finalize.
In a multiple-cluster MPI, the YAMPII protocol is used for intra-cluster communication, and the IMPI (Interoperable MPI) protocol is used for inter-cluster communication. The multiply started MPI processes receive their ranks according to the client number: the lowest ranks are assigned to the processes started with mpirun -client 0, the next lowest to the processes started with mpirun -client 1, and so on.
                    IMPI Protocol
      +---------+===================+---------+
      |         |                   |         |
+-----|---------|-----+       +-----|---------|-----+   +--------+
| +-------+ +-------+ |       | +-------+ +-------+ |   | impi-  |
| | rank0 | | rank1 | |       | | rank2 | | rank3 | |   | server |
| +-------+ +-------+ |       | +-------+ +-------+ |   +--------+
|     |         |     |       |     |         |     |
|     +=========+     |       |     +=========+     |
|   YAMPII Protocol   |       |   YAMPII Protocol   |
+---------------------+       +---------------------+
   mpirun -client 0              mpirun -client 1
gridmpirun is a simple front-end script that invokes the impi-server and a number of mpirun commands via rsh/ssh. It reads its own configuration file and starts processes as specified.
impi_conf specifies how the MPI processes are started on each cluster.
(1) Create configuration files. Here, two node00 entries in host1.list, and two node01 entries in host2.list.
Content of host1.list:
node00
node00
Content of host2.list:
node01
node01
Content of impi_conf:
-np 2 -c host1.list
-np 2 -c host2.list
Content of llfile:
#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue
(2) Run an application (a.out).
$ gridmpirun -np 4 -machinefile impi_conf ./a.out -llfile llfile
-np specifies the total number of MPI processes. gridmpirun breaks this total up into the per-cluster process counts listed in the machinefile, as in the example below.
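For example, with hypothetical host files following the format above, an uneven 4+2 split would be specified as:

Content of impi_conf:
-np 4 -c host1.list
-np 2 -c host2.list

$ gridmpirun -np 6 -machinefile impi_conf ./a.out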