Installation Procedure

This document describes a step-by-step installation and test procedure for GridMPI.

Contents

NOTE: Changes from Version 1.0 to Version 1.1

Version 1.1 is a minor bug fix release.

NOTE: Configuration Templates

The following table lists the recommended options for the configurer. In most cases the configurer finds the default compilers by itself, so specifying compilers is optional. Note that when building with the --with-binmode=32/64 options, run the configure; make; make install procedure twice, calling make distclean between the two runs. Also, do not mix --with-binmode=no with --with-binmode=32/64, because a later configuration overrides the previously installed one. A sketch of the two-pass procedure follows the table.

Each entry below lists the platform and compiler (with applicable notes), followed by the configuration command(s):

Linux/IA32, GCC (Notes: (1))
	./configure

Linux/IA32, Intel (Notes: (1))
	CC=icc CXX=icpc F77=ifort F90=ifort ./configure

Linux/IA32 x86_64 (Opteron/EM64T), GCC (Notes: (1))
	./configure --with-binmode=32
	./configure --with-binmode=64

Linux/IA32 x86_64 (EM64T), Intel (Notes: (1)(2))
	CC=icc CXX=icpc F77=ifort F90=ifort ./configure

Linux/IA32 x86_64 (Opteron/EM64T), PGI (Notes: (1)(2)(3))
	CC=pgcc CXX=pgCC F77=pgf77 F90=pgf90 ./configure

Linux/IA32 x86_64 (Opteron/EM64T), Pathscale (Notes: (1))
	CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90 ./configure --with-binmode=32
	CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90 ./configure --with-binmode=64

Linux/IA64, GCC
	./configure

IBM AIX/Power, IBM XL Compilers
	./configure --with-vendormpi=ibmmpi --with-binmode=32
	./configure --with-vendormpi=ibmmpi --with-binmode=64

Hitachi SR11K (AIX/Power), IBM XL and Hitachi F90 (Notes: (4))
	./configure --with-vendormpi=ibmmpi --with-binmode=32
	./configure --with-vendormpi=ibmmpi --with-binmode=64

Fujitsu Solaris8/SPARC64V, Fujitsu
	CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=32
	CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=64

Solaris10/SPARC, Sun (SUN Studio11)
	./configure --with-binmode=32
	./configure --with-binmode=64

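For the --with-binmode=32/64 entries, the two-pass procedure looks like the following sketch (shown here with the Linux x86_64 GCC options; substitute the configure options for your platform, and use gmake instead of make on AIX and Fujitsu Solaris as described in the later sections):

$ ./configure --with-binmode=32
$ make
$ make install
$ make distclean
$ ./configure --with-binmode=64
$ make
$ make install
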
1. Installation on PC Clusters

1.1. Prerequisite

GridMPI is fully tested on RedHat 9 on IA32 machines with GNU GCC. It has also been checked to compile and run a simple test on Fedora Core 3 and 4 on IA32, on SuSE SLES 8 on x86_64, and on SuSE SLES 8 on IA64, and it is partially tested with the Intel Compilers on IA32.

ON FEDORA CORE 4, GCC MUST BE UPDATED TO VERSION 4.0.1 OR LATER.
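
A quick way to check the installed GCC version:

$ gcc --version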

GridMPI needs the following non-standard commands to compile.

	- makedepend

makedepend is in the "xorg-x11-devel" RPM package in RedHat and Fedora Core.
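
One way to check whether makedepend is available (the package query applies to RPM-based distributions such as RedHat and Fedora Core):

$ which makedepend
$ rpm -q xorg-x11-devel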

All hosts in the clusters need to be reachable at global IP addresses. [FAQ]

NOTE: The SCore/PMv2 support (a fast communication library of the SCore cluster system) is experimental in GridMPI-1.1. It has been only lightly tested, and only in a cluster environment. It needs further tuning.

1.2. Setting Environment Variables

Set $MPIROOT to the installation directory, and add $MPIROOT/bin to the PATH.

Commands, headers, and libraries are installed in, and searched for in, $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib. Note that the shell's "~" notation is NOT understood in $MPIROOT. Add these settings to ".profile", ".cshrc", etc.

# Example assumes /opt/gridmpi as MPIROOT

(For sh/bash)
$ MPIROOT=/opt/gridmpi; export MPIROOT
$ PATH="$MPIROOT/bin:$PATH"; export PATH

(For csh/tcsh)
% setenv MPIROOT /opt/gridmpi
% set path=($MPIROOT/bin $path)

1.3. Unpacking the Source

Unpack the source in an appropriate directory.

In the following, the files are expanded under the $HOME directory. The source expands into the gridmpi-1.1 directory.

$ cd $HOME
$ tar zxvf gridmpi-1.1.tar.gz

The contents are:

	LICENSE:	The NAREGI License
	README:		README file
	checkpoint:	checkpointing source
	configure:	configuration script
	yampii:		YAMPII (PC cluster MPI) source
	src:		GridMPI source
	libpsp:		PSPacer (Precise Software Pacer) support code

1.4. Compiling the Source

Simply do the following:

$ cd $HOME/gridmpi-1.1			...(1)
$ ./configure				...(2)
$ make					...(3)
$ make install				...(4)

(1) Move to the source directory.

(2) Invoke the configurer. No options are needed for a PC cluster setting.

(3) Make.

(4) Install.

Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.

See the FAQ to use a C compiler other than the default one. [FAQ]
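
For example, a configuration with the Intel compilers takes the same form as in the configuration templates above:

$ CC=icc CXX=icpc F77=ifort F90=ifort ./configure
$ make
$ make install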

1.5. Checking the Installed Files

Check the files in the installation directories.

(1) In $MPIROOT/bin,

	mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun,
	impi-server, mpifork, nsd, canquit, detach, gnamesv

(2) In $MPIROOT/include,

	mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h

(3) In $MPIROOT/lib,

	libmpi.a

(4) Check that the commands are in the path.

$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun

1.6. Testing Compilation

Compile pi.c in the src/test/basic directory.

(1) Compile a test program.

$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc pi.c
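
If pi.c is not at hand, any small MPI program can serve as a compile-and-run check. The following is a minimal sketch (a hypothetical hello.c, not part of the GridMPI sources):

/* hello.c: print the rank and size of MPI_COMM_WORLD. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

$ mpicc hello.c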

1.7. Starting a Program (as a Cluster MPI)

(1) Create a configuration file.

Content of mpi_conf:

localhost
localhost

(2) Run an application (a.out) as a cluster MPI.

$ mpirun -np 2 ./a.out

In this case, GridMPI does not use wide-area communication; it runs as a single-cluster configuration using YAMPII.

When the cluster environment does not support rsh (remote shell), the run fails, because MPI processes in a cluster are started with rsh by default. Set the environment variable _YAMPI_RSH to use ssh instead.

(For sh/bash)
$ _YAMPI_RSH="ssh -x"; export _YAMPI_RSH

(For csh/tcsh)
% setenv _YAMPI_RSH "ssh -x"

ssh must be set up so that no password is required. Refer to the FAQ for using ssh-agent. [FAQ]
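
A minimal sketch of one common passwordless setup (the key file names are the OpenSSH defaults; see the FAQ for the recommended ssh-agent procedure):

$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys

With a passphrase-protected key, ssh-agent can be used instead:

$ eval `ssh-agent`
$ ssh-add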

1.8. Starting a Program (as a Multiple-Cluster MPI)

(1) Create configuration files. Here, put two localhost entries in mpi_conf1 and two localhost entries in mpi_conf2.

Content of mpi_conf1:

localhost
localhost

Content of mpi_conf2:

localhost
localhost

(2) Run an application (a.out).

$ export IMPI_AUTH_NONE=0		...(1)
$ impi-server -server 2 &		...(2)
$ mpirun -client 0 addr:port -np 2 -c mpi_conf1 ./a.out < /dev/null &	...(3)
$ mpirun -client 1 addr:port -np 2 -c mpi_conf2 ./a.out		...(4)

(1) Set the IMPI_AUTH_NONE environment variable. It selects the authentication method of the impi-server. The value can be anything, because it is ignored.

(2) Start the impi-server. The impi-server is a process that serves as a contact point and exchanges information between MPI processes. It must be started for each run, because it exits at the end of an execution of an MPI program. The -server argument specifies the number of MPI jobs (invocations of the mpirun command). The impi-server prints its IP address/port pair to stdout.

(3,4) Start the MPI jobs with mpirun. The -client argument specifies the MPI job ID and the IP address/port pair of the impi-server. The job ID ranges from 0 to the number of jobs minus one and distinguishes the mpirun invocations. The -c option specifies the list of nodes. This example starts an MPI program with NPROCS=4 (2+2).
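
For concreteness, assume the impi-server printed 192.168.1.10:20000 (a made-up address and port; the output format shown here is only illustrative). The run would then look like:

$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 &
192.168.1.10:20000
$ mpirun -client 0 192.168.1.10:20000 -np 2 -c mpi_conf1 ./a.out < /dev/null &
$ mpirun -client 1 192.168.1.10:20000 -np 2 -c mpi_conf2 ./a.out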


2. Installation on IBM AIX

GridMPI can use a vendor-supplied MPI as an underlying communication layer, called a "Vendor MPI". It is necessary to specify options to the configurer to use a Vendor MPI. GridMPI supports IBM-MPI (on the IBM P-Series and Hitachi SR11000) as a Vendor MPI.

2.1. Prerequisite

GridMPI needs the following (non-standard) commands to compile.

	- gmake (GNU make)
	- makedepend
	- cc_r and xlc_r
	- IBM-MPI library (assumed to be in /usr/lpp/ppe.poe/lib)

GridMPI uses xlc_r to compile the source code. MPI applications can be compiled with cc_r, xlf_r, and Hitachi f90.

When the IBM-MPI library is not installed in the directory /usr/lpp/ppe.poe/lib, it is necessary to specify its location with MP_PREFIX (this is needed both at installation time and at run time). The MP_PREFIX environment variable is defined by IBM-MPI.
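
For example, if IBM-MPI were installed under /usr/local/ppe.poe (a hypothetical location), MP_PREFIX would be set as follows:

(For sh/bash)
$ MP_PREFIX=/usr/local/ppe.poe; export MP_PREFIX

(For csh/tcsh)
% setenv MP_PREFIX /usr/local/ppe.poe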

2.2. Setting Environment Variables

See Installation on PC Clusters. [jump]

2.3. Unpacking the Source

See Installation on PC Clusters. [jump]

2.4. Compiling the Source

The procedure differs slightly from the PC cluster case: --with-vendormpi is specified to the configurer, and gmake is used to compile.

$ cd $HOME/gridmpi-1.1			...(1)
$ ./configure --with-vendormpi=ibmmpi --with-binmode=32	...(2)
$ gmake					...(3)
$ gmake install				...(4)
$ gmake distclean
$ ./configure --with-vendormpi=ibmmpi --with-binmode=64	...(2)
$ gmake					...(3)
$ gmake install				...(4)

(1) Move to the source directory.

(2) Invoke the configurer.

The --with-vendormpi=ibmmpi option specifies the use of the Vendor MPI.

The --with-binmode=32/64 option specifies the binary mode. Use --with-binmode=no to use the compiler's default mode (or when the compiler does not support options to control the mode). To use both modes, perform the configure-make-install procedure twice, once for 32-bit mode and once for 64-bit mode, and do not forget gmake distclean between the two runs of configure. Also specify -q32/-q64 to mpicc when compiling applications.

(3) Make with gmake.

(4) Install.

Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.

Check the configuration output. The first configuration block is from GridMPI, and the second is from YAMPII (GridMPI's configurer calls YAMPII's configurer internally). Check that --with-vendormpi is ibmmpi and that --with-gridmpi is yes in the YAMPII part.

Configuration (output from configure)
Configuration
  MPIROOT				/opt/gridmpi
  --enable-debug			no
  --enable-threads			no
  --with-binmode			32/64
  --with-score                          no
  --with-vendormpi			ibmmpi
  --with-libckpt			no
  --with-libpsp				no
  --enable-dlload			yes

Configuration
  MPIROOT				/opt/gridmpi
  --enable-debug			no
  --enable-unix				yes
  --enable-onesided			no
  --enable-threads			no
  --with-binmode			32/64
  --with-score				no
  --with-gridmpi			yes
  --with-vendormpi			ibmmpi
  --with-libckpt			no
  --with-libckpt-includedir		no
  --with-libckpt-libdir			no
  --enable-dlload			yes

NOTE: gmake distclean is necessary to clean all the configuration state when compiling with a different configuration. Note that it also removes all Makefiles.

2.5. Checking the Installed Files

Check the files in the installation directories.

(1) In $MPIROOT/bin,

	mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun,
	impi-server, mpifork, nsd, canquit, detach, gnamesv

(2) In $MPIROOT/include,

	mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h

(3) In $MPIROOT/lib,

	libmpi32.a	(--with-binmode=32 case)
	libmpi64.a	(--with-binmode=64 case)

(4) Check that the commands are in the path.

$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun

2.6. Testing Compilation

Compile pi.c in the src/test/basic directory.

(1) Compile a test program.

$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc -q32 -O3 pi.c (for 32bit binary)
$ mpicc -q64 -O3 pi.c (for 64bit binary)

2.7. Starting a Program (as a Cluster MPI)

In the IBM-MPI environment, IBM POE (Parallel Operating Environment) is used to start MPI processes. By default, POE specifies the nodes in a file named host.list. When using POE with the LoadLeveler, a batch command file llfile is also needed.

(1) Create a configuration file.

Content of host.list:

node00
node01

Content of llfile:

#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue

(2) Run an application (a.out) as a cluster MPI.

$ mpirun -np 2 ./a.out -llfile llfile

NOTE: In an environment without the LoadLeveler, the -llfile llfile specification is not necessary.

mpirun internally calls the poe command to start MPI processes under IBM POE. In the process, the -c argument is translated to the -hostfile argument of the poe command.

2.8. Starting a Program (as a Multiple-Cluster MPI)

(1) Create configuration files. Here, put two node00 entries in host1.list and two node01 entries in host2.list.

Content of host1.list:

node00
node00

Content of host2.list:

node01
node01

Content of llfile:

#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue

(2) Run an application (a.out).

$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 &
$ mpirun -client 0 addr:port -np 2 -c host1.list ./a.out -llfile llfile &
$ mpirun -client 1 addr:port -np 2 -c host2.list ./a.out -llfile llfile

NOTE: In an environment without the LoadLeveler, the -llfile llfile specification is not necessary.

See Installation on PC Clusters for descriptions. [jump]


3. Installation on Fujitsu Solaris/SPARC64V

GridMPI supports Fujitsu MPI and Fujitsu compilers in Solaris8 (Fujitsu PrimePower Series).

3.1. Prerequisite

GridMPI needs the following (non-standard) commands to compile.

	- Fujitsu c99/f90
	- Fujitsu MPI (Parallelnavi)
	- gmake (GNU make)
	- makedepend (in /usr/openwin/bin)

The configurer assumes that the Fujitsu compilers are installed in the directory /opt/FSUNf90, and the Fujitsu MPI in /opt/FJSVmpi2 and /opt/FSUNaprun. GridMPI uses Fujitsu c99 to compile the source code.

3.2. Setting Environment Variables

# Example assumes /opt/gridmpi as MPIROOT

(For sh/bash)
$ MPIROOT=/opt/gridmpi; export MPIROOT
$ PATH="$MPIROOT/bin:/opt/FSUNf90/bin:/opt/FSUNaprun/bin:/usr/ccs/bin:$PATH"; export PATH
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/FSUNf90/lib:/opt/FJSVmpi2/lib:\
/opt/FSUNaprun/lib"; export LD_LIBRARY_PATH
$ LD_LIBRARY_PATH_64="$LD_LIBRARY_PATH_64:/opt/FSUNf90/lib/sparcv9:\
/opt/FJSVmpi2/lib/sparcv9:/opt/FSUNaprun/lib/sparcv9:\
/usr/ucblib/sparcv9:/usr/lib/sparcv9"; export LD_LIBRARY_PATH_64

(For csh/tcsh)
% setenv MPIROOT /opt/gridmpi
% set path=($MPIROOT/bin /opt/FSUNf90/bin /opt/FSUNaprun/bin /usr/ccs/bin $path)
% setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:/opt/FSUNf90/lib:/opt/FJSVmpi2/lib:\
/opt/FSUNaprun/lib"
% setenv LD_LIBRARY_PATH_64 "${LD_LIBRARY_PATH_64}:/opt/FSUNf90/lib/sparcv9:\
/opt/FJSVmpi2/lib/sparcv9:/opt/FSUNaprun/lib/sparcv9:\
/usr/ucblib/sparcv9:/usr/lib/sparcv9"

3.3. Unpacking the Source

See Installation on PC Clusters. [jump]

3.4. Compiling the Source

The procedure differs slightly from the PC cluster case: --with-vendormpi is specified to the configurer, and gmake is used to compile.

$ cd $HOME/gridmpi-1.1			...(1)
$ CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=32	...(2)
$ gmake					...(3)
$ gmake install				...(4)
$ gmake distclean
$ CC=c99 CXX=FCC F77=frt F90=f90 ./configure --with-vendormpi=fjmpi --with-binmode=64	...(2)
$ gmake					...(3)
$ gmake install				...(4)

(1) Move to the source directory.

(2) Invoke the configurer.

The --with-vendormpi=fjmpi option specifies the use of the Vendor MPI.

The --with-binmode=no/32/64 option specifies the binary mode. Use --with-binmode=no to use the compiler's default mode (or when the compiler does not support options to control the mode). To use both modes, perform the configure-make-install procedure twice, once for 32-bit mode and once for 64-bit mode, and do not forget gmake distclean between the two runs of configure. Also specify -q32/-q64 to mpicc when compiling applications.

(3) Make with gmake.

(4) Install.

Files are installed in $MPIROOT/bin, $MPIROOT/include, and $MPIROOT/lib.

Check the configuration output. The first configuration block is from GridMPI, and the second is from YAMPII (GridMPI's configurer calls YAMPII's configurer internally).

Check that --with-vendormpi is fjmpi and that --with-gridmpi is yes in the YAMPII part.

Configuration (output from configure)
Configuration
  MPIROOT				/opt/gridmpi
  --enable-debug			no
  --enable-threads			no
  --with-binmode			32/64
  --with-score				no
  --with-vendormpi			fjmpi
  --with-libckpt			no
  --with-libpsp				no
  --enable-dlload			yes

Configuration
  MPIROOT				/opt/gridmpi
  --enable-debug			no
  --enable-unix				yes
  --enable-onesided			no
  --enable-threads			no
  --with-binmode			32/64
  --with-score				no
  --with-gridmpi			yes
  --with-vendormpi			fjmpi
  --with-libckpt			no
  --with-libckpt-includedir		no
  --with-libckpt-libdir			no
  --enable-dlload			yes

NOTE: gmake distclean is necessary to clean all the configuration state when compiling with a different configuration. Note that it also removes all Makefiles.

3.5. Checking the Installed Files

Check the files in the installation directories.

(1) In $MPIROOT/bin,

	mpicc, mpif77, mpic++, mpif90, mpirun, gridmpirun,
	impi-server, mpifork, nsd, canquit, detach, gnamesv

(2) In $MPIROOT/include,

	mpi.h, mpif.h, mpi-1.h, mpi-2.h, mpic++.h

(3) In $MPIROOT/lib,

	libmpi32.so libmpi_frt32.a libmpi_gmpi32.so (--with-binmode=32 case)
	libmpi64.so libmpi_frt64.a libmpi_gmpi64.so (--with-binmode=64 case)

(4) Check that the commands are in the path.

$ which mpicc
$ which mpif77
$ which mpirun
$ which gridmpirun

3.6. Testing Compilation

Compile pi.c in the src/test/basic directory.

(1) Compile a test program.

$ cd $HOME/gridmpi-1.1/src/test/basic/
$ mpicc -q32 -Kfast pi.c (for 32bit binary)
$ mpicc -q64 -Kfast -KV9 pi.c (for 64bit binary)

3.7. Starting a Program (as a Cluster MPI)

In the Fujitsu MPI environment, the MPI runtime /opt/FJSVmpi2/bin/mpiexec is used to start MPI processes. Options to mpirun are translated and passed to the Fujitsu MPI mpiexec: -np becomes -n and -c becomes -nl. The contents of the host configuration file specified by -c are translated to the nodelist format of -nl, so the configuration file should be a list of host names (one host per line, comments not allowed).

(1) Run an application (a.out) as a cluster MPI.

A configuration file is not needed; Fujitsu MPI automatically configures the available nodes.

$ mpirun -np 2 ./a.out

mpirun accepts the node list option -nl when configured with Fujitsu MPI. For example, when only node zero should be used in a run, pass a list of zeros to the -nl option (one more entry than the number of MPI processes is needed, because the extra entry is assigned to the daemon process).

mpirun -np 4 -nl 0,0,0,0,0

mpirun also accepts the -c option, which is intended for use with the NODELIST of PBS Pro.

3.8. Starting a Program (as a Multiple-Cluster MPI)

(1) Run an application (a.out).

$ export IMPI_AUTH_NONE=0
$ impi-server -server 2 &
$ mpirun -client 0 addr:port -np 2 ./a.out &
$ mpirun -client 1 addr:port -np 2 ./a.out

4. Notes on Other Platforms

Solaris

GridMPI is not fully tested on Solaris, but it should work. It can be installed and used in the same way as on PC clusters, except:

FreeBSD/IA32

GridMPI is not fully tested on FreeBSD, but it should work. It can be installed and used in the same way as on PC clusters.


5. Info: Structure of GridMPI Execution

When running as a single-cluster MPI, processes are started by mpirun (without the -client option). This is just the cluster MPI, YAMPII, and the YAMPII protocol is used for communication.

When running as a multiple-cluster MPI, processes are started by mpirun -client n ip-address:port for each MPI job. The multiple MPI jobs join by connecting to an impi-server process.

The impi-server is a process that exchanges information about the processes (e.g., their IP address/port pairs) among the multiple MPI invocations. After exchanging the information, it does nothing until it joins the processes in MPI_Finalize.

In a multiple-cluster MPI, the YAMPII protocol is used for intra-cluster communication, and the IMPI (Interoperable MPI) protocol is used for inter-cluster communication. The MPI processes receive their ranks according to the client number: the lowest ranks are assigned to the processes started with mpirun -client 0, the next lowest to the processes started with mpirun -client 1, and so on.

                        IMPI Protocol
          +---------+===================+---------+
          |         |                   |         |
    +-----|---------|-----+       +-----|---------|-----+ 	+--------+
    | +-------+ +-------+ |       | +-------+ +-------+ |	| impi-	 |
    | | rank0 | | rank1 | |       | | rank2 | | rank3 | |	| server |
    | +-------+ +-------+ |       | +-------+ +-------+ |	+--------+
    |     |         |     |       |     |         |     |
    |     +=========+     |       |     +=========+     |
    |   YAMPII Protocol   |       |   YAMPII Protocol   |
    +---------------------+       +---------------------+
       mpirun -client 0		     mpirun -client 1

6. Test with GRIDMPIRUN Script

gridmpirun is a simple frontend script that invokes the impi-server and a number of mpirun commands via rsh/ssh. It reads its own configuration file and starts processes as specified.

The impi_conf file specifies how to start the MPI processes on each cluster.

(1) Create configuration files. Here, put two node00 entries in host1.list and two node01 entries in host2.list.

Content of host1.list:

node00
node00

Content of host2.list:

node01
node01

Content of impi_conf:

-np 2 -c host1.list
-np 2 -c host2.list

Content of llfile:

#@job_type=parallel
#@resources=ConsumableCpus(1)
#@queue

(2) Run an application (a.out).

$ gridmpirun -np 4 -machinefile impi_conf ./a.out -llfile llfile

-np specifies the total number of MPI processes. gridmpirun breaks this total down into the per-cluster process counts given in the machine file (here, 2+2).


($Date: 2007/03/01 14:47:24 $)