Covers how to best run parallel programs on the Lunarc clusters.
Before a program with MPI instructions can be compiled, the module for the desired compiler and the corresponding module for an MPI implementation must be loaded. The recommended implementation is openmpi. For example, if the desired compiler is the latest Intel compiler, currently version 11.1, load the following modules:
    module add intel
    module add openmpi/intel
The program can then be compiled and linked with mpif77 or mpif90 for Fortran 77 and Fortran 90, respectively, and with mpicc or mpiCC for C and C++, respectively, using the same flags as for the underlying compiler. These commands are scripts that know the location of the MPI-related include files and libraries as well as the compiler.
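As a minimal sketch of the wrapper selection described above, the following shell function maps a source file to the matching wrapper. The file extensions, the file name foo.f90 and the -O2 flag are assumptions made for illustration; the compile command is only printed, so the sketch runs even where no MPI module is loaded.

```shell
#!/bin/sh
# Pick the MPI compiler wrapper for a source file, following the mapping
# in the text. The file extensions are assumptions made for this sketch.
mpi_wrapper_for() {
    case "$1" in
        *.f|*.for)       echo mpif77 ;;   # Fortran 77
        *.f90)           echo mpif90 ;;   # Fortran 90
        *.c)             echo mpicc  ;;   # C
        *.cpp|*.cc|*.C)  echo mpiCC  ;;   # C++
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the compile command.
# foo.f90 and -O2 are placeholders, not recommendations.
src=foo.f90
echo "$(mpi_wrapper_for "$src") -O2 -o foo $src"
```

With the modules loaded, the printed line can be run as it stands to build the executable foo.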
The program can then be run with mpiexec, but to get the correct version the corresponding MPI module must be loaded in the queue script. Some compilers, Intel in particular, use shared libraries by default, which means that the compiler module also has to be loaded. In the queue script, the command use_modules has to be sourced to, guess what, use modules. Thus, in our example, the following lines should be added to the queue script before mpiexec and the program are called:
    . use_modules
    module add intel
    module add openmpi/intel
A submit script for Milleotto could look like this:

    # Request 8 processors on 2 nodes, 4 processors on each node
    #PBS -l nodes=2:ppn=4
    # Request 8 hours 10 minutes of wall-clock time
    #PBS -l walltime=8:10:00
    # Request that regular output (stdout) and
    # terminal output (stderr) go to the same file
    #PBS -j oe
    # Send mail when the job aborts or terminates
    #PBS -m ae
    # Mail address
    #PBS -M firstname.lastname@example.org

    # Go to the directory from which you submitted the job
    cd "$PBS_O_WORKDIR"

    # Load modules
    . use_modules
    module add intel
    module add openmpi/intel

    # Run on all processors
    mpiexec ./foo
Milleotto has 4 processors on a node, while Iris and Platon have 8. Thus, on the latter two the nodes declaration should be:

    # Request 8 processors on 1 node
    #PBS -l nodes=1:ppn=8
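The arithmetic behind the two declarations can be sketched as a small shell function. The name pbs_nodes_line is hypothetical and not part of any Lunarc tooling; it rounds up to whole nodes, so a request that is not a multiple of the processors per node asks for more processors than strictly needed.

```shell
#!/bin/sh
# Build a "#PBS -l nodes=...:ppn=..." line from the total number of
# processors wanted and the number of processors per node.
# pbs_nodes_line is a hypothetical helper written for this sketch.
pbs_nodes_line() {
    total=$1
    per_node=$2
    # Round up to whole nodes.
    nodes=$(( (total + per_node - 1) / per_node ))
    # Never ask for more processors per node than the job needs in total.
    if [ "$total" -lt "$per_node" ]; then
        ppn=$total
    else
        ppn=$per_node
    fi
    echo "#PBS -l nodes=${nodes}:ppn=${ppn}"
}

pbs_nodes_line 8 4   # Milleotto:       #PBS -l nodes=2:ppn=4
pbs_nodes_line 8 8   # Iris and Platon: #PBS -l nodes=1:ppn=8
```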
Note that mpiexec by default distributes the execution over all the requested processors. It communicates through the queueing system and picks up the number of nodes automatically. Another command to start a parallel execution is mpirun. For openmpi, mpirun is exactly the same as mpiexec. However, for the somewhat deprecated MPI implementation mpich, mpirun requires that the number of processors be specified:

    # Run on 8 processors
    mpirun -np 8 ./foo
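Rather than hard-coding the processor count, a queue script can derive it from the allocation. The sketch below assumes a Torque-style PBS, where $PBS_NODEFILE names a file with one line per allocated processor; the function name pbs_proc_count is hypothetical.

```shell
#!/bin/sh
# Count the processors granted to the job, assuming the node file
# contains one line per allocated processor (Torque-style PBS).
pbs_proc_count() {
    # The arithmetic expansion strips any padding that wc adds.
    echo $(( $(wc -l < "$1") ))
}

# Only meaningful inside a running job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    NP=$(pbs_proc_count "$PBS_NODEFILE")
    mpirun -np "$NP" ./foo
fi
```

This keeps the -np value in step with the nodes declaration when the latter is changed.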
On Milleotto and Iris an mpiexec module that works with mpich is loaded by default. To get the openmpi mpiexec/mpirun, the openmpi module has to be loaded.
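Since two different mpiexec commands may be present, it can help to check which one the shell will actually run. The following is a minimal sketch using the standard command -v lookup; the function name which_mpiexec is hypothetical, and how the openmpi installation path looks on each cluster is not assumed here.

```shell
#!/bin/sh
# Report which mpiexec the shell resolves on the current PATH.
which_mpiexec() {
    if path=$(command -v mpiexec); then
        echo "mpiexec resolves to: $path"
    else
        echo "mpiexec not found on PATH" >&2
        return 1
    fi
}

# Run once before and once after "module add openmpi/intel" to compare.
which_mpiexec || true
```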
To see what MPI-compiler combinations are available, use the command
If the same modules are always used, they can be loaded by default by adding the "module add" lines to the end of the file .bash_profile in the home directory.
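One way to add the lines without duplicating them on repeated runs is sketched below. The helper append_once is hypothetical; the example writes to a scratch file, and "$HOME/.bash_profile" would be substituted for real use.

```shell
#!/bin/sh
# Append a line to a file unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x: match whole lines, -F: treat the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file instead of the real .bash_profile.
profile=$(mktemp)
append_once "$profile" "module add intel"
append_once "$profile" "module add openmpi/intel"
append_once "$profile" "module add intel"   # already present, nothing added
cat "$profile"
rm -f "$profile"
```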