3.4 Tricks and problems
Some implementations of the MPI library have problems with input
redirection in parallel. This typically shows up as
mysterious errors when reading data. If this happens, use the option
-i (or -in, -inp, -input),
followed by the input file name.
Example:
pw.x -i inputfile -nk 4 > outputfile
Of course, the
input file must be accessible to the processor that reads it
(only one processor reads the input file and then broadcasts
its contents to all other processors).
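The -i mechanism above can be wrapped in a small launcher function. This is only a sketch under assumptions: the function name run_pw, the DRY_RUN switch, and the use of "mpirun -np" as the launcher are illustrative and not part of QUANTUM ESPRESSO. The readability test runs on the launching node only, so it does not guarantee that the processor reading the input can see the file.

```shell
# Sketch: pass the input file with -i instead of "< inputfile".
# run_pw, DRY_RUN and the "mpirun -np" launcher are assumptions.
run_pw() {
    input=$1
    output=$2
    nprocs=$3
    # Check on the launching node that the input file is readable.
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mpirun -np $nprocs pw.x -i $input > $output"
    else
        mpirun -np "$nprocs" pw.x -i "$input" > "$output"
    fi
}
```

Setting DRY_RUN=1 and calling "run_pw inputfile outputfile 4" prints the command line, which is a convenient way to inspect it before a real run.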
Apparently the LSF implementation of MPI libraries manages to ignore or to
confuse even the -i/-in/-inp/-input mechanism that is present in all
QUANTUM ESPRESSO codes. In this case, use the -i option of mpirun.lsf
to provide an input file.
If you notice very bad parallel performance with MPI and MKL libraries,
it is very likely that the OpenMP parallelization performed by the latter
is colliding with MPI. Recent versions of MKL enable autoparallelization
by default on multicore machines. You must set the environment variable
OMP_NUM_THREADS to 1 to disable it.
Note that you may run into the same trouble if, for some reason, the
setting of OMP_NUM_THREADS
does not propagate to all processors.
Lorenzo Paulatto (Nov. 2008) suggests using the -x option of mpirun to
propagate OMP_NUM_THREADS to all processors.
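A minimal sketch of this setup follows. It assumes that mpirun is the Open MPI launcher, where -x exports an environment variable to the launched processes; other MPI implementations use different options. The helper name launch_cmd and the process count are illustrative, and the helper only prints the command so that it can be checked before use.

```shell
# Disable MKL/OpenMP threading inside each MPI process.
export OMP_NUM_THREADS=1

# Hypothetical helper: prints the launch command instead of running it.
# "-x OMP_NUM_THREADS" assumes Open MPI's mpirun.
launch_cmd() {
    nprocs=$1
    input=$2
    printf 'mpirun -np %s -x OMP_NUM_THREADS pw.x -i %s\n' "$nprocs" "$input"
}

launch_cmd 4 inputfile
```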
Axel Kohlmeyer suggests the following (April 2008):
"(I've) found that Intel is now turning on multithreading without any
warning and that is for example why their FFT seems faster than
FFTW. For serial and OpenMP based runs this makes no difference (in
fact the multi-threaded FFT helps), but if you run MPI locally, you
actually lose performance. Also if you use the 'numactl' tool on linux
to bind a job to a specific cpu core, MKL will still try to use all
available cores (and slow down badly). The cleanest way of avoiding
this mess is to either link with
  -lmkl_intel_lp64 -lmkl_sequential -lmkl_core   (on 64-bit: x86_64, ia64)
  -lmkl_intel -lmkl_sequential -lmkl_core        (on 32-bit, i.e. ia32)
or edit the libmkl_'platform'.a file. I'm using now a file
libmkl10.a with:
GROUP (libmkl_intel_lp64.a libmkl_sequential.a libmkl_core.a)
It works like a charm".
UPDATE: since v.4.2, configure links MKL without multithreaded support
by default.
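For an existing, dynamically linked executable, one way to see whether a threaded MKL layer was linked instead of libmkl_sequential is to inspect its shared-library dependencies. The sketch below makes several assumptions: the helper name has_threaded_mkl and the path ./pw.x are illustrative, the ldd tool is available, and MKL was linked dynamically (a statically linked MKL does not appear in the ldd output).

```shell
# Reads ldd output on stdin; succeeds if a threaded MKL layer is listed.
# has_threaded_mkl is a hypothetical helper name.
has_threaded_mkl() {
    grep -Eq 'libmkl_(intel|gnu|pgi|tbb)_thread'
}

# ./pw.x is an assumed path to a dynamically linked executable.
if [ -x ./pw.x ]; then
    if ldd ./pw.x | has_threaded_mkl; then
        echo "threaded MKL layer found: set OMP_NUM_THREADS=1 or relink"
    else
        echo "no threaded MKL layer among the shared libraries"
    fi
fi
```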
Many users of QUANTUM ESPRESSO, in particular those working on PC clusters,
have to rely on themselves (or on less-than-adequate system managers) for
the correct configuration of software for parallel execution. Mysterious and
irreproducible crashes in parallel execution are sometimes due to bugs
in QUANTUM ESPRESSO, but more often than not are a consequence of buggy
compilers or of buggy or miscompiled MPI libraries.
paolo giannozzi
2015-03-08