Greetings all,
I am fairly new to LAMMPS. In the past, I have been able to compile and
run simple test cases in serial and parallel. However, there seems
to be a problem running a PRD simulation in conjunction with the "dump"
command.
While testing the example of vacancy diffusion in Si, as provided in the
folder ~/lammps-14May12/examples/prd/ , I encounter an MPI crash, and a
core.* file is generated if I choose a partition like "-partition 4x2".
The only difference between my input and the example is that I enable the
dump command by uncommenting it:
dump events all custom 1 dump.prd id type x y z (line 83 of in.prd)
The simulation runs fine, printing out the events, when I comment
out the dump command, or when I set up multiple replicas with each replica
using just one MPI task; e.g., -partition 4x1 or -partition 8x1 both work
fine.
This makes me suspect that the dump command has some issue with PRD runs.
Typical error messages contain:
... Rank 2, Process 16333 received signal SIGSEGV(11)
..
MPI_COMM_WORLD rank 2 has terminated without calling MPI_Finalize()
MPI: aborting job
MPI: Received signal 11
Has anyone else encountered similar issues? It would be a great help if
someone could clarify what is going wrong and advise me on how it can be
addressed.
i have been spending some time today looking into
this, and - for somebody who has never looked at this
part of the code before and has never run production PRD
calculations - it is not exactly straightforward to debug.
it is not the dump directly, but something related to it
that is causing the problems (the forces seem to
get corrupted somehow).
what i can say at the moment is that, with the dump enabled,
you cannot have partitions with more than one MPI task.
the only workaround i can suggest at the
moment is to install the USER-OMP package and
compile with OpenMP support. in fact, for the PRD
example that even seems to be faster:
here is the run without the dump and all-MPI:
mpirun -x OMP_NUM_THREADS=1 -np 8 \
  ~/compile/lammps-icms/src/lmp_openmpi-omp \
  -log none -in in.prd -partition 4x2 -echo screen
LAMMPS (14 Jun 2012-ICMS)
Running on 4 partitions of processors
Setting up PRD ...
Step CPU Clock Event Correlated Coincident Replica
100 0.000 0 0 0 0 0
200 0.572 400 1 0 4 1
700 2.385 2100 2 0 2 3
900 3.257 2600 3 0 1 3
1400 4.705 4300 4 0 1 2
1500 4.949 4400 5 1 1 2
1800 5.862 5300 6 0 2 3
2100 6.784 6200 7 0 1 3
Loop time of 6.78701 on 8 procs for 2000 steps with 511 atoms
and here is the same run with OpenMP parallelization for each replica instead:
mpirun -x OMP_NUM_THREADS=2 -np 4 \
  ~/compile/lammps-icms/src/lmp_openmpi-omp \
  -log none -in in.prd -partition 4x1 -echo screen -sf omp
LAMMPS (14 Jun 2012-ICMS)
Running on 4 partitions of processors
Setting up PRD ...
Step CPU Clock Event Correlated Coincident Replica
100 0.000 0 0 0 0 0
200 0.407 400 1 0 4 1
700 1.776 2100 2 0 2 3
900 2.445 2600 3 0 1 3
1400 3.808 4300 4 0 1 2
1500 4.044 4400 5 1 1 2
1800 4.925 5300 6 0 2 3
2100 5.817 6200 7 0 1 3
Loop time of 5.81973 on 4 procs for 2000 steps with 511 atoms
5.8 seconds is certainly faster than 6.8 seconds...
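for reference, enabling USER-OMP with the make-based build of that
vintage looked roughly like the sketch below. the exact package name
spelling and the makefile target are assumptions that depend on your
LAMMPS version, compiler, and MPI setup, so adjust them to your tree:

```shell
# sketch: enable the USER-OMP package and rebuild LAMMPS
# (package/target names are version-dependent assumptions)
cd lammps-14May12/src
make yes-user-omp    # copy the USER-OMP package sources into src/
make openmpi         # rebuild with a makefile whose flags enable OpenMP
```

after rebuilding, run with the -sf omp switch and set OMP_NUM_THREADS to
the number of threads you want per MPI task, as in the second mpirun
command above.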
perhaps somebody who knows more about the PRD code
can look into it and solve the underlying problem.
cheers,
axel.