Some advice on using valgrind to debug LAMMPS

Quang_Ha · November 19, 2018, 8:39pm

Hi all,

This may be rather general for the LAMMPS mailing list, but I thought
I would give it a try... I am trying to debug my code so that it runs
with yes-user-intel when LAMMPS is compiled for Intel architectures.
At the moment, running with a small number of processors per node
seems to work fine, but there is a memory error when trying to run on
a large number of nodes.

I have tried debugging my code with valgrind, using the following command:

valgrind --log-file="out.txt" mpirun -np 28 \
    $LMP_SRC/lmp_intel_cpu_openmpi < flow.lmp -var dname data-wall \
    &> out-err.txt

The output in out.txt: https://pastebin.com/7tMj7rkt
and in out-err.txt: https://pastebin.com/6B62dtmz

At the moment, I'm trying to track down the origin of the uninitialised
values (following some Google results), and shall slowly work my
way from there. In the meantime, does anyone have some tricks up their
sleeves that could help speed up the debugging process?

Many thanks,
QT
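
For readers following along, here is a minimal sketch of one way such a run could be set up so that valgrind reports where uninitialised values come from. By default valgrind does not follow child processes (--trace-children=no), so a command of the form "valgrind ... mpirun ..." checks the launcher rather than the LAMMPS ranks; the sketch places valgrind after mpirun instead. LMP_BIN, NP, RUN and the log-file pattern are assumptions made for illustration and are not taken from the thread.

```shell
#!/bin/sh
# Sketch only: print (and optionally run) a per-rank valgrind command line.
# LMP_BIN, NP, RUN and the log-file name are assumptions for illustration.
LMP_BIN="${LMP_BIN:-${LMP_SRC:-.}/lmp_intel_cpu_openmpi}"
NP="${NP:-2}"   # far fewer ranks than 28, because valgrind is slow

# valgrind comes AFTER mpirun, so each LAMMPS rank is instrumented.
# --track-origins=yes reports where an uninitialised value was created.
# %p in --log-file expands to the process ID, giving one log per rank.
# -in reads the input script without relying on stdin redirection.
CMD="mpirun -np $NP valgrind --track-origins=yes --log-file=out.%p.txt $LMP_BIN -in flow.lmp -var dname data-wall"

echo "$CMD"

# Execute only when RUN=1 is set, so the script is safe as a dry run.
if [ "${RUN:-0}" = "1" ]; then
    $CMD > out-err.txt 2>&1
fi
```

Running it with RUN=1 in the directory containing flow.lmp should leave one out.<pid>.txt file per rank, which tends to be easier to read than a single interleaved log.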

akohlmey · November 19, 2018, 8:45pm

don’t use OpenMPI for that. it creates far too many false positives, even if you use suitable suppressions.
i do valgrind runs almost exclusively without MPI or using MPICH.

axel.
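
As an illustration of the advice above, a run without MPI might look roughly like the following sketch. It assumes a LAMMPS binary built without a real MPI library (for example a serial build); the names LMP_SERIAL, SUPP and RUN are hypothetical, and whether the code under test can run in such a build is something to check first.

```shell
#!/bin/sh
# Sketch only: valgrind on a single LAMMPS process, with no mpirun involved.
# LMP_SERIAL, SUPP and RUN are hypothetical names used for illustration.
LMP_SERIAL="${LMP_SERIAL:-./lmp_serial}"
SUPP="${SUPP:-}"   # optional path to a valgrind suppressions file

# --leak-check=full lists each leaked block with its allocation stack.
VG_OPTS="--leak-check=full --log-file=valgrind-serial.txt"

# Add a suppressions file only if one was supplied.
if [ -n "$SUPP" ]; then
    VG_OPTS="$VG_OPTS --suppressions=$SUPP"
fi

CMD="valgrind $VG_OPTS $LMP_SERIAL -in flow.lmp -var dname data-wall"

echo "$CMD"

# Execute only when RUN=1 is set, so the script is safe as a dry run.
if [ "${RUN:-0}" = "1" ]; then
    $CMD > out-err.txt 2>&1
fi
```

With no MPI library loaded, whatever valgrind reports has to come from LAMMPS itself or the libraries it links to, which is the point of taking the launcher out of the picture.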

_Diaz_Adrian · November 19, 2018, 8:50pm

The tricks depend on what it is you coded into LAMMPS.

Quang_Ha · November 19, 2018, 9:10pm

Thanks Axel.

Ironically, and surprisingly, I tried compiling LAMMPS with MPICH and
the Intel compiler, and none of those errors appear... I still need to
verify this further, but it is a relief for now...

I feel like I have been robbed of my last couple of days...

Thanks,
QT