I’m using lammps-16Mar18 (also tested with lammps-7Aug2019), OpenMPI 3.1.3, on a CentOS Linux 7 cluster node with two 24-core Intel Xeon CPUs and four GTX Titan X GPUs.
I noticed that my simulation would get killed partway through a run. I tracked this down to the simulation gradually maxing out the compute node’s 96 GB of memory.
I finally narrowed it down to executing a run command in a loop with GPU acceleration, as in my simplified input file (attached). The memory does not accumulate if I use a single run command instead of a loop, if I don’t use GPU acceleration, or if I add “pre no” to the run command. So the leak is likely caused by the repeated (re-)initialization of the GPU package at the start of each run.
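For context, the looping structure I mean looks roughly like the sketch below. This is not my actual attached in.run; the pair style, loop count, and data file name here are placeholders, but the overall pattern (a run command inside a jump/next loop with the GPU package enabled) is the same:

```
# Minimal sketch of a looped run with the GPU package (placeholder settings)
package      gpu 1
pair_style   lj/cut/gpu 2.5
read_data    min0.dat

variable     i loop 100
label        loopstart
run          1000          # memory grows each iteration with the GPU package on
next         i
jump         SELF loopstart
```

With “run 1000 pre no” inside the loop (so LAMMPS skips the re-initialization before each run), the memory growth goes away for me.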
Thanks in advance for everything you all do!
Ph.D. Candidate in Dept. of Polymer Science
The University of Akron
min0.dat (790 KB)
in.run (424 Bytes)