Dear all,
I was wondering if anyone has experience building LAMMPS for an NEC SX supercomputer, because I have been having problems with it. Pair potentials work without any problems, but force calculations eventually become NaN for the AIREBO potential when atoms are bonded in chains longer than 2 atoms. I am slowly attempting to weed out the problem, but there is no interactive login on the SX nodes, so I have to wait for a job to run on the developer queue and print something out. I could not get Valgrind to compile for this system, although I could coerce fftw and gmake to build by patching their configure scripts.
The problem is likely due to overzealous optimizations, but peculiarly, every optimization option leads to this behavior except “no-optimization”, which also leaves me with a 150 MB binary executable (it barely fits in my home directory). That means even the “safe” optimizations cause this strange behavior.
Interestingly, this does not cause LAMMPS to crash; it continues to run all time steps, so perhaps it is simply a conversion to print format that is going wrong instead. I had expected LAMMPS to crash with NaN forces, but perhaps continuing is the expected behavior. The version of LAMMPS I am using is a git checkout from Jan-15-2013 (0c292ed533831f2b4298c2656f587a65b9b596e5).
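A minimal sketch of a check that could tell a real NaN in the force array apart from a printing problem, using only batch output. The helper name first_nonfinite and the flat array of 3*natoms doubles are hypothetical, not part of the LAMMPS API, and std::isfinite assumes a C++11-capable <cmath>. Under aggressive floating-point flags (for example gcc's -ffast-math) a compiler is allowed to assume no NaNs exist and fold such a test away; whether the SX compiler does the same is something I do not know.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of LAMMPS): scan n force components and
// return the index of the first one that is NaN or infinite, or -1 if
// every component is finite.
long first_nonfinite(const double *f, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        // std::isfinite is false for NaN and for +/- infinity.
        if (!std::isfinite(f[i])) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Calling this after the force computation on each step and printing the step number and returned index (index / 3 is the atom, index % 3 the component, under the assumed layout) would show the first step and atom at which the forces go bad.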
I have also noticed that if I build LAMMPS with gcc on the interactive shell (this build does not run on the SX system), it runs faster than the cross-compiled version on the supercomputer and, of course, doesn’t produce NaN forces.
I just wanted to know if anyone has experience with patching LAMMPS to work on an unfriendly system like this, or if you have any suggestions on where else I could look to figure this out. I will gladly provide any patches if I figure this out.
Best,
Derek Thomas