GPU simulation losing atoms

I tried converting one of my simulations to run on a GPU, but I keep losing atoms and cannot get a working version of the input. The simulation consists of some nanoparticles surrounded by water molecules. I have been trying to debug this for some time now and have gotten stuck. The job keeps running until the walltime limit is hit, but it does no more than a few calculations. The same system already runs fine on the CPU; the problems only appear on the GPU, with LAMMPS version 26Aug11-gpu-d.

Here is the error I get in the monitor.log file:

Step Temp E_pair E_mol TotEng Press
0 0 98120.18 9221.7655 107341.95 228210.93
ERROR: Lost atoms: original 24069 current 24065

Here is my LAMMPS input file:

dimension 3

boundary p p p
units real
atom_style full
neighbor 2.0 bin
neigh_modify delay 0 every 1 check yes
read_data input.data

## Force Field Info ## Pedone-06 potential ##

pair_style lj/cut/coul/long/cuda 8.0
pair_coeff 1 1*2 0.000 0.000
pair_coeff 2 2 0.16275 3.16435
bond_style harmonic
bond_coeff 1 450 0.9572
angle_style harmonic
angle_coeff 1 55 104.52
kspace_style pppm/cuda 1.0e-6

#velocity all create 298.0 12345689 dist uniform
#fix 1 all shake/cuda 1e-6 500 0 m 1 a 1
#fix 3 all langevin 298.0 298.0 1000 123458
#fix 2 all nve/cuda

#### Minimize energy ####

#min_style cg
#minimize 1.0e-4 1.0e-4 1000 10000

## Perform Nose Hoover NVT Integration ##

fix 4 all nvt/cuda temp 300 300 50

## Timestep/thermo info ##

timestep 1
thermo .1

################ Dump/append files ################
log log

run 10
write_restart 1ps-initial-eq
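
As mentioned above, the same system runs fine on the CPU. For reference, here is a sketch of the corresponding CPU style lines, assuming the only change made for the GPU port was appending /cuda to the style names:

pair_style lj/cut/coul/long 8.0        # CPU pair style, no /cuda suffix
kspace_style pppm 1.0e-6               # CPU long-range solver
fix 4 all nvt temp 300 300 50          # CPU Nose-Hoover thermostat/integrator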

how can you expect to get a working version when you are using an
almost 3-year-old codebase??

----------------------------------------------------------------------------------------
Here are some other things I have tried (sketched briefly after this list):

1. Using thermo_modify to ignore lost atoms; the simulation then runs for all
10 timesteps, but it still loses atoms.
2. Using the communicate multi command (together with thermo_modify ignoring
lost atoms) to try to keep the atoms; this crashes at timestep 2 with lost
atoms and some bad values reported for the energy.
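
Concretely, those two attempts looked roughly like this in the input file:

thermo_modify lost ignore    # 1. keep running instead of stopping on the lost-atoms error
communicate multi            # 2. per-type communication cutoffs, used together with the line above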

where is 3. "tested with the current version of LAMMPS"?
and 4. "tested with the GPU package"?

I have considered that. I am an undergrad who shares the GPU with a lot of other people, and they have been able to get working runs with the same version of LAMMPS that I am using, so I figured that shouldn't have been the issue. I will try to build the latest version of GPU-enabled LAMMPS and go from there.

However, assuming that isn't what caused the error (which it very well might be), are there any other ideas on what could be causing it while I am upgrading LAMMPS?

considering all the bugs that were fixed and improvements that were
made to the code since then, i consider it a very, *very* bad idea to
*not* update (and that is only the censored version of this
statement).

i have no interest in tracking down issues in obsolete versions of the
code. you should go and bug the people that don't upgrade, since it is
their fault and they should spend the time.

axel.

Thanks, I will update and see if that works.

-Evan