Do You Know Commands to Run a Job from the Example GPU Job Folder

Hello everyone -

I am not a developer or a current LAMMPS user, but want to try to use the latest GPU version on a cluster.

It would help very much if someone could tell me the commands to run GPU jobs from the Example Job Folder.

Thanks very much -

Mike

hello mike,

I am not a developer or a current LAMMPS user, but want to try to use
the latest GPU version on a cluster.

It would help very much if someone could tell me the commands to run
GPU jobs from the Example Job Folder.

there is really not much of a difference from running CPU-only jobs.
if you need help with that, too, you may be better off
contracting somebody who knows how to do this.

the main differences are the use of the package command
and the need to have LAMMPS compiled with the proper code included.
details (referring to the very latest posted patchlevel)
are in the online manual:

http://lammps.sandia.gov/doc/package.html
http://lammps.sandia.gov/doc/Section_accelerate.html
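
for illustration only (the binary name, MPI launcher, and GPU count here are assumptions about your cluster, not something from the manual), launching one of the shipped GPU example inputs usually looks roughly like this, with one MPI rank per GPU:

mpirun -np 2 ./lmp_machine -in in.script.gpu

the example inputs already request the gpu pair styles and set up the devices, so typically only the process count needs adjusting.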

axel.

Thanks Axel -

I appreciate the information and suggestion.

We will check the links and let everyone know.

All the best -

Mike

Hi Axel,

I've read through the docs a little (Thanks for the links). I can see that launching jobs should be fairly simple, e.g. ./lmp_machine < input.script, but I keep getting ERROR: Invalid pair style on the example GPU jobs.

I've built the lmp binary with make yes-opt so it should be loading the opt pair styles…

Any idea why this is happening? I've also tried launching the job with ./lmp_machine -suffix opt < input.script

One example GPU script:

# GI-System

units metal
newton off
atom_style charge

read_data data.phosphate
replicate 3 3 3

pair_style lj/cut/coul/long/gpu 15.0
#pair_style
pair_coeff 1 1 0.0 0.29
pair_coeff 1 2 0.0 0.29
pair_coeff 1 3 0.000668 2.5738064
pair_coeff 2 2 0.0 0.29
pair_coeff 2 3 0.004251 1.91988674
pair_coeff 3 3 0.012185 2.91706967

#kspace_style pppm/gpu/single 1e-5
#kspace_style pppm/gpu/double 1e-5

neighbor 2.0 bin
thermo 100
timestep 0.001

fix 0 all gpu force/neigh 0 1 1
fix 1 all npt temp 400 400 0.01 iso 1000.0 1000.0 1.0

run 1000
unfix 1

Output:

[root@…2854… gpu]# ./lmp_g++3 -in in.phosphate.gpu -suffix opt
LAMMPS (29 Jul 2011)
Reading data file …
orthogonal box = (33.0201 33.0201 33.0201) to (86.9799 86.9799 86.9799)
1 by 1 by 1 processor grid
10950 atoms
10950 velocities
Replicating atoms …
orthogonal box = (33.0201 33.0201 33.0201) to (194.899 194.899 194.899)
1 by 1 by 1 processor grid
295650 atoms
ERROR: Invalid pair style

Hi Axel,

I've read through the docs a little (Thanks for the links). I can see
that launching jobs should be fairly simple, e.g. ./lmp_machine <
input.script, but I keep getting ERROR: Invalid pair style on the
example GPU jobs.

that means that you didn't compile in the GPU package.
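
for reference, and only as a sketch (the Makefile name under lib/gpu depends on your compiler and CUDA setup, so pick or edit one that matches your machine), the usual sequence is to build the gpu library first and then install the package before recompiling:

cd lib/gpu
make -f Makefile.linux     # or whichever lib/gpu Makefile fits your CUDA install
cd ../../src
make yes-gpu
make g++3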

I've built the lmp binary with make yes-opt so it should be loading
the opt pair styles..

opt is different from GPU.

also, if you run without mpirun, you will have to adapt
the in.xxxx scripts, as they are set up for a machine with
two GPUs, which requires at least two copies of lammps to
run in parallel (but you can try 4 or 6 to maximize GPU
utilization).
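
purely as a sketch of that adaptation (based on the fix line in your in.phosphate.gpu, not a prescription), a single-process run on a single GPU would address only device 0, e.g.

fix 0 all gpu force/neigh 0 0 1

instead of the "0 1 1" arguments that span GPUs 0 and 1.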

Any idea why this is happening? I've also tried launching the job
with ./lmp_machine -suffix opt < input.script

the GPU examples already have an explicit suffix.
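
i.e., the pair_style lines in those inputs already name the accelerated variants directly, for example

pair_style lj/cut/coul/long/gpu 15.0

so no -suffix/-sf switch is needed to select the gpu styles there.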

axel.

The GPU package is being compiled in:

[root@… src]# make yes-gpu
Installing package gpu
[root@… src]# make yes-opt
Installing package opt
[root@… src]# make g++3
make[1]: Entering directory `/root/lammps-29Jul11/src/Obj_g++3'
mpic++ -g -O -DLAMMPS_GZIP -DFFT_NONE -I/cm/shared/apps/mpich2_smpd/ge/gcc/64/1.3.2p1/include -DFFT_FFTW -c force.cpp
g++ -g -O -L../../lib/gpu -L/cm/shared/apps/mpich2_smpd/ge/gcc/64/1.3.2p1/lib -L/cm/shared/apps/fftw/open64/64/3.2.2/include -L/cm/shared/apps/cuda40/toolkit/4.0.17/lib64 angle_charmm.o angle_cosine.o angle_cosine_delta.o angle_cosine_periodic.o angle_cosine_squared.o angle.o angle_harmonic.o angle_hybrid.o angle_table.o atom.o atom_vec_angle.o atom_vec_atomic.o atom_vec_bond.o atom_vec_charge.o atom_vec.o atom_vec_ellipsoid.o atom_vec_full.o atom_vec_hybrid.o atom_vec_molecular.o atom_vec_sphere.o bond.o bond_fene.o bond_fene_expand.o bond_harmonic.o bond_hybrid.o bond_morse.o bond_nonlinear.o bond_quartic.o bond_table.o change_box.o comm.o compute_angle_local.o compute_atom_molecule.o compute_bond_local.o compute_centro_atom.o compute_cluster_atom.o compute_cna_atom.o compute_com.o compute_com_molecule.o compute_coord_atom.o compute.o compute_dihedral_local.o compute_displace_atom.o compute_erotate_sphere.o compute_group_group.o compute_gyration.o compute_gyration_molecule.o compute_heat_flux.o compute_improper_local.o compute_ke_atom.o compute_ke.o compute_msd.o compute_msd_molecule.o compute_pair.o compute_pair_local.o compute_pe_atom.o compute_pe.o compute_pressure.o compute_property_atom.o compute_property_local.o compute_property_molecule.o compute_rdf.o compute_reduce.o compute_reduce_region.o compute_slice.o compute_stress_atom.o compute_temp_com.o compute_temp.o compute_temp_deform.o compute_temp_partial.o compute_temp_profile.o compute_temp_ramp.o compute_temp_region.o compute_temp_sphere.o compute_ti.o create_atoms.o create_box.o delete_atoms.o delete_bonds.o dihedral_charmm.o dihedral.o dihedral_harmonic.o dihedral_helix.o dihedral_hybrid.o dihedral_multi_harmonic.o dihedral_opls.o displace_atoms.o displace_box.o domain.o dump_atom.o dump_cfg.o dump.o dump_custom.o dump_dcd.o dump_image.o dump_local.o dump_xyz.o error.o finish.o fix_adapt.o fix_addforce.o fix_ave_atom.o fix_ave_correlate.o fix_aveforce.o fix_ave_histo.o fix_ave_spatial.o fix_ave_time.o fix_bond_break.o fix_bond_create.o fix_bond_swap.o fix_box_relax.o fix.o fix_deform.o fix_deposit.o fix_drag.o fix_dt_reset.o fix_efield.o fix_enforce2d.o fix_evaporate.o fix_external.o fix_gpu.o fix_gravity.o fix_heat.o fix_indent.o fix_langevin.o fix_lineforce.o fix_minimize.o fix_momentum.o fix_move.o fix_nh.o fix_nh_sphere.o fix_nph.o fix_nph_sphere.o fix_npt.o fix_npt_sphere.o fix_nve.o fix_nve_limit.o fix_nve_noforce.o fix_nve_sphere.o fix_nvt.o fix_nvt_sllod.o fix_nvt_sphere.o fix_orient_fcc.o fix_planeforce.o fix_press_berendsen.o fix_print.o fix_qeq_comb.o fix_read_restart.o fix_recenter.o fix_respa.o fix_rigid.o fix_rigid_nve.o fix_rigid_nvt.o fix_setforce.o fix_shake.o fix_shear_history.o fix_spring.o fix_spring_rg.o fix_spring_self.o fix_store_force.o fix_store_state.o fix_temp_berendsen.o fix_temp_rescale.o fix_thermal_conductivity.o fix_tmd.o fix_ttm.o fix_viscosity.o fix_viscous.o fix_wall.o fix_wall_harmonic.o fix_wall_lj126.o fix_wall_lj93.o fix_wall_reflect.o fix_wall_region.o force.o group.o improper.o improper_cvff.o improper_harmonic.o improper_hybrid.o improper_umbrella.o input.o integrate.o irregular.o kspace.o lammps.o lattice.o library.o main.o math_extra.o memory.o min_cg.o min.o min_fire.o min_hftn.o minimize.o min_linesearch.o min_quickmin.o min_sd.o modify.o neigh_bond.o neighbor.o neigh_derive.o neigh_full.o neigh_gran.o neigh_half_bin.o neigh_half_multi.o neigh_half_nsq.o neigh_list.o neigh_request.o neigh_respa.o neigh_stencil.o output.o pack.o pair_adp.o pair_airebo.o pair_born.o 
pair_buck_coul_cut.o pair_buck.o pair_comb.o pair_coul_cut.o pair_coul_debye.o pair.o pair_dpd.o pair_dpd_tstat.o pair_eam_alloy.o pair_eam_alloy_opt.o pair_eam.o pair_eam_fs.o pair_eam_fs_opt.o pair_eam_opt.o pair_eim.o pair_gauss.o pair_hbond_dreiding_lj.o pair_hbond_dreiding_morse.o pair_hybrid.o pair_hybrid_overlay.o pair_lj96_cut.o pair_lj96_cut_gpu.o pair_lj_charmm_coul_charmm.o pair_lj_charmm_coul_charmm_implicit.o pair_lj_cut_coul_cut.o pair_lj_cut_coul_cut_gpu.o pair_lj_cut_coul_debye.o pair_lj_cut.o pair_lj_cut_gpu.o pair_lj_cut_opt.o pair_lj_cut_tgpu.o pair_lj_expand.o pair_lj_expand_gpu.o pair_lj_gromacs_coul_gromacs.o pair_lj_gromacs.o pair_lj_smooth.o pair_morse.o pair_morse_gpu.o pair_morse_opt.o pair_omp_gpu.o pair_rebo.o pair_soft.o pair_sw.o pair_table.o pair_tersoff.o pair_tersoff_zbl.o pair_yukawa.o random_mars.o random_park.o read_data.o read_restart.o region_block.o region_cone.o region.o region_cylinder.o region_intersect.o region_plane.o region_prism.o region_sphere.o region_union.o replicate.o respa.o run.o set.o special.o thermo.o timer.o universe.o update.o variable.o velocity.o verlet.o write_restart.o -lgpu -lmpich -lmpl -lcudart -lcuda -o ../lmp_g++3
size ../lmp_g++3
   text data bss dec hex filename
11736477 8528 11008 11756013 b361ed ../lmp_g++3
make[1]: Leaving directory `/root/lammps-29Jul11/src/Obj_g++3'

I actually got it to work with the in.melt_2.5.gpu job...

[root@… gpu]# ./lmp_g++3 -sf opt < in.melt_2.5.gpu
LAMMPS (29 Jul 2011)
Lattice spacing in x,y,z = 1.6796 1.6796 1.6796
Created orthogonal box = (0 0 0) to (67.1838 67.1838 67.1838)
  1 by 1 by 1 processor grid
Created 256000 atoms

--------------------------------------------------------------------------
- Using GPGPU acceleration for lj/cut:
- with 1 proc(s) per device.
--------------------------------------------------------------------------
GPU 0: Tesla M2090, 512 cores, 5.2/5.2 GB, 1.3 GHZ (Single Precision)
GPU 1: Tesla M2090, 512 cores, 5.2/5.2 GB, 1.3 GHZ (Single Precision)
--------------------------------------------------------------------------

Initializing GPU and compiling on process 0...Done.
Initializing GPUs 0-1 on core 0...Done.

Setting up run ...
Memory usage per processor = 46.8381 Mbytes
Step Temp E_pair E_mol TotEng Press
       0 1.44 -6.7733673 0 -4.6133757 -5.0196752
     100 0.7586562 -5.7603257 0 -4.6223459 0.19586131
     200 0.75643086 -5.757286 0 -4.6226441 0.22641303
     300 0.74927399 -5.7463983 0 -4.6224918 0.29738074
     400 0.74049347 -5.7329249 0 -4.6221891 0.377668
     500 0.73092402 -5.7182663 0 -4.6218846 0.46897715
     600 0.72317032 -5.7063315 0 -4.6215803 0.53474975
     700 0.71587262 -5.6950583 0 -4.6212536 0.59599934
     800 0.71088656 -5.6872327 0 -4.620907 0.64815743
     900 0.7059702 -5.679696 0 -4.6207449 0.68778549

It does not work with the styles in the in.phosphate.gpu or in.rhodo.gpu job files though.

yes, because those use gpu accelerated versions of
styles from the KSPACE package, which will only be
compiled in if KSPACE is installed as well (since
they depend on its functionality).
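
installing it follows the same pattern as the other packages (from the src directory, before recompiling):

make yes-kspace
make g++3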

you may have to change those scripts for the kspace_style,
since the latest update for the FFT libraries now ties
the precision of the pppm/gpu style to the floating-point
precision of the FFT library, and thus the /single and /double
flags have been removed.
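
concretely, as a sketch of that change with the current patchlevel, the two commented-out lines

#kspace_style pppm/gpu/single 1e-5
#kspace_style pppm/gpu/double 1e-5

would collapse into a single

kspace_style pppm/gpu 1e-5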

i also recommend compiling the gpu library with mixed
precision, since that is a more realistic scenario
when running with LAMMPS (the CPU version uses
double precision throughout, and the code is meant
to be used for very large systems, where coordinates
in single precision would lead to serious errors in the
physics of the model due to floating point truncation errors).
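
as a sketch only (the exact variable name and Makefile layout under lib/gpu may differ in your tree, so check the one you actually build with), mixed precision is selected when building the gpu library, e.g. by setting

CUDA_PRECISION = -D_SINGLE_DOUBLE

in the lib/gpu Makefile and rebuilding the library before relinking LAMMPS.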

axel.