LAMMPS installation on cluster

Hi all

I installed OpenMPI, FFTW, and LAMMPS locally on a cluster. When I try to run a simulation I see the following lines in my log.lammps. As you can see from the first lines of the output below, the LAMMPS build itself completed correctly. The same input file runs correctly on my PC. Would you please help me?

t.o fix_wall_region.o force.o group.o improper.o improper_cvff.o improper_harmonic.o improper_hybrid.o input.o integrate.o kspace.o lammps.o lattice.o library.o main.o memory.o min_cg.o min.o min_hftn.o minimize.o min_linesearch.o min_sd.o modify.o neigh_bond.o neighbor.o neigh_derive.o neigh_full.o neigh_gran.o neigh_half_bin.o neigh_half_multi.o neigh_half_nsq.o neigh_list.o neigh_request.o neigh_respa.o neigh_stencil.o output.o pack.o pair_airebo.o pair_born_coul_long.o pair_buck_coul_cut.o pair_buck_coul_long.o pair_buck.o pair_comb.o pair_coul_cut.o pair_coul_debye.o pair_coul_long.o pair.o pair_dpd.o pair_dpd_tstat.o pair_eam_alloy.o pair_eam.o pair_eam_fs.o pair_eim.o pair_hybrid.o pair_hybrid_overlay.o pair_lj96_cut.o pair_lj_charmm_coul_charmm.o pair_lj_charmm_coul_charmm_implicit.o pair_lj_charmm_coul_long.o pair_lj_cut_coul_cut.o pair_lj_cut_coul_debye.o pair_lj_cut_coul_long.o pair_lj_cut_coul_long_tip4p.o pair_lj_cut.o pair_lj_expand.o pair_lj_gromacs_coul_gromacs.o pair_lj_gromacs.o pair_lj_smooth.o pair_morse.o pair_soft.o pair_sw.o pair_table.o pair_tersoff.o pair_tersoff_zbl.o pair_yukawa.o pppm.o pppm_tip4p.o random_mars.o random_park.o read_data.o read_restart.o region_block.o region_cone.o region.o region_cylinder.o region_intersect.o region_plane.o region_prism.o region_sphere.o region_union.o remap.o remap_wrap.o replicate.o respa.o run.o set.o shell.o special.o temper.o thermo.o timer.o universe.o update.o variable.o velocity.o verlet.o write_restart.o -lpthread -lfftw -lstdc++ -o …/lmp_linux
size …/lmp_linux
text data bss dec hex filename
2702180 2972 8432 2713584 2967f0 …/lmp_linux
make[1]: Leaving directory `/home/elena/lammps-10Sep10/src/Obj_linux'

[elena@…2728… ~] cd 05NHW
[elena@…2728… 05NHW] mpirun -np 4 /home/elena/lammps-10Sep10/src/lmp_linux <in
libibverbs: Fatal: couldn't read uverbs ABI version.

The initial errors are not from LAMMPS; they come from your
system. LAMMPS is giving NaNs, which could be lots of things,
most likely bad input. Can you run any of the many examples that
are part of the LAMMPS distro? If so, then you can be confident
the problem is your script, not the LAMMPS build.
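
For example (the paths here assume the 10Sep10 tarball layout and that the melt example is present in your copy; adjust to where you unpacked it), one of the bundled inputs can be run like this:

  cd /home/elena/lammps-10Sep10/examples/melt
  mpirun -np 4 ../../src/lmp_linux < in.melt

If that produces sensible thermo output, the build is fine and the problem is in your own input.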

Steve

elena,

Hi all

I installed OpenMPI, FFTW, and LAMMPS locally on a cluster. When I try to
run a simulation I see the following lines in my log.lammps. As you can see in

[...]

[elena@…2728… ~] cd 05NHW [elena@…2728… 05NHW] mpirun -np 4 /home/elena/lammps-10Sep10/src/lmp_linux

if you go through the pains of installing LAMMPS, you should at least
install the current version, not one from last fall. there have been
lots of improvements and bugfixes since.

<in
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,2]: OpenIB on host cluster was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: uDAPL on host cluster was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.

this is an issue of the underlying MPI. most likely you
have your MPI configured for use with infiniband, but you
are testing on a node that doesn't have an infiniband card.

try using: mpirun --mca btl sm,self -np 4 ...
instead. that will work fine for a local test.
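
with your own command line that would be something like:

  mpirun --mca btl sm,self -np 4 /home/elena/lammps-10Sep10/src/lmp_linux < in

(those flags restrict openmpi to the shared-memory and self transports, so it stops probing for the infiniband/uDAPL hardware that your node doesn't have.)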

--------------------------------------------------------------------------
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,0]: OpenIB on host cluster was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,0]: uDAPL on host cluster was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,1]: OpenIB on host cluster was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: uDAPL on host cluster was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,3]: OpenIB on host cluster was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: uDAPL on host cluster was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
LAMMPS (10 Sep 2010)
Scanning data file ...
2 = max bonds/atom
1 = max angles/atom
1 = max dihedrals/atom
Reading data file ...
orthogonal box = (-5 -5 -5) to (65 65 110)
1 by 2 by 2 processor grid
21000 atoms
15000 bonds
9000 angles
3000 dihedrals
Finding 1-2 1-3 1-4 neighbors ...
2 = max # of 1-2 neighbors
2 = max # of 1-3 neighbors
3 = max # of 1-4 neighbors
5 = max # of special neighbors
WARNING: Resetting reneighboring criteria during minimization
PPPM initialization ...
G vector = 0.244699
grid = 30 30 45
stencil order = 5
RMS precision = 7.61047e-05
brick FFT buffer size/proc = 19600 10800 8820
Setting up minimization ...
Memory usage per processor = 12.8591 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 nan 14365.027 nan nan

Thanks a lot

since you are using kspace, have you checked whether
the FFTW2 you have installed is single precision or double
precision? lammps needs double precision.
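
a minimal sketch of what checking/rebuilding that looks like, assuming FFTW 2.1.5 built from source (the install prefix and makefile name below are just examples; adjust to your setup):

  # build fftw 2.1.5 in double precision (the default; do NOT configure with --enable-float)
  ./configure --prefix=$HOME/fftw2-double
  make && make install

  # then point the FFT settings in your lammps makefile (e.g. src/MAKE/Makefile.linux) at it:
  #   FFT_INC  = -DFFT_FFTW -I$(HOME)/fftw2-double/include
  #   FFT_PATH = -L$(HOME)/fftw2-double/lib
  #   FFT_LIB  = -lfftw

and then recompile lammps from a clean state.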

cheers,
    axel.

Dear Steve

As I said, the program runs correctly on my PC, but on the cluster I get the following errors:

Thanks

elena

Hello, Elena:

You potentially have two separate problems:

(1) The input file that you have created is doing something wrong.
(2) You have not determined that LAMMPS is correctly built on the cluster.

Problem (1) is definitely occurring; otherwise, you wouldn't have gotten "nan" values. Working with your own input file on the cluster, however, you can't really tell whether the problem is (1) or (2). As Steve suggested, try running the provided example inputs. That will tell you whether the error is in how you built LAMMPS or in your script.

–AEI

Dear Prof. Ahmed E. Ismail

As I said in my previous email, the program runs correctly on my PC (therefore there is no error in my input script and data file), but on the supercomputer I get the following error:

[elena@…2728… ~] cd 05NHW
[elena@…2728… 05NHW] mpirun -np 4 /home/elena/lammps-10Sep10/src/lmp_linux <in
libibverbs: Fatal: couldn't read uverbs ABI version.

You are not listening to what any of us are saying.

You may have built the code correctly on your PC, but have an incorrect build on your cluster. You have not demonstrated otherwise. You also do not have a correct script, so you can’t determine if the problem on the cluster is caused by your script or by your build.

If people tell you what you need to do, but you won’t listen, why are you asking for our help?

—AEI

elena,

you are discounting the fact that your
error may be marginal, i.e. that it may
depend on the number of processors
and on how the individual compiler
optimizes the code.
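
a quick way to check that is to rerun the same input with different processor counts and compare the first thermo steps, e.g. (reusing your paths from above):

  for n in 1 2 4; do
    mpirun --mca btl sm,self -np $n /home/elena/lammps-10Sep10/src/lmp_linux < in > log.$n
  done

if the run only blows up for some processor counts, that is a strong hint that your input puts the system very close to an unstable configuration.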

axel.