help to fix the gpu problem

Dear LAMMPS Users:

I downloaded the latest version of the gpu package and compiled it
with "Makefile.fermi" to get the .a file.

But when I ran ./nvc_get_devices, I got the following error message:

That's the kind of error you get if you haven't launched
the MPI daemon, e.g. mpd with MPICH. But I don't know
why nvc_get_devices would require you to do that.

Maybe Mike Brown has a comment on this.

Steve

Thanks a lot. I used mpich2-1.2.1p1 and here are the commands to compile mpi:
./configure --prefix=/usr/local/mpich2/1.2.1p1 --enable-f77 \
  --enable-f90 --enable-cxx --enable-threads --with-PACKAGE=yes \
  --with-pm=mpd CC=icc CXX=icpc F77=ifort F90=ifort 2>&1 | tee c.txt
make 2>&1 | tee m.txt
make install 2>&1 | tee mi.txt
I added the following lines to .bashrc:
export PATH=/usr/local/mpich2/1.2.1p1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/mpich2/1.2.1p1/lib:$LD_LIBRARY_PATH

My Intel compilers are version 11.1/072. My graphics cards are three
GTX 480s. I compiled the gpu package with Makefile.fermi, in which I
set arch=20 and removed the -ffast-math option, which icpc does not
recognize.

So I am pretty sure that I installed mpich2 correctly. And lmp_linux
works fine without the gpu package.

Thanks again.

Hongyi

You can ignore the MPI error - the underlying GPU library, Geryon, will use either exit() or MPI_Abort() depending on the compile flags; these should be changed for nvc_get_devices to use exit() since it is serial.
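
A minimal sketch of the kind of compile-time switch described above. This is not Geryon's actual code: the flag name USE_MPI_ABORT and both function names are hypothetical, chosen only for illustration. Built without the flag, a serial tool such as nvc_get_devices would terminate through exit() and never touch MPI.

```cpp
#include <cstdlib>
#include <string>

#ifdef USE_MPI_ABORT   // hypothetical flag name, not Geryon's real macro
#include <mpi.h>
#endif

// Report which termination mechanism this build selected.
inline std::string abort_mechanism() {
#ifdef USE_MPI_ABORT
  return "MPI_Abort";
#else
  return "exit";
#endif
}

// Terminate the program after an unrecoverable GPU error.
[[noreturn]] inline void fatal_error(int code) {
#ifdef USE_MPI_ABORT
  // Parallel build: bring down every rank in the communicator.
  MPI_Abort(MPI_COMM_WORLD, code);
  std::abort();  // fallback in case MPI_Abort returns
#else
  // Serial build: a plain exit needs no MPI runtime or daemon.
  std::exit(code);
#endif
}
```

Compiling with -DUSE_MPI_ABORT (and an MPI compiler wrapper) would select the MPI path; leaving the flag out gives the serial behavior.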

The "unknown error" you are getting is unusual and probably hardware- or OS-related. Have you tried running any other CUDA code on the GPUs? I have seen this error, for example, when power is not connected correctly to the GPUs.

- Mike

Hongyi Liu wrote:

Thank you very much. I added both lib and lib64 to my environment
variables (I had not added the 32-bit lib before; maybe this is a new
tip) and now ./nvc_get_devices and the gpu package work very well. I
can now run the gpu examples.

In case it is useful, I still get three regular warnings and one
negligible warning:
icpc: command line warning #10006: ignoring unknown option '-ffast-math'
cudpp_mini/cudpp_maximal_launch.cpp(83): warning #68: integer
conversion resulted in a change of sign
              return -1;
                     ^

cudpp_mini/cudpp_maximal_launch.cpp(88): warning #68: integer
conversion resulted in a change of sign
              return -1;
                     ^

cudpp_mini/cudpp_maximal_launch.cpp(93): warning #68: integer
conversion resulted in a change of sign
      return -1;
             ^
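
For anyone wondering about warning #68: it typically appears when -1 is returned from a function whose return type is unsigned. The sketch below assumes the flagged functions return std::size_t, which I have not checked against the cudpp_mini source. The function names are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <limits>

// Returning -1 as an unsigned type: the value wraps around to the
// largest std::size_t. This implicit conversion is what the compiler
// flags as a "change of sign".
std::size_t implicit_sentinel() {
  return -1;
}

// Same value, but the conversion is spelled out, so the intent is clear.
std::size_t explicit_sentinel() {
  return static_cast<std::size_t>(-1);
}

// A caller can test for the sentinel by comparing against the maximum.
bool is_sentinel(std::size_t value) {
  return value == std::numeric_limits<std::size_t>::max();
}
```

Both functions return the same value, so if -1 is meant as an "invalid" marker, the warning is harmless as long as callers compare against that same value.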

Thanks again.

-HL

Although I can run the gpu example, I get four warnings when I
compile LAMMPS with the gpu package:
icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c force.cpp
pair_cg_cmm_coul_msm.h(42): warning #1125: function
"LAMMPS_NS::Pair::extract(char *, int &)" is hidden by
"LAMMPS_NS::PairCGCMMCoulMSM::extract" -- virtual function override
intended?
    void *extract(char *str);
          ^

icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c lammps.cpp
pair_cg_cmm_coul_msm.h(42): warning #1125: function
"LAMMPS_NS::Pair::extract(char *, int &)" is hidden by
"LAMMPS_NS::PairCGCMMCoulMSM::extract" -- virtual function override
intended?
    void *extract(char *str);
          ^

icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c modify.cpp
icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c pair_cg_cmm_coul_long_gpu.cpp
icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c pair_cg_cmm_coul_msm.cpp
pair_cg_cmm_coul_msm.h(42): warning #1125: function
"LAMMPS_NS::Pair::extract(char *, int &)" is hidden by
"LAMMPS_NS::PairCGCMMCoulMSM::extract" -- virtual function override
intended?
    void *extract(char *str);
          ^

icc -O -DLAMMPS_GZIP -I../../lib/atc -I../../lib/reax
-I../../lib/poems -I../../lib/meam -DMPICH_SKIP_MPICXX
-I/usr/local/mpich2/1.2.1p1/include -DFFT_FFTW2
-I/usr/local/fftw/2.1.5/include -c pair_cg_cmm_coul_msm_gpu.cpp
pair_cg_cmm_coul_msm.h(42): warning #1125: function
"LAMMPS_NS::Pair::extract(char *, int &)" is hidden by
"LAMMPS_NS::PairCGCMMCoulMSM::extract" -- virtual function override
intended?
    void *extract(char *str);
          ^
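
For reference, warning #1125 describes C++ name hiding: a derived class declares a function with the same name as a base-class virtual but a different parameter list, so it hides the base version instead of overriding it. A minimal sketch with hypothetical classes (Base and Derived stand in for Pair and PairCGCMMCoulMSM; the signatures are simplified):

```cpp
#include <string>

struct Base {
  virtual ~Base() = default;
  // Two-argument virtual, analogous to Pair::extract(char *, int &).
  virtual std::string extract(const char *, int &dim) {
    dim = 0;
    return "Base";
  }
};

struct Derived : Base {
  // Different parameter list: this hides Base::extract rather than
  // overriding it, which is the situation warning #1125 points at.
  std::string extract(const char *) { return "Derived"; }
};

// A call through a Base reference still reaches Base::extract,
// because Derived never overrode the two-argument version.
std::string call_through_base(Base &b) {
  int dim = -1;
  return b.extract("x", dim);
}
```

If an override is intended, the derived signature has to match the base one exactly; if both versions should stay visible, a `using Base::extract;` declaration in the derived class restores the hidden overload.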

Thanks again.

-HL

You can ignore anything with CoulMSM in it. It is not used.

axel.

Thanks a lot. Good to know this.