Errors while compiling with gpu package

I don't think it's that. I am using nvcc and mpicc (open64) where they apply. Also, I am not using the flag -fno-rtti during compilation, as Axel suggested.

  Any other thoughts?

  Luis

If Christian's answer is not the problem (it would be bad to compile any LAMMPS library that uses MPI with a different MPI than LAMMPS itself), then maybe Mike has a suggestion (CCd).

Steve

Hi,
Could it be that you compiled the gpu lib and the main program with different MPI versions (i.e. different compiler wrappers)? The GPU lib also uses an MPI wrapper as its compiler.
Christian
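One way to check Christian's point is to compare the MPI wrapper LAMMPS itself was built with against the one recorded in the gpu-lib Makefile. A minimal sketch of the comparison, with both paths hard-coded as hypothetical examples (on a real system, take them from your src/MAKE Makefile and lib/gpu Makefile, or print them with `mpicxx -show` under MPICH or `mpicxx --showme` under Open MPI):

```shell
# Sketch: flag a mismatch between the MPI wrapper used for LAMMPS and the
# one used for the gpu lib.  Both paths below are hard-coded, hypothetical
# examples; substitute the values from your own Makefiles.
lammps_mpi="/usr/local/openmpi/bin/mpicxx"
gpulib_mpi="/opt/mpich2/bin/mpicxx"

if [ "$lammps_mpi" = "$gpulib_mpi" ]; then
    echo "gpu lib and LAMMPS use the same MPI wrapper"
else
    echo "mismatch: gpu lib and LAMMPS were built with different MPI installs"
fi
```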
-------- Original Message --------

Date: Fri, 17 Jun 2011 12:38:56 -0300 (BRT)
From: "Luis Goncalves" <[email protected]>
To: [email protected]
Subject: [lammps-users] Errors while compiling with gpu package

Hello developers,

I am experiencing a linking error while compiling LAMMPS for Linux. Here is part of the message:

../../lib/gpu/libgpu.a(pair_gpu_device.o):(.rodata+0x4a0): undefined reference to `typeinfo for MPI::Intracomm'
../../lib/gpu/libgpu.a(pair_gpu_device.o):(.rodata+0x4c0): undefined reference to `typeinfo for MPI::Intracomm'
../../lib/gpu/libgpu.a(pair_gpu_device.o):(.rodata+0x510): undefined reference to `typeinfo for MPI::Comm'

(and other similar messages)
I have the packages kspace and gpu installed. Before that, I had compiled the gpu library with no errors. However, the executable nvc_get_devices returned error 134 with the message:

terminate called after throwing an instance of 'cudaError'
Aborted
My GPU is a GT 240 (compute capability 1.2), and I installed the 64-bit CUDA toolkit 4.0.17.
What could be the cause of those errors?
Best regards,
Luis
------------------------------------------------------------------------------

EditLive Enterprise is the world's most technically advanced content
authoring tool. Experience the power of Track Changes, Inline Image
Editing and ensure content is compliant with Accessibility Checking.
http://p.sf.net/sfu/ephox-dev2dev

_______________________________________________
lammps-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/lammps-users

I don't think it's that. I am using nvcc and mpicc (open64) where they apply. Also, I am not using the flag -fno-rtti during compilation, as Axel suggested.

but that is exactly my point. it looks like you
were _not_ using the -fno-rtti flag when compiling
the gpu library, but it was used when your
MPI library was compiled. it may also mean
that you are including the wrong mpi.h file,
e.g. the one from the STUBS directory (which
doesn't enforce C bindings on MPI library calls).

axel.
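In practical terms, Axel's fix means adding the flag to the host-compiler options in the gpu-lib Makefile. A sketch of the relevant lines, with illustrative variable names (check your actual lib/gpu Makefile for the exact names used in your LAMMPS version):

```make
# lib/gpu Makefile -- variable names here are illustrative and may differ
# in your Makefile; the point is that the C++ host compiler used for the
# gpu lib gets -fno-rtti, matching how the MPI library was built.
CUDR_CPP  = mpic++
CUDR_OPTS = -O2 -fno-rtti
```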

OK, I added the flag -fno-rtti and succeeded with the compilation.

I tested the executable with this fix

fix gpuConf all gpu force 0 0 -1

and kspace_style pppm/gpu/single. Running on 6 processors, I got the following message:

Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 4
Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 1
Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 3
Cuda driver error 100 in call at file 'geryon/nvd_device.h' in line 207.
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 5

  Btw, the nvc_get_devices still has the same problem.
  Any ideas?

  Thanks already!
  Luis

does the nvidia kernel module and driver match the minimum requirement of the cuda toolkit that you used to compile the gpu library?

what output does this command produce: nvidia-smi -a

axel.
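For context: driver error 100 is CUDA_ERROR_NO_DEVICE in the CUDA driver API, i.e. no usable CUDA device was found, which is consistent with a driver that is too old for the toolkit. A minimal sketch of the version comparison, with both version strings hard-coded for illustration (read the installed version from the `nvidia-smi -a` output; the 270.41 minimum for toolkit 4.0 is an assumption, so check the toolkit's release notes):

```shell
# Sketch: compare an installed NVIDIA driver version against the minimum a
# CUDA toolkit needs.  Both values are hard-coded examples; on a real
# system take the installed version from `nvidia-smi -a` and the minimum
# from the toolkit release notes.
installed="260.19.36"
required="270.41"

# sort -V orders version strings numerically; the last line is the newest.
newest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | tail -n1)
if [ "$newest" = "$installed" ]; then
    echo "driver $installed meets the $required minimum"
else
    echo "driver $installed is older than the required $required"
fi
```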