LAMMPS GPU utilization

I have been running GPULAMMPS (with USER-CUDA) on a test system of ~65k particles for a simple LJ fluid.

I have run the same simulation on both a GTX470 and a Tesla C2070 and noticed significant performance differences between the two.

If I run the nvidia-smi -a command whilst the simulation is running, I notice that the utilization of the GTX470 is about 86% whereas the Tesla is about 43%.

Are there any potential reasons why the Tesla's utilization should be about half that of the GTX470, and additionally why neither card reaches full (100%) GPU utilization?

I have attached a copy of my input file below.

Thanks,
Michael

Christian (CC'd) may be able to offer some help with this.

Paul

> I have been running GPULAMMPS (with USER-CUDA) on a test system of ~65k
> particles for a simple LJ fluid.
>
> I have run the same simulation on both a GTX470 and a Tesla C2070 and
> noticed significant performance differences between the two.
>
> If I run the nvidia-smi -a command whilst the simulation is running, I notice that the
> utilization of the GTX470 is about 86% whereas the Tesla is about 43%.
>
> Are there any potential reasons why the Tesla's utilization should be about
> half that of the GTX470, and additionally why there is not full

yes. lower memory bandwidth. particularly if the C2070 has ECC
enabled. MD is _very_ memory bandwidth dependent. please have
a look at the benchmarks for HOOMD-blue, a single GPU MD code
written from scratch and specifically optimized for CUDA:

http://codeblue.umich.edu/hoomd-blue/benchmarks.html
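A rough back-of-the-envelope sketch of the bandwidth point above (the spec-sheet figures and the ~25% ECC overhead are approximate assumptions, not measured values):

```python
# Why ECC can make a bandwidth-bound MD kernel slower on the "bigger" card.
# Bandwidth numbers are approximate spec-sheet values (assumptions).
gtx470_bw = 133.9   # GB/s, GTX 470 peak memory bandwidth
c2070_bw = 144.0    # GB/s, Tesla C2070 peak memory bandwidth (ECC off)
ecc_overhead = 0.25 # rough rule of thumb: ECC costs ~20-25% effective bandwidth

c2070_ecc_bw = c2070_bw * (1.0 - ecc_overhead)
print(f"C2070 effective bandwidth with ECC: {c2070_ecc_bw:.1f} GB/s")
print(f"relative to GTX 470: {c2070_ecc_bw / gtx470_bw:.2f}x")
```

With ECC on, the C2070's effective bandwidth drops well below the GTX470's, so a memory-bandwidth-bound pair kernel can run noticeably slower despite the Tesla's higher raw specification.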

> (100%) GPU utilization for either.

to achieve 100% GPU utilization with classical MD is _very_ hard.
certain parts of the calculations have enough parallelism to occupy
the GPU well, others don't. you also have occasional synchronization
points, which will interfere with full utilization. for 100% utilization,
you effectively need an embarrassingly parallel problem running in an
infinite loop across several thousand threads.
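a toy model of the argument above (the fractions are illustrative assumptions, not measurements from this run):

```python
# Toy model: per timestep, part of the work saturates the GPU and the rest
# is low-parallelism work or host/device synchronization. The average
# utilization is a weighted mix, so it can never reach 100%.
kernel_fraction = 0.85  # fraction of step time in well-parallelized kernels (assumed)
kernel_util = 0.95      # GPU utilization while those kernels run (assumed)
other_util = 0.10       # utilization during sync / low-parallelism phases (assumed)

avg_util = kernel_fraction * kernel_util + (1 - kernel_fraction) * other_util
print(f"average utilization: {avg_util:.0%}")
```

even with optimistic numbers, the serial and synchronization phases pull the average utilization reported by nvidia-smi well below 100%.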

cheers,
     axel.