What is the performance of the Kremer-Grest (KG) model with a DPD thermostat when using the GPU package on Windows?

In your hybrid model, pair_style dpd/tstat and bond_style fene are not included in the GPU package, so their calculations were all done on the CPU. The only thing that ran on the GPU was lj/cut. That is why the pair time decreased, the bond time stayed roughly the same, and the neigh time increased: neighbor lists now need to be constructed on both the CPU and the GPU.

Changing to a faster GPU should not help much. The key is to reduce the time spent on dpd/tstat and fene bonds, as well as the neighbor list build time.
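For reference, an input sketch for a KG-style setup of this kind might look like the following. The specific cutoffs, coefficients, and seed are assumptions for illustration, not values from this thread:

```
# Sketch of a KG bead-spring melt with a DPD thermostat (assumed values)
package    gpu 1 neigh no      # one GPU, neighbor lists built on the host
suffix     gpu                 # use /gpu variants of supported styles

pair_style hybrid/overlay lj/cut 1.122462 dpd/tstat 1.0 1.0 1.122462 48279
pair_coeff * * lj/cut 1.0 1.0
pair_coeff * * dpd/tstat 4.5   # gamma for the DPD thermostat

bond_style fene
bond_coeff * 30.0 1.5 1.0 1.0  # standard KG FENE parameters
```

With this layout, the styles that have /gpu variants are accelerated, while bond/fene stays on the CPU as discussed above.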

Ray

Hi Yongjin,

to clarify, both lj/cut and dpd/tstat are supported by the GPU package; you can double-check by looking at the screen output. When you specify neigh no for the GPU package, the neighbor lists are built on the CPU and then copied to the GPU by the /gpu pair styles, so the increase in neigh time comes from time spent on the CPU. The /gpu pair forces and the bond forces (bond/fene in your case) are computed concurrently, the former on the GPU and the latter on the CPU.

For systems like yours, with short pair cutoffs, host-built neighbor lists, and many bonds, the speedup from the GPU library will usually be modest.

Can you try switching the newton flag for the bond forces to on, i.e. using

newton off on

in the input script with the GPU run, to see if there’s any difference in the neigh time?
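In context, the relevant lines might look like this (the exact placement within your script is an assumption; both commands must appear before the run command):

```
# Assumed placement in the input script, before the run command
package gpu 1 neigh no   # host-built neighbor lists, copied to the GPU
newton  off on           # pairwise newton off (for the GPU), bonded newton on
```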

Thanks for pointing out the mistake with the reference’s journal.

Steve and Axel, can you help me fix it in the next patch when you have a chance?

Best,
-Trung


fix is on the way:

https://github.com/lammps/lammps/pull/286/commits/230b29eae652474532947ec554db99ee2c2bf35d

axel.