Run LAMMPS in parallel with CUDA acceleration

Thanks Axel and Steve, and sorry for the delayed reply. I have solved my problem by reading the documentation for the package command, which I had overlooked at first.

I am using 4 GPUs (Tesla C1060), and the system contains about 20k particles.

While reading the package command documentation for the gpu option, I could not understand the following statement:

"As an example, if you have two GPUs per node and 8 CPU cores per node, and would like to run on 4 nodes (32 cores) with dynamic balancing of force calculation across CPU and GPU cores, you could specify"
"package gpu force/neigh 0 1 -1"

My question is where the "-1" comes from (package gpu force/neigh 0 1 -1). According to the syntax of the package command, this argument is the fraction of particles assigned to the GPU. Can it be negative? If I have 8 nodes, with 4 GPUs and 16 CPU cores per node, what should this value be?

Best wishes,

Xiaohui

From the top of the doc page:

{balance} value = split
split = fraction of work to offload to coprocessor, -1 for dynamic

So -1 means dynamic balancing, i.e. LAMMPS picks the split factor.
If you use 0 < split < 1, then you are setting the balance explicitly,
not dynamically.

Steve
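
Applied to the 8-node case asked about above, the package line might look like the sketch below. It assumes the arguments are mode, first GPU ID, last GPU ID, and split, as the quoted example suggests, and that the 4 GPUs on each node are numbered 0 to 3. Neither assumption is confirmed in this thread.

```
# Sketch only: argument order (mode, first GPU ID, last GPU ID, split) is
# assumed from the quoted doc example; GPU IDs 0-3 per node are also assumed.
# 4 GPUs per node, dynamic CPU/GPU balancing (split = -1)
package gpu force/neigh 0 3 -1
```

If dynamic balancing is wanted, the split value stays at -1 whatever the node, GPU, or core count. The number of CPU cores used is set by the number of MPI tasks at launch, not by this command.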