I used
mpiexec -n 8 lmp -in
to run on 8 cores. My CPU has 16 threads, but LAMMPS is using only one thread. How can I use all threads?
Do you mean it only uses one thread per core, i.e. there are 8 threads in total? If so this is expected. Basically the formula is like this:
[# of total threads] = [# of MPI processes] * [# of threads per process]
[# of threads per process] is usually 1 by default in LAMMPS, so that explains your observation. To “use all threads”, namely increase [# of total threads] to 16, you have two approaches:
- Use 16 MPI processes, e.g.
  mpiexec -n 16
- If your input script allows it, use an accelerator package (INTEL, KOKKOS, or OPENMP) that supports OpenMP multithreading, so that [# of threads per process] can be larger than 1. See section 7.4, "Accelerator packages", of the LAMMPS documentation for details. For example, you may want to try the INTEL package with this:
  mpiexec -n 8 lmp -sf intel -pk intel 0 omp 2
  With omp 2 you are specifying 2 OpenMP threads per process, so [# of total threads] is 8*2=16. In contrast,
  mpiexec -n 8 lmp -sf intel -pk intel 0 omp 1
  would invoke only 1 thread per process.
However, please note that not all programs benefit from hyperthreading, and it is quite possible that using all 16 threads will slow down your simulation. You should benchmark your specific system to see whether the extra threads give a performance gain.
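The thread arithmetic above can be sketched in Python. This is only an illustration and not part of LAMMPS: the function names thread_layouts and lammps_command are hypothetical, the input file name in.melt is an assumption, and the -sf intel -pk intel 0 omp N flags are copied from the example command in this reply. The script only prints candidate command lines to benchmark; it does not run them.

```python
def thread_layouts(total_threads):
    """Return every (mpi_processes, threads_per_process) pair whose product is total_threads."""
    return [(n, total_threads // n)
            for n in range(1, total_threads + 1)
            if total_threads % n == 0]


def lammps_command(mpi_processes, threads_per_process, input_file="in.melt"):
    """Build a command line in the form quoted in this thread.

    The input file name is an assumption made for illustration.
    """
    return (f"mpiexec -n {mpi_processes} lmp -sf intel "
            f"-pk intel 0 omp {threads_per_process} -in {input_file}")


if __name__ == "__main__":
    # 16 hardware threads, as in the question
    for n_mpi, n_omp in thread_layouts(16):
        print(f"{n_mpi:2d} x {n_omp:2d} = {n_mpi * n_omp}: "
              f"{lammps_command(n_mpi, n_omp)}")
```

Each printed line is one way to reach 16 total threads. Which one is fastest, and whether any of them beats 8 x 1, depends on the system and has to be measured.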
- What platform are you running on?
- How do you determine that only one thread is used?
- What is the output of lmp -h?
It's Windows.
In the log file it is written
The output of lmp -h says that a compatible GPU is present.
That is not the information I am looking for.
Actually, I am trying to reduce the run time. Which information are you looking for?
As I wrote, the output of lmp -h.
Large-scale Atomic/Molecular Massively Parallel Simulator - 23 Jun 2022 - Update 4
Git info (stable / patch_23Jun2022_update4)
Usage example: C:\Users\Admin\AppData\Local\LAMMPS 64-bit 23Jun2022-MPI\bin\lmp.exe -var t 300 -echo screen -in in.alloy
List of command line options supported by this LAMMPS executable:
-echo none/screen/log/both : echoing of input script (-e)
-help : print this help message (-h)
-in none/filename : read input from file or stdin (default) (-i)
-kokkos on/off ... : turn KOKKOS mode on or off (-k)
-log none/filename : where to send log output (-l)
-mdi '<mdi flags>' : pass flags to the MolSSI Driver Interface
-mpicolor color : which exe in a multi-exe mpirun cmd (-m)
-cite : select citation reminder style (-c)
-nocite : disable citation reminder (-nc)
-nonbuf : disable screen/logfile buffering (-nb)
-package style ... : invoke package command (-pk)
-partition size1 size2 ... : assign partition sizes (-p)
-plog basename : basename for partition logs (-pl)
-pscreen basename : basename for partition screens (-ps)
-restart2data rfile dfile ... : convert restart to data file (-r2data)
-restart2dump rfile dgroup dstyle dfile ...
: convert restart to dump file (-r2dump)
-reorder topology-specs : processor reordering (-r)
-screen none/filename : where to send screen output (-sc)
-skiprun : skip loops in run and minimize (-sr)
-suffix gpu/intel/opt/omp : style suffix to apply (-sf)
-var varname value : set index style variable (-v)
OS: Windows 10 22H2, Windows ABI 6.2 (9200) on x86_64
Compiler: MinGW-w64 64bit 9.0 / GNU C++ 11.2.1 20211019 (Fedora MinGW 11.2.1-6.fc36) with OpenMP 4.5
C++ standard: C++14
MPI v2.2: MPICH
Accelerator configuration:
GPU package API: OpenCL
GPU package precision: mixed
KOKKOS package API: OpenMP Serial
KOKKOS package precision: double
OPENMP package API: OpenMP
OPENMP package precision: double
INTEL package API: OpenMP
INTEL package precision: single mixed double
Compatible GPU present: yes
Active compile time flags:
-DLAMMPS_GZIP
-DLAMMPS_PNG
-DLAMMPS_JPEG
-DLAMMPS_FFMPEG
-DLAMMPS_EXCEPTIONS
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint): 32-bit
sizeof(bigint): 64-bit
Available compression formats:
Extension: .gz Command: gzip
Installed packages:
ASPHERE ATC AWPMD BOCS BODY BPM BROWNIAN CG-DNA CG-SDK CLASS2 COLLOID COLVARS
COMPRESS CORESHELL DIELECTRIC DIFFRACTION DIPOLE DPD-BASIC DPD-MESO DPD-REACT
DPD-SMOOTH DRUDE EFF ELECTRODE EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX
EXTRA-MOLECULE EXTRA-PAIR FEP GPU GRANULAR INTEL INTERLAYER KOKKOS KSPACE
LATTE MACHDYN MANIFOLD MANYBODY MC MDI MEAM MESONT MGPT MISC ML-HDNNP ML-IAP
ML-RANN ML-SNAP MOFFF MOLECULE MOLFILE OPENMP OPT ORIENT PERI PHONON PLUGIN
PLUMED POEMS PTM QEQ QTB REACTION REAXFF REPLICA RIGID SHOCK SMTBQ SPH SPIN
SRD TALLY UEF VORONOI YAFF
List of individual style options included in this LAMMPS executable
* Atom styles:
angle angle/kk atomic atomic/kk body
bond bond/kk bpm/sphere charge charge/kk
dielectric dipole dpd dpd/kk edpd
electron ellipsoid full full/kk hybrid
hybrid/kk line mdpd mesont molecular
molecular/kk oxdna peri smd sph
sphere sphere/kk spin spin/kk tdpd
template tri wavepacket
* Integrate styles:
respa respa/omp verlet verlet/kk verlet/lrt/intel
verlet/split
* Minimize styles:
cg cg/kk fire fire/old hftn
quickmin sd spin spin/cg spin/lbfgs
* Pair styles:
adp adp/kk adp/omp agni agni/omp
airebo airebo/intel airebo/morse airebo/morse/intel
airebo/morse/omp airebo/omp atm awpmd/cut
beck beck/gpu beck/omp body/nparticle
body/rounded/polygon body/rounded/polyhedron bop
born born/coul/dsf born/coul/dsf/cs born/coul/long
born/coul/long/cs born/coul/long/cs/gpu
born/coul/long/gpu born/coul/long/omp born/coul/msm
born/coul/msm/omp born/coul/wolf born/coul/wolf/cs
born/coul/wolf/cs/gpu born/coul/wolf/gpu
born/coul/wolf/omp born/gpu born/omp bpm/spring
brownian brownian/omp brownian/poly brownian/poly/omp
buck buck6d/coul/gauss/dsf buck6d/coul/gauss/long
buck/coul/cut buck/coul/cut/gpu buck/coul/cut/intel
buck/coul/cut/kk buck/coul/cut/omp buck/coul/long
buck/coul/long/cs buck/coul/long/gpu
buck/coul/long/intel buck/coul/long/kk
buck/coul/long/omp buck/coul/msm buck/coul/msm/omp
buck/gpu buck/intel buck/kk buck/long/coul/long
buck/long/coul/long/omp buck/mdf buck/omp colloid
colloid/gpu colloid/omp comb comb3 comb/omp
cosine/squared coul/cut coul/cut/dielectric coul/cut/global
coul/cut/global/omp coul/cut/gpu coul/cut/kk coul/cut/omp
coul/cut/soft coul/cut/soft/omp coul/debye coul/debye/gpu
coul/debye/kk coul/debye/omp coul/diel coul/diel/omp coul/dsf
coul/dsf/gpu coul/dsf/kk coul/dsf/omp coul/exclude coul/long
coul/long/cs coul/long/cs/gpu coul/long/dielectric
coul/long/gpu coul/long/kk coul/long/omp coul/long/soft
coul/long/soft/omp coul/msm coul/msm/omp coul/shield
coul/slater/cut coul/slater/long coul/streitz coul/tt
coul/wolf coul/wolf/cs coul/wolf/kk coul/wolf/omp reax
dpd dpd/ext dpd/ext/kk dpd/ext/omp dpd/ext/tstat
dpd/ext/tstat/kk dpd/ext/tstat/omp dpd/fdt
dpd/fdt/energy dpd/fdt/energy/kk dpd/gpu dpd/intel
dpd/kk dpd/omp dpd/tstat dpd/tstat/gpu dpd/tstat/kk
dpd/tstat/omp drip dsmc e3b eam
eam/alloy eam/alloy/gpu eam/alloy/intel eam/alloy/kk eam/alloy/omp
eam/alloy/opt eam/cd eam/cd/old eam/fs eam/fs/gpu
eam/fs/intel eam/fs/kk eam/fs/omp eam/fs/opt eam/gpu
eam/he eam/intel eam/kk eam/omp eam/opt
edip edip/multi edip/omp edpd eff/cut
eim eim/omp exp6/rx exp6/rx/kk extep
gauss gauss/cut gauss/cut/omp gauss/gpu gauss/omp
gayberne gayberne/gpu gayberne/intel gayberne/omp
gran/hertz/history gran/hertz/history/omp gran/hooke
gran/hooke/history gran/hooke/history/kk
gran/hooke/history/omp gran/hooke/omp granular gw
gw/zbl harmonic/cut harmonic/cut/omp
hbond/dreiding/lj hbond/dreiding/lj/omp
hbond/dreiding/morse hbond/dreiding/morse/omp hdnnp
hybrid hybrid/kk hybrid/overlay hybrid/overlay/kk
hybrid/scaled ilp/graphene/hbn ilp/graphene/hbn/opt
ilp/tmd ilp/tmd/opt kolmogorov/crespi/full
kolmogorov/crespi/z lcbop lebedeva/z lennard/mdf
line/lj list lj96/cut lj96/cut/gpu lj96/cut/omp
lj/charmm/coul/charmm lj/charmm/coul/charmm/gpu
lj/charmm/coul/charmm/implicit lj/charmm/coul/charmm/implicit/kk
lj/charmm/coul/charmm/implicit/omp lj/charmm/coul/charmm/intel
lj/charmm/coul/charmm/kk lj/charmm/coul/charmm/omp
lj/charmm/coul/long lj/charmm/coul/long/gpu
lj/charmm/coul/long/intel lj/charmm/coul/long/kk
lj/charmm/coul/long/omp lj/charmm/coul/long/opt
lj/charmm/coul/long/soft lj/charmm/coul/long/soft/omp
lj/charmm/coul/msm lj/charmm/coul/msm/omp
lj/charmmfsw/coul/charmmfsh lj/charmmfsw/coul/long lj/class2
lj/class2/coul/cut lj/class2/coul/cut/kk
lj/class2/coul/cut/omp lj/class2/coul/cut/soft
lj/class2/coul/long lj/class2/coul/long/cs
lj/class2/coul/long/gpu lj/class2/coul/long/kk
lj/class2/coul/long/omp lj/class2/coul/long/soft lj/class2/gpu
lj/class2/kk lj/class2/omp lj/class2/soft lj/cubic lj/cubic/gpu
lj/cubic/omp lj/cut lj/cut/coul/cut lj/cut/coul/cut/dielectric
lj/cut/coul/cut/dielectric/omp lj/cut/coul/cut/gpu
lj/cut/coul/cut/kk lj/cut/coul/cut/omp
lj/cut/coul/cut/soft lj/cut/coul/cut/soft/omp
lj/cut/coul/debye lj/cut/coul/debye/dielectric
lj/cut/coul/debye/dielectric/omp lj/cut/coul/debye/gpu
lj/cut/coul/debye/kk lj/cut/coul/debye/omp lj/cut/coul/dsf
lj/cut/coul/dsf/gpu lj/cut/coul/dsf/kk
lj/cut/coul/dsf/omp lj/cut/coul/long
lj/cut/coul/long/cs lj/cut/coul/long/dielectric
lj/cut/coul/long/dielectric/omp lj/cut/coul/long/gpu
lj/cut/coul/long/intel lj/cut/coul/long/kk
lj/cut/coul/long/omp lj/cut/coul/long/opt
lj/cut/coul/long/soft lj/cut/coul/long/soft/omp lj/cut/coul/msm
lj/cut/coul/msm/dielectric lj/cut/coul/msm/gpu
lj/cut/coul/msm/omp lj/cut/coul/wolf
lj/cut/coul/wolf/omp lj/cut/dipole/cut
lj/cut/dipole/cut/gpu lj/cut/dipole/cut/omp
lj/cut/dipole/long lj/cut/dipole/long/gpu lj/cut/gpu
lj/cut/intel lj/cut/kk lj/cut/omp lj/cut/opt lj/cut/soft
lj/cut/soft/omp lj/cut/thole/long lj/cut/thole/long/omp
lj/cut/tip4p/cut lj/cut/tip4p/cut/omp
lj/cut/tip4p/long lj/cut/tip4p/long/gpu
lj/cut/tip4p/long/omp lj/cut/tip4p/long/opt
lj/cut/tip4p/long/soft lj/cut/tip4p/long/soft/omp lj/expand
lj/expand/coul/long lj/expand/coul/long/gpu lj/expand/gpu
lj/expand/kk lj/expand/omp lj/gromacs lj/gromacs/coul/gromacs
lj/gromacs/coul/gromacs/kk lj/gromacs/coul/gromacs/omp lj/gromacs/gpu
lj/gromacs/kk lj/gromacs/omp lj/long/coul/long
lj/long/coul/long/dielectric lj/long/coul/long/intel
lj/long/coul/long/omp lj/long/coul/long/opt
lj/long/dipole/long lj/long/tip4p/long
lj/long/tip4p/long/omp lj/mdf lj/relres lj/relres/omp
lj/sdk lj/sdk/coul/long lj/sdk/coul/long/gpu
lj/sdk/coul/long/omp lj/sdk/coul/msm lj/sdk/coul/msm/omp
lj/sdk/gpu lj/sdk/kk lj/sdk/omp lj/sf/dipole/sf
lj/sf/dipole/sf/gpu lj/sf/dipole/sf/omp lj/smooth
lj/smooth/gpu lj/smooth/linear lj/sf
lj/smooth/linear/omp lj/sf/omp lj/smooth/omp
lj/switch3/coulgauss/long local/density lubricate lubricateU
lubricateU/poly lubricate/omp lubricate/poly lubricate/poly/omp
mdpd mdpd/rhosum meam meam/c meam/spline
meam/spline/omp meam/sw/spline mesocnt mesont/tpm mgpt
mie/cut mie/cut/gpu mliap mm3/switch3/coulgauss/long
momb morse morse/gpu morse/kk morse/omp
morse/opt morse/smooth/linear morse/smooth/linear/omp
morse/soft multi/lucy multi/lucy/rx multi/lucy/rx/kk
nb3b/harmonic nm/cut nm/cut/coul/cut nm/cut/coul/cut/omp
nm/cut/coul/long nm/cut/coul/long/omp nm/cut/omp
nm/cut/split oxdna2/coaxstk oxdna2/dh oxdna2/excv oxdna/coaxstk
oxrna2/coaxstk oxdna/excv oxdna/hbond oxdna2/hbond oxdna/stk
oxdna2/stk oxdna/xstk oxdna2/xstk oxrna2/dh oxrna2/excv
oxrna2/hbond oxrna2/stk oxrna2/xstk peri/eps peri/lps
peri/lps/omp peri/pmb peri/pmb/omp peri/ves polymorphic
rann reaxff reax/c reaxff/kk reax/c/kk
reaxff/omp reax/c/omp rebo rebo/intel rebo/omp
resquared resquared/gpu resquared/omp saip/metal saip/metal/opt
sdpd/taitwater/isothermal smatb smatb/single smd/hertz
smd/tlsph smd/tri_surface smd/ulsph smtbq snap
snap/kk soft soft/gpu soft/omp
sph/heatconduction sph/idealgas sph/lj sph/rhosum
sph/taitwater sph/taitwater/morris spin/dipole/cut spin/dipole/long
spin/dmi spin/exchange spin/exchange/biquadratic spin/magelec
spin/neel srp sw sw/angle/table sw/gpu
sw/intel sw/kk sw/mod sw/mod/omp sw/omp
table table/gpu table/kk table/omp table/rx
table/rx/kk tdpd tersoff tersoff/gpu tersoff/kk
tersoff/mod tersoff/mod/c tersoff/mod/c/omp tersoff/mod/gpu
tersoff/mod/kk tersoff/mod/omp tersoff/omp tersoff/table
tersoff/table/omp tersoff/zbl tersoff/zbl/gpu tersoff/zbl/kk
tersoff/zbl/omp thole threebody/table tip4p/cut tip4p/cut/omp
tip4p/long tip4p/long/omp tip4p/long/soft tip4p/long/soft/omp
tracker tri/lj ufm ufm/gpu ufm/omp
ufm/opt vashishta vashishta/gpu vashishta/kk vashishta/omp
vashishta/table vashishta/table/omp wf/cut yukawa
yukawa/colloid yukawa/colloid/gpu yukawa/colloid/omp
yukawa/gpu yukawa/kk yukawa/omp zbl zbl/gpu
zbl/kk zbl/omp zero
* Bond styles:
bpm/rotational bpm/spring class2 class2/kk class2/omp
fene fene/expand fene/expand/omp fene/intel fene/kk
fene/nm fene/omp gaussian gromos gromos/omp
harmonic harmonic/intel harmonic/kk harmonic/omp harmonic/shift
harmonic/shift/cut harmonic/shift/cut/omp
harmonic/shift/omp hybrid mm3 morse
morse/omp nonlinear nonlinear/omp oxdna2/fene oxdna/fene
oxrna2/fene quartic quartic/omp special table
table/omp zero
* Angle styles:
charmm charmm/intel charmm/kk charmm/omp class2
class2/kk class2/omp class2/p6 cosine cosine/buck6d
cosine/delta cosine/delta/omp cosine/kk cosine/omp
cosine/periodic cosine/periodic/omp cosine/shift cosine/shift/exp
cosine/shift/exp/omp cosine/shift/omp cosine/squared
cosine/squared/omp cross dipole dipole/omp
fourier fourier/omp fourier/simple fourier/simple/omp
gaussian harmonic harmonic/intel harmonic/kk harmonic/omp
hybrid mm3 quartic quartic/omp sdk
sdk/omp table table/omp zero
* Dihedral styles:
charmm charmm/intel charmm/kk charmm/omp charmmfsw
class2 class2/kk class2/omp cosine/shift/exp
cosine/shift/exp/omp fourier fourier/intel fourier/omp
harmonic harmonic/intel harmonic/kk harmonic/omp helix
helix/omp hybrid multi/harmonic multi/harmonic/omp
nharmonic nharmonic/omp opls opls/intel opls/kk
opls/omp quadratic quadratic/omp spherical table
table/cut table/omp zero
* Improper styles:
class2 class2/kk class2/omp cossq cossq/omp
cvff cvff/intel cvff/omp distance distharm
fourier fourier/omp harmonic harmonic/intel harmonic/kk
harmonic/omp hybrid inversion/harmonic ring
ring/omp sqdistharm umbrella umbrella/omp zero
* KSpace styles:
ewald ewald/dipole ewald/dipole/spin ewald/disp
ewald/disp/dipole ewald/electrode ewald/omp msm
msm/cg msm/cg/omp msm/dielectric msm/omp pppm
pppm/cg pppm/cg/omp pppm/dielectric pppm/dipole pppm/dipole/spin
pppm/disp pppm/disp/dielectric pppm/disp/intel pppm/disp/omp
pppm/disp/tip4p pppm/disp/tip4p/omp pppm/electrode
pppm/electrode/intel pppm/gpu pppm/intel pppm/kk
pppm/omp pppm/stagger pppm/tip4p pppm/tip4p/omp
* Fix styles
accelerate/cos acks2/reax acks2/reaxff acks2/reaxff/kk acks2/reax/kk
adapt adapt/fep addforce addtorque append/atoms
atc atom/swap ave/atom ave/chunk ave/correlate
ave/correlate/long ave/histo ave/histo/weight
ave/time aveforce balance bocs bond/break
bond/create bond/create/angle bond/react bond/swap
box/relax brownian brownian/asphere brownian/sphere
charge/regulation cmap colvars controller
damping/cundall deform deform/kk deposit ave/spatial
ave/spatial/sphere lb/pc lb/rigid/pc/sphere
client/md dpd/energy dpd/energy/kk drag drude
drude/transform/direct drude/transform/inverse dt/reset
edpd/source efield ehex electrode/conp
electrode/conp/intel electrode/conq electrode/conq/intel
electrode/thermo electrode/thermo/intel
electron/stopping electron/stopping/fit enforce2d
enforce2d/kk eos/cv eos/table eos/table/rx eos/table/rx/kk
evaporate external ffl filter/corotate flow/gauss
freeze freeze/kk gcmc gld gle
gravity gravity/kk gravity/omp grem halt
heat hyper/global hyper/local imd indent
ipi langevin langevin/drude langevin/eff langevin/kk
langevin/spin latte lineforce manifoldforce mdi/aimd
meso/move mol/swap momentum momentum/chunk momentum/kk
move msst mvv/dpd mvv/edpd mvv/tdpd
neb neb/spin nph nph/asphere nph/asphere/omp
nph/body nph/eff nph/kk nph/omp nph/sphere
nph/sphere/omp nphug npt npt/asphere npt/asphere/omp
npt/body npt/cauchy npt/eff npt/gpu npt/intel
npt/kk npt/omp npt/sphere npt/sphere/omp npt/uef
numdiff numdiff/virial nve nve/asphere nve/asphere/gpu
nve/asphere/intel nve/asphere/noforce nve/awpmd
nve/body nve/bpm/sphere nve/dot nve/dotc/langevin
nve/eff nve/gpu nve/intel nve/kk nve/limit
nve/line nve/manifold/rattle nve/noforce nve/omp
nve/sphere nve/sphere/kk nve/sphere/omp nve/spin nve/tri
nvk nvt nvt/asphere nvt/asphere/omp nvt/body
nvt/eff nvt/gpu nvt/intel nvt/kk
nvt/manifold/rattle nvt/omp nvt/sllod nvt/sllod/eff
nvt/sllod/intel nvt/sllod/kk nvt/sllod/omp nvt/sphere nvt/sphere/omp
nvt/uef oneway orient/bcc orient/eco orient/fcc
pafi phonon pimd planeforce plumed
poems polarize/bem/gmres polarize/bem/icc
polarize/functional pour precession/spin press/berendsen
print propel/self property/atom property/atom/kk
qbmsst qeq/comb qeq/comb/omp qeq/dynamic qeq/fire
qeq/point qeq/reaxff qeq/reax qeq/reaxff/kk qeq/reax/kk
qeq/reaxff/omp qeq/reax/omp qeq/shielded qeq/slater qtb
rattle reaxff/bonds reax/c/bonds reaxff/bonds/kk reax/c/bonds/kk
reaxff/species reax/c/species reaxff/species/kk
reax/c/species/kk recenter restrain rhok
rigid rigid/meso rigid/nph rigid/nph/omp rigid/nph/small
rigid/npt rigid/npt/omp rigid/npt/small rigid/nve rigid/nve/omp
rigid/nve/small rigid/nvt rigid/nvt/omp rigid/nvt/small rigid/omp
rigid/small rigid/small/omp rx rx/kk saed/vtk
setforce setforce/kk setforce/spin shake shake/kk
shardlow shardlow/kk smd smd/adjust_dt
smd/integrate_tlsph smd/integrate_ulsph
smd/move_tri_surf smd/setvel smd/wall_surface
sph sph/stationary spring spring/chunk spring/rg
spring/self srd store/force store/state tdpd/source
temp/berendsen temp/csld temp/csvr temp/rescale temp/rescale/eff
tfmc tgnpt/drude tgnvt/drude thermal/conductivity
ti/spring tmd ttm ttm/grid ttm/mod
tune/kspace vector viscosity viscous viscous/sphere
wall/body/polygon wall/body/polyhedron wall/colloid
wall/ees wall/gran wall/gran/region wall/harmonic
wall/lj1043 wall/lj126 wall/lj93 wall/lj93/kk wall/morse
wall/piston wall/reflect wall/reflect/kk wall/reflect/stochastic
wall/region wall/region/ees wall/srd widom
* Compute styles:
ackland/atom adf aggregate/atom angle angle/local
angmom/chunk ave/sphere/atom ave/sphere/atom/kk basal/atom
body/local bond bond/local born/matrix centro/atom
centroid/stress/atom chunk/atom chunk/spread/atom
cluster/atom cna/atom cnp/atom com com/chunk
contact/atom coord/atom coord/atom/kk damage/atom dihedral
dihedral/local dilatation/atom dipole dipole/chunk displace/atom
dpd dpd/atom edpd/temp/atom efield/atom entropy/atom
erotate/asphere erotate/rigid erotate/sphere erotate/sphere/atom
event/displace fabric fep fep/ta force/tally
fragment/atom global/atom group/group gyration gyration/chunk
gyration/shape gyration/shape/chunk heat/flux heat/flux/tally
heat/flux/virial/tally hexorder/atom hma improper
improper/local inertia/chunk ke ke/atom ke/atom/eff
ke/eff ke/rigid mesont mliap momentum
msd msd/chunk msd/nongauss nbond/atom omega/chunk
orientorder/atom orientorder/atom/kk pair
pair/local pe pe/atom pe/mol/tally pe/tally
plasticity/atom pressure pressure/uef property/atom property/chunk
property/local ptm/atom rdf reduce reduce/chunk
reduce/region rigid/local saed slice
smd/contact/radius smd/damage smd/hourglass/error
smd/internal/energy smd/plastic/strain
smd/plastic/strain/rate smd/rho smd/tlsph/defgrad
smd/tlsph/dt smd/tlsph/num/neighs smd/tlsph/shape smd/tlsph/strain
smd/tlsph/strain/rate smd/tlsph/stress
smd/triangle/vertices smd/ulsph/effm smd/ulsph/num/neighs
smd/ulsph/strain smd/ulsph/strain/rate smd/ulsph/stress
smd/vol sna/atom snad/atom snap snav/atom
sph/e/atom sph/rho/atom sph/t/atom spin stress/atom
stress/cartesian stress/cylinder pressure/cylinder
stress/mop stress/mop/profile stress/spherical
stress/tally tdpd/cc/atom temp temp/asphere temp/body
temp/chunk temp/com temp/cs temp/deform temp/deform/eff
temp/deform/kk temp/drude temp/eff temp/kk temp/partial
temp/profile temp/ramp temp/region temp/region/eff temp/rotate
temp/sphere temp/uef ti torque/chunk vacf
vcm/chunk viscosity/cos voronoi/atom xrd
* Region styles:
block block/kk cone cylinder ellipsoid
intersect plane prism sphere union
* Dump styles:
atom atom/gz atom/zstd cfg cfg/gz
cfg/uef cfg/zstd custom custom/gz custom/zstd
dcd image local local/gz local/zstd
molfile movie xtc xyz xyz/gz
xyz/zstd yaml
* Command styles
balance change_box create_atoms create_bonds create_box
delete_atoms delete_bonds reset_ids kim_init kim_interactions
kim_param kim_property kim_query message server
displace_atoms dynamical_matrix dynamical_matrix/kk
group2ndx hyper info mdi minimize
ndx2group neb neb/spin plugin prd
read_data read_dump read_restart replicate rerun
reset_atom_ids reset_mol_ids run set tad
temper temper/grem temper/npt third_order third_order/kk
velocity write_coeff write_data write_dump write_restart
Ok. That means your executable is parallel computing capable.
Now, I need to see the log file created with the quoted mpiexec command line for one of the examples bundled with LAMMPS, e.g. the in.lj example in the bench folder or in.melt in the examples/melt folder.
Should I run it with this:
mpiexec -n 8 lmp -sf intel -pk intel 0 omp 2
Doesn’t matter. I just need to know the exact command line used that created the log file.
LAMMPS (23 Jun 2022 - Update 4)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task
package intel 1
package intel 0 omp 2
# 3d Lennard-Jones melt
units lj
atom_style atomic
lattice fcc 0.8442
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
region box block 0 10 0 10 0 10
create_box 1 box
Created orthogonal box = (0 0 0) to (16.795962 16.795962 16.795962)
2 by 2 by 2 MPI processor grid
create_atoms 1 box
Created 4000 atoms
using lattice units in orthogonal box = (0 0 0) to (16.795962 16.795962 16.795962)
create_atoms CPU = 0.001 seconds
mass 1 1.0
velocity all create 3.0 87287 loop geom
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0 2.5
neighbor 0.3 bin
neigh_modify every 20 delay 0 check no
fix 1 all nve
#dump id all atom 50 dump.melt
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
thermo 50
run 250
----------------------------------------------------------
Using INTEL Package without Coprocessor.
Compiler: MinGW-w64 64bit 9.0 / GNU C++ 11.2.1 20211019 (Fedora MinGW 11.2.1-6.fc36)
SIMD compiler directives: Disabled
Precision: mixed
----------------------------------------------------------
Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 2.8
ghost atom cutoff = 2.8
binsize = 1.4, bins = 12 12 12
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut/intel, perpetual
attributes: half, newton on, intel
pair build: half/bin/newton/intel
stencil: half/bin/3d
bin: intel
Per MPI rank memory allocation (min/avg/max) = 7.951 | 7.951 | 7.951 Mbytes
Step Temp E_pair E_mol TotEng Press
0 3 -6.7733683 0 -2.2744933 -3.7033504
50 1.6842867 -4.8082494 0 -2.2824509 5.5666134
100 1.671258 -4.7875608 0 -2.2813005 5.6613921
150 1.644473 -4.7470997 0 -2.2810069 5.8614345
200 1.6471534 -4.7508991 0 -2.2807866 5.880556
250 1.6645782 -4.777449 0 -2.281206 5.7525584
Loop time of 0.0666262 on 16 procs for 250 steps with 4000 atoms
Performance: 1620982.721 tau/day, 3752.275 timesteps/s
67.4% CPU use with 8 MPI tasks x 2 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.030592 | 0.031034 | 0.031567 | 0.2 | 46.58
Neigh | 0.005158 | 0.0056151 | 0.006386 | 0.5 | 8.43
Comm | 0.023531 | 0.024699 | 0.025673 | 0.4 | 37.07
Output | 0.000124 | 0.0001265 | 0.000135 | 0.0 | 0.19
Modify | 0.004195 | 0.0042853 | 0.00442 | 0.1 | 6.43
Other | | 0.0008665 | | | 1.30
Nlocal: 500 ave 510 max 492 min
Histogram: 1 2 0 1 0 1 1 0 1 1
Nghost: 1818.75 ave 1838 max 1805 min
Histogram: 1 0 1 4 0 0 1 0 0 1
Neighs: 18973.5 ave 19725 max 18329 min
Histogram: 1 2 0 0 2 0 1 1 0 1
Total # of neighbors = 151788
Ave neighs/atom = 37.947
Neighbor list builds = 12
Dangerous builds not checked
Total wall time: 0:00:00
This shows that the run is using all requested processor cores.
However, the 67% CPU utilization suggests that other tasks outside of LAMMPS are occupying the CPU. So it is probably faster not to use threads on top of MPI, i.e. to use omp 1 instead of omp 2. With 2 threads per MPI task, the CPU use should be close to 200%; with 1 thread per MPI task, it should be close to 100%. If it is still significantly lower after reducing the number of threads, then something else is consuming significant CPU resources. You cannot run on more resources than are available.
This clearly contradicts your original statement that LAMMPS is not using the processors. It does, and all of them.
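The check described in this reply can be sketched as a small Python helper that reads the "CPU use" line of a LAMMPS log and compares it with roughly 100% per OpenMP thread. This is an illustration under stated assumptions: the function names parse_cpu_use and utilization_fraction are hypothetical, and the regular expression is written only against the log line format quoted in this thread.

```python
import re


def parse_cpu_use(log_text):
    """Return (cpu_percent, mpi_tasks, omp_threads) from a LAMMPS log, or None if absent."""
    match = re.search(
        r"([\d.]+)% CPU use with (\d+) MPI tasks x (\d+) OpenMP threads",
        log_text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def utilization_fraction(cpu_percent, omp_threads):
    """Fraction of the ideal CPU use, taking about 100% per OpenMP thread as ideal."""
    return cpu_percent / (100.0 * omp_threads)


if __name__ == "__main__":
    # Line copied from the log posted in this thread
    sample = "67.4% CPU use with 8 MPI tasks x 2 OpenMP threads"
    cpu, tasks, threads = parse_cpu_use(sample)
    fraction = utilization_fraction(cpu, threads)
    print(f"{tasks} MPI tasks x {threads} threads, {cpu}% CPU use, "
          f"{fraction:.0%} of the ideal {100 * threads}%")
```

A fraction well below 1 on a benchmark run is a hint that the threads are competing with each other or with other programs for the same cores.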
Can I use the integrated GPU? If yes, then how?
What is the output of ocl_get_devices?
What pair style are you using?
`Found 1 platform(s).
Platform 0:
Device 0: “gfx90c”
Type of device: GPU
Supported OpenCL Version: 2.0
Is a subdevice: No
Double precision support: Yes
Total amount of global memory: 6.0481 GB
Number of compute units/multiprocessors: 8
Total amount of constant memory: 5063639040 bytes
Total amount of local/shared memory per block: 32768 bytes
Maximum group size (# of threads per block) 256
Maximum item sizes (# threads for each dim) 1024 x 1024 x 1024
Clock rate: 2 GHz
ECC support: No
Device fission into equal partitions: No
Device fission by counts: No
Device fission by affinity: No
Maximum subdevices from fission: 8
Shared memory system: Yes
Subgroup support: No
Shuffle support: No`
I am tired of having to ask the same questions multiple times. You don’t seem to appreciate that people here volunteer their time, so please make an effort yourself to minimize how much time is spent on your questions (several of them would have been superfluous had you properly studied the available documentation). Don’t expect any responses from me in the future unless you make an effort to follow common internet forum etiquette more closely.
Your GPU is compatible with some pair styles but very weak. Not sure if it is worth using it. You have already been pointed to the part of the manual discussing accelerators. Please study it carefully.
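Whether a given pair style has an accelerated variant can be read off the style list printed by lmp -h, where variants appear with a suffix such as /gpu or /omp. A minimal Python sketch of that lookup follows. The function name accelerated_variants and the constant SAMPLE_PAIR_STYLES are hypothetical, and the sample is only a small excerpt of the pair styles listed earlier in this thread.

```python
# A few pair styles copied from the lmp -h listing in this thread (excerpt only)
SAMPLE_PAIR_STYLES = (
    "lj/cut lj/cut/gpu lj/cut/intel lj/cut/kk lj/cut/omp lj/cut/opt "
    "eam/alloy eam/alloy/gpu eam/alloy/intel eam/alloy/kk eam/alloy/omp "
    "eam/alloy/opt reaxff reaxff/kk reaxff/omp"
).split()


def accelerated_variants(style, available_styles,
                         suffixes=("gpu", "intel", "kk", "omp", "opt")):
    """Return the suffixes for which 'style/suffix' appears in available_styles."""
    available = set(available_styles)
    return [suffix for suffix in suffixes if f"{style}/{suffix}" in available]


if __name__ == "__main__":
    for name in ("lj/cut", "eam/alloy", "reaxff"):
        print(name, accelerated_variants(name, SAMPLE_PAIR_STYLES))
```

A style without a /gpu entry cannot be moved to the GPU by the -suffix gpu option shown in the help output. A style with one still needs benchmarking, since a weak integrated GPU may not be faster than the CPU.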