Why energy/pressure are slighly off, what is possible solution for making them equal. Any suggestion/comment?
clear
package gpu 1 neigh no binsize 5.0 newton off
units metal
boundary p p p
atom_style atomic
variable lp equal "2.86"
lattice bcc ${lp} orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
region whole block 0 1 0 1 0 1 units lattice
create_box 1 whole
create_atoms 1 region whole
replicate 30 30 30
mass 1 1
# ---------- Define Interatomic Potential ---------------------
pair_style lj/cut 2.5
pair_coeff * * 1 1
# ---------- Run Minimization ---------------------
reset_timestep 0
run 0
log.cpu
units metal
boundary p p p
atom_style atomic
variable lp equal "2.86"
lattice bcc ${lp} orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
lattice bcc 2.86 orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
Lattice spacing in x,y,z = 2.86 2.86 2.86
region whole block 0 1 0 1 0 1 units lattice
create_box 1 whole
Created orthogonal box = (0 0 0) to (2.86 2.86 2.86)
1 by 1 by 2 MPI processor grid
create_atoms 1 region whole
Created 2 atoms
using lattice units in orthogonal box = (0 0 0) to (2.86 2.86 2.86)
create_atoms CPU = 0.000 seconds
replicate 30 30 30
Replication is creating a 30x30x30 = 27000 times larger system...
orthogonal box = (0 0 0) to (85.8 85.8 85.8)
1 by 1 by 2 MPI processor grid
54000 atoms
replicate CPU = 0.001 seconds
mass 1 1
# ---------- Define Interatomic Potential ---------------------
pair_style lj/cut 2.5
pair_coeff * * 1 1
# ---------- Run Minimization ---------------------
reset_timestep 0
run 0
WARNING: No fixes with time integration, atoms won't move (../verlet.cpp:60)
Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
update: every = 1 steps, delay = 0 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 4.5
ghost atom cutoff = 4.5
binsize = 2.25, bins = 39 39 39
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut, perpetual
attributes: half, newton on
pair build: half/bin/atomonly/newton
stencil: half/bin/3d
bin: standard
Per MPI rank memory allocation (min/avg/max) = 10.22 | 10.22 | 10.22 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -3726.0493 0 -3726.0493 -18820.611
Loop time of 9.615e-07 on 2 procs for 0 steps with 54000 atoms
0.0% CPU use with 2 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0 | 0 | 0 | 0.0 | 0.00
Neigh | 0 | 0 | 0 | 0.0 | 0.00
Comm | 0 | 0 | 0 | 0.0 | 0.00
Output | 0 | 0 | 0 | 0.0 | 0.00
Modify | 0 | 0 | 0 | 0.0 | 0.00
Other | | 9.615e-07 | | |100.00
Nlocal: 27000 ave 27000 max 27000 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Nghost: 14566 ave 14566 max 14566 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Neighs: 351000 ave 351000 max 351000 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Total # of neighbors = 702000
Ave neighs/atom = 13
Neighbor list builds = 0
Dangerous builds = 0
log.gpu
LAMMPS (2 Aug 2023)
Lattice spacing in x,y,z = 2.86 2.86 2.86
Created orthogonal box = (0 0 0) to (2.86 2.86 2.86)
1 by 1 by 2 MPI processor grid
Created 2 atoms
using lattice units in orthogonal box = (0 0 0) to (2.86 2.86 2.86)
create_atoms CPU = 0.000 seconds
Replication is creating a 30x30x30 = 27000 times larger system...
orthogonal box = (0 0 0) to (85.8 85.8 85.8)
1 by 1 by 2 MPI processor grid
54000 atoms
replicate CPU = 0.002 seconds
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
Your simulation uses code contributions which should be cited:
- GPU package (short-range, long-range and three-body potentials): doi:10.1016/j.cpc.2010.12.021, doi:10.1016/j.cpc.2011.10.012, doi:10.1016/j.cpc.2013.08.002, doi:10.1016/j.commatsci.2014.10.068, doi:10.1016/j.cpc.2016.10.020, doi:10.3233/APC200086
The log file lists these citations in BibTeX format.
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
WARNING: No fixes with time integration, atoms won't move (../verlet.cpp:60)
--------------------------------------------------------------------------
- Using acceleration for lj/cut:
- with 2 proc(s) per device.
- Horizontal vector operations: ENABLED
- Shared memory system: No
--------------------------------------------------------------------------
Device 0: NVIDIA GeForce RTX 3090, 82 CUs, 17/24 GB, 1.7 GHZ (Mixed Precision)
--------------------------------------------------------------------------
Initializing Device and compiling on process 0...Done.
Initializing Device 0 on core 0...Done.
Initializing Device 0 on core 1...Done.
Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
update: every = 1 steps, delay = 0 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 4.5
ghost atom cutoff = 4.5
binsize = 2.25, bins = 39 39 39
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut/gpu, perpetual
attributes: full, newton off
pair build: full/bin/atomonly
stencil: full/bin/3d
bin: standard
Setting up Verlet run ...
Unit style : metal
Current step : 0
Time step : 0.001
Per MPI rank memory allocation (min/avg/max) = 13.68 | 13.68 | 13.68 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -3726.0488 0 -3726.0488 -18820.609
Loop time of 1.1825e-06 on 2 procs for 0 steps with 54000 atoms
0.0% CPU use with 2 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0 | 0 | 0 | 0.0 | 0.00
Neigh | 0 | 0 | 0 | 0.0 | 0.00
Comm | 0 | 0 | 0 | 0.0 | 0.00
Output | 0 | 0 | 0 | 0.0 | 0.00
Modify | 0 | 0 | 0 | 0.0 | 0.00
Other | | 1.183e-06 | | |100.00
Nlocal: 27000 ave 27000 max 27000 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Nghost: 14566 ave 14566 max 14566 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Neighs: 0 ave 0 max 0 min
Histogram: 2 0 0 0 0 0 0 0 0 0
FullNghs: 702000 ave 702000 max 702000 min
Histogram: 2 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1404000
Ave neighs/atom = 26
Neighbor list builds = 0
Dangerous builds = 0
---------------------------------------------------------------------
Device Time Info (average):
---------------------------------------------------------------------
Average split: 1.0000.
Lanes / atom: 4.
Vector width: 32.
Prefetch mode: None.
Max Mem / Proc: 7.71 MB.
CPU Cast/Pack: 0.0002 s.
CPU Driver_Time: 0.0000 s.
CPU Idle_Time: 0.0001 s.
---------------------------------------------------------------------
Total wall time: 0:00:01
output of ./lmp_gpu -h
Large-scale Atomic/Molecular Massively Parallel Simulator - 2 Aug 2023
Usage example: ./lmp_gpu -var t 300 -echo screen -in in.alloy
List of command line options supported by this LAMMPS executable:
-echo none/screen/log/both : echoing of input script (-e)
-help : print this help message (-h)
-in none/filename : read input from file or stdin (default) (-i)
-kokkos on/off ... : turn KOKKOS mode on or off (-k)
-log none/filename : where to send log output (-l)
-mdi '<mdi flags>' : pass flags to the MolSSI Driver Interface
-mpicolor color : which exe in a multi-exe mpirun cmd (-m)
-cite : select citation reminder style (-c)
-nocite : disable citation reminder (-nc)
-nonbuf : disable screen/logfile buffering (-nb)
-package style ... : invoke package command (-pk)
-partition size1 size2 ... : assign partition sizes (-p)
-plog basename : basename for partition logs (-pl)
-pscreen basename : basename for partition screens (-ps)
-restart2data rfile dfile ... : convert restart to data file (-r2data)
-restart2dump rfile dgroup dstyle dfile ...
: convert restart to dump file (-r2dump)
-reorder topology-specs : processor reordering (-r)
-screen none/filename : where to send screen output (-sc)
-skiprun : skip loops in run and minimize (-sr)
-suffix gpu/intel/opt/omp : style suffix to apply (-sf)
-var varname value : set index style variable (-v)
OS: Linux "Ubuntu 22.04.2 LTS" 5.15.0-86-generic x86_64
Compiler: GNU C++ 11.4.0 with OpenMP not enabled
C++ standard: C++11
MPI v3.1: Intel(R) MPI Library 2021.6 for Linux* OS
Accelerator configuration:
GPU package API:
GPU package precision: mixed
Compatible GPU present: yes
Active compile time flags:
-DLAMMPS_GZIP
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint): 32-bit
sizeof(bigint): 64-bit
Available compression formats:
Extension: .gz Command: gzip
Extension: .bz2 Command: bzip2
Extension: .zst Command: zstd
Extension: .xz Command: xz
Extension: .lzma Command: xz
Extension: .lz4 Command: lz4
Installed packages:
GPU MANYBODY MEAM REPLICA