Hi (I’m using lammps-23Jun2022),
I am studying the migration of a W self-interstitial atom (SIA) and a He light impurity atom (LIA) in a W grain boundary (GB). I started with the SIA migration and had no problem running the code. However, when I run the same configuration with a He atom instead, I get an error that I don’t understand. My input script, which is the same for the SIA and LIA cases, is the following:
# NEB simulation
units metal
atom_style atomic
atom_modify map array
boundary p p p
atom_modify sort 0 0.0
#------------------------------------------------------------------
# Define simulation box.
#------------------------------------------------------------------
read_data initial.lmp
#------------------------------------------------------------------
# Define Interatomic Potential
#------------------------------------------------------------------
mass 1 183.846
mass 2 4.003
mass 3 1.00784
pair_style hybrid/overlay eam/alloy table linear 10000 lj/cut 7.913
pair_coeff * * eam/alloy WHfff.eam.alloy W NULL H
pair_coeff 1 2 table W-He-Juslin.table WHe
pair_coeff 2 2 table He-Beck1968_modified.table HeHe
pair_coeff 2 3 lj/cut 5.9225E-4 1.333
# set up neb run
variable u uloop 48
# fixed atoms
region bottom block INF INF INF INF 0 3
region top block INF INF INF INF 18 20.5
group bottom region bottom
group top region top
fix freeze1 bottom setforce 0.0 0.0 0.0
fix freeze2 top setforce 0.0 0.0 0.0
# initial minimization to relax vacancy
minimize 1.0e-6 1.0e-4 10000 10000
fix 1 all neb 1.0
thermo 100
# run NEB
timestep 0.01
min_style fire
neb 0.0 1e-4 1000 1000 10 final final.lammpstrj
unfix 1
write_dump all custom dump.neb.w.$u id type x y z
run 0
The main difference between the two cases is that in the initial configuration file (‘initial.lmp’) the He atom is type 2. The final file, of course, contains no atom types. Both the initial and final configurations were relaxed beforehand. However, when I run it, I get the following error after some timesteps (in this specific example, it breaks at step 301):
remove mkl/2017.4 (LD_LIBRARY_PATH)
remove impi/2017.4 (PATH, MANPATH, LD_LIBRARY_PATH)
Set GNU compilers as MPI wrappers backend
load impi/2017.4 (PATH, MANPATH, LD_LIBRARY_PATH)
load mkl/2017.4 (LD_LIBRARY_PATH)
Fatal error in PMPI_Wait: Message truncated, error stack:
PMPI_Wait(219)....................: MPI_Wait(request=0x7fff41b0eb60, status=0x1) failed
MPIR_Wait_impl(100)...............: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 8 and tag 0 truncated; 10968 bytes received but buffer size is 10944
Fatal error in MPI_Irecv: Message truncated, error stack:
MPI_Irecv(170)......................: MPI_Irecv(buf=0x5015640, count=1368, MPI_DOUBLE, src=11, tag=0, MPI_COMM_WORLD, request=0x7ffcc2c02d70) failed
MPIDI_CH3U_Request_unpack_uebuf(618): Message truncated; 10968 bytes received but buffer size is 10944
Fatal error in PMPI_Wait: Message truncated, error stack:
PMPI_Wait(219)....................: MPI_Wait(request=0x7ffef4971520, status=0x1) failed
MPIR_Wait_impl(100)...............: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 15 and tag 0 truncated; 10968 bytes received but buffer size is 10944
Fatal error in MPI_Irecv: Message truncated, error stack:
MPI_Irecv(170)......................: MPI_Irecv(buf=0x5d87440, count=1368, MPI_DOUBLE, src=18, tag=0, MPI_COMM_WORLD, request=0x7ffc31eade70) failed
MPIDI_CH3U_Request_unpack_uebuf(618): Message truncated; 10968 bytes received but buffer size is 10944
Fatal error in PMPI_Wait: Message truncated, error stack:
PMPI_Wait(219)....................: MPI_Wait(request=0x7ffd55f448c0, status=0x1) failed
MPIR_Wait_impl(100)...............: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 22 and tag 0 truncated; 10968 bytes received but buffer size is 10944
Fatal error in PMPI_Wait: Message truncated, error stack:
PMPI_Wait(219)....................: MPI_Wait(request=0x7ffeb5efe900, status=0x1) failed
MPIR_Wait_impl(100)...............: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 23 and tag 0 truncated; 10968 bytes received but buffer size is 10944
Fatal error in PMPI_Wait: Message truncated, error stack:
PMPI_Wait(219)....................: MPI_Wait(request=0x7ffe472cd870, status=0x1) failed
MPIR_Wait_impl(100)...............: fail failed
MPIDI_CH3U_Receive_data_found(131): Message from rank 36 and tag 0 truncated; 10968 bytes received but buffer size is 10944
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 25069041.0 ON s23r2b12 CANCELLED AT 2022-09-01T23:36:59 ***
srun: error: s23r2b12: tasks 0,2-15,17-35,37-39,41,43,45-47: Killed
srun: launch/slurm: _step_signal: Terminating StepId=25069041.0
srun: error: s23r2b12: task 40: Killed
srun: error: s23r2b12: task 16: Killed
srun: error: s23r2b12: task 1: Killed
srun: error: s23r2b12: tasks 36,42: Killed
srun: error: s23r2b12: task 44: Killed
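As a quick sanity check on the numbers in the error messages, here is a small Python sketch of the arithmetic. The byte counts and the `count=1368` value are copied from the log. The assumption that each atom contributes three doubles (x, y, z) to the message is mine and is not stated anywhere in the output, so the "atoms" figures are only a possible interpretation.

```python
# Numbers copied from the MPI error messages in the log.
BYTES_RECEIVED = 10968  # size of the incoming message
BUFFER_BYTES = 10944    # size of the receive buffer
RECV_COUNT = 1368       # "count" argument of the failing MPI_Irecv (MPI_DOUBLE)

# The log itself implies 8 bytes per MPI_DOUBLE: 1368 * 8 = 10944.
DOUBLE_BYTES = BUFFER_BYTES // RECV_COUNT

# ASSUMPTION (not from the log): the message holds 3 doubles per atom (x, y, z).
DOUBLES_PER_ATOM = 3


def atoms_in_message(n_bytes):
    """Number of atoms a message of n_bytes would hold under the assumption above."""
    n_doubles, remainder = divmod(n_bytes, DOUBLE_BYTES)
    if remainder != 0 or n_doubles % DOUBLES_PER_ATOM != 0:
        raise ValueError("message size is not a whole number of atoms")
    return n_doubles // DOUBLES_PER_ATOM


expected_atoms = atoms_in_message(BUFFER_BYTES)  # what the receiver allocated for
sent_atoms = atoms_in_message(BYTES_RECEIVED)    # what the sender actually sent
excess_atoms = sent_atoms - expected_atoms

print(f"bytes per double : {DOUBLE_BYTES}")
print(f"receiver expects : {expected_atoms} atoms")
print(f"sender delivers  : {sent_atoms} atoms")
print(f"excess           : {excess_atoms} atom(s)")
```

If that assumption holds, the sender delivers coordinates for exactly one more atom (24 bytes, i.e. three doubles) than the receiver expects, and every failing rank reports the same sizes. I don’t know whether that extra atom is the He atom.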
The problem is with the He atom, because if I change only its type to 1 (W), the run completes without any error. Any suggestions? Sorry if I didn’t sum it up enough!
Jorge