OpenCL error when running LAMMPS on a GPU

Hello,

I have a LAMMPS script that displaces a user-specified number of randomly chosen atoms in a given cell to random positions using the displace_atoms command and then minimizes the energy. This procedure is repeated in a loop as many times as the user wishes. With a small number of loops, the script runs without problems. With a large number of loops (over 100, for example), however, I get an OpenCL error after the atoms are displaced and before the energy minimization. Typically the error looks something like this:

"Initializing Device and compiling on process 0…Done.
OpenCL error in file ‘/proj/ikpa/lammps-27May2021/lib/gpu/geryon/ocl_timer.h’ in line 90 : -9999.
Initializing Device 0 on core 0…Done.

Setting up cg style minimization …
Unit style : metal
Current step : 121470
OpenCL error in file ‘/proj/ikpa/lammps-27May2021/lib/gpu/geryon/ocl_timer.h’ in line 119 : -9999"

The error always occurs twice in a row, both times in ocl_timer.h, and always with code -9999, meaning “Illegal read or write to a buffer”.
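For anyone who wants to check a long run for these messages, here is a minimal Python sketch that collects every OpenCL error line together with the most recent "Current step" value printed before it. The function name find_opencl_errors is an assumption made for illustration and is not part of LAMMPS; the patterns are based only on the message format quoted above, and both straight and typographic quotes are accepted around the file path.

```python
import re

# Quote characters that may surround the file path (straight or typographic).
QUOTE = "['\"\u2018\u2019\u201c\u201d]"
NOT_QUOTE = "[^'\"\u2018\u2019\u201c\u201d]+"

# Matches lines of the form:
#   OpenCL error in file '<path>' in line <N> : <code>
OCL_ERROR = re.compile(
    r"OpenCL error in file\s+" + QUOTE + "(?P<file>" + NOT_QUOTE + ")" + QUOTE
    + r"\s+in line\s+(?P<line>\d+)\s*:\s*(?P<code>-?\d+)"
)

# Matches the step line printed when a minimization is set up.
STEP = re.compile(r"Current step\s*:\s*(?P<step>\d+)")


def find_opencl_errors(text):
    """Return a list of (step, source_file, source_line, code) tuples.

    step is the last "Current step" value seen before the error line,
    or None if no such line came before it.
    """
    errors = []
    step = None
    for line in text.splitlines():
        match = STEP.search(line)
        if match:
            step = int(match.group("step"))
            continue
        match = OCL_ERROR.search(line)
        if match:
            errors.append((
                step,
                match.group("file"),
                int(match.group("line")),
                int(match.group("code")),
            ))
    return errors
```

It could be applied to the captured screen output of a run, for example find_opencl_errors(open("output.txt").read()), where output.txt is a hypothetical file name.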

This error only occurs when running LAMMPS on the GPU, and it seems to depend on the size of the input cell: the larger the cell, the longer the program runs before the error appears. Also, for a fixed cell size, the error always seems to occur on the same loop, even when the seeds for the random number generator are changed.
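One way to check the dependence on the seed and on GPU use systematically is to run the same input with and without the gpu suffix for several seeds. The Python sketch below only builds the command lines. It relies on the -in, -var, -sf, -pk and -log switches of the lmp executable; the input file name in.displace, the variable name seed, and the log file naming scheme are hypothetical and would have to match the actual script.

```python
def lammps_command(input_file, seed, use_gpu, log_file=None):
    """Build the argument list for one LAMMPS run.

    The input script is assumed to read its random seed from an
    index-style variable called "seed" (a placeholder name).
    """
    cmd = ["lmp", "-in", input_file, "-var", "seed", str(seed)]
    if use_gpu:
        # Apply the gpu suffix to supported styles and request one GPU.
        cmd += ["-sf", "gpu", "-pk", "gpu", "1"]
    if log_file is not None:
        cmd += ["-log", log_file]
    return cmd


def seed_sweep(input_file, seeds):
    """Return (seed, use_gpu, command) for each seed, first CPU then GPU."""
    runs = []
    for seed in seeds:
        for use_gpu in (False, True):
            tag = "gpu" if use_gpu else "cpu"
            log_file = "log.{}.{}".format(tag, seed)
            cmd = lammps_command(input_file, seed, use_gpu, log_file)
            runs.append((seed, use_gpu, cmd))
    return runs
```

Each returned command can be passed to subprocess.run, and comparing the loop at which the GPU runs fail across seeds shows whether the failure point really is independent of the seed.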

I have no idea what causes this error. Any help would be greatly appreciated.

Please post the output of ocl_get_devices and lmp -h

ocl_get_devices:

"Found 1 platform(s).

Platform 0:

Device 0: “Tesla K80”
Type of device: GPU
Supported OpenCL Version: 1.20
Is a subdevice: No
Double precision support: Yes
Total amount of global memory: 11.173 GB
Number of compute units/multiprocessors: 13
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Maximum group size (# of threads per block) 1024
Maximum item sizes (# threads for each dim) 1024 x 1024 x 64
Clock rate: 0.823 GHz
ECC support: Yes
Device fission into equal partitions: No
Device fission by counts: No
Device fission by affinity: No
Maximum subdevices from fission: 1
Shared memory system: No
Subgroup support: No
Shuffle support: Yes

Device 1: “Tesla K80”
Type of device: GPU
Supported OpenCL Version: 1.20
Is a subdevice: No
Double precision support: Yes
Total amount of global memory: 11.173 GB
Number of compute units/multiprocessors: 13
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Maximum group size (# of threads per block) 1024
Maximum item sizes (# threads for each dim) 1024 x 1024 x 64
Clock rate: 0.823 GHz
ECC support: Yes
Device fission into equal partitions: No
Device fission by counts: No
Device fission by affinity: No
Maximum subdevices from fission: 1
Shared memory system: No
Subgroup support: No
Shuffle support: Yes

Device 2: “Tesla K80”
Type of device: GPU
Supported OpenCL Version: 1.20
Is a subdevice: No
Double precision support: Yes
Total amount of global memory: 11.173 GB
Number of compute units/multiprocessors: 13
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Maximum group size (# of threads per block) 1024
Maximum item sizes (# threads for each dim) 1024 x 1024 x 64
Clock rate: 0.823 GHz
ECC support: Yes
Device fission into equal partitions: No
Device fission by counts: No
Device fission by affinity: No
Maximum subdevices from fission: 1
Shared memory system: No
Subgroup support: No
Shuffle support: Yes

Device 3: “Tesla K80”
Type of device: GPU
Supported OpenCL Version: 1.20
Is a subdevice: No
Double precision support: Yes
Total amount of global memory: 11.173 GB
Number of compute units/multiprocessors: 13
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Maximum group size (# of threads per block) 1024
Maximum item sizes (# threads for each dim) 1024 x 1024 x 64
Clock rate: 0.823 GHz
ECC support: Yes
Device fission into equal partitions: No
Device fission by counts: No
Device fission by affinity: No
Maximum subdevices from fission: 1
Shared memory system: No
Subgroup support: No
Shuffle support: Yes"

lmp -h:

"Large-scale Atomic/Molecular Massively Parallel Simulator - 27 May 2021

Usage example: lmp -var t 300 -echo screen -in in.alloy

List of command line options supported by this LAMMPS executable:

-echo none/screen/log/both : echoing of input script (-e)
-help : print this help message (-h)
-in none/filename : read input from file or stdin (default) (-i)
-kokkos on/off … : turn KOKKOS mode on or off (-k)
-log none/filename : where to send log output (-l)
-mdi '<mdi flags>' : pass flags to the MolSSI Driver Interface
-mpicolor color : which exe in a multi-exe mpirun cmd (-m)
-cite : select citation reminder style (-c)
-nocite : disable citation reminder (-nc)
-package style … : invoke package command (-pk)
-partition size1 size2 … : assign partition sizes (-p)
-plog basename : basename for partition logs (-pl)
-pscreen basename : basename for partition screens (-ps)
-restart2data rfile dfile … : convert restart to data file (-r2data)
-restart2dump rfile dgroup dstyle dfile …
: convert restart to dump file (-r2dump)
-reorder topology-specs : processor reordering (-r)
-screen none/filename : where to send screen output (-sc)
-suffix gpu/intel/opt/omp : style suffix to apply (-sf)
-var varname value : set index style variable (-v)

OS: Linux “CentOS Linux 7 (Core)” 3.10.0-1062.9.1.el7.x86_64 on x86_64

Compiler: GNU C++ 4.8.5 20150623 (Red Hat 4.8.5-39) with OpenMP 3.1
C++ standard: C++11
MPI v1.2: LAMMPS MPI STUBS for LAMMPS version 27 May 2021

Accelerator configuration:

GPU package API: OpenCL
GPU package precision: mixed
USER-OMP package API: OpenMP
USER-OMP package precision: double

GPU present: yes

Active compile time flags:

-DLAMMPS_GZIP
-DLAMMPS_PNG
-DLAMMPS_JPEG
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint): 32-bit
sizeof(bigint): 64-bit

Installed packages:

ASPHERE BODY CLASS2 COLLOID COMPRESS CORESHELL DIPOLE GPU GRANULAR KSPACE
MANYBODY MC MISC MLIAP MOLECULE OPT PERI PLUGIN POEMS PYTHON QEQ REPLICA RIGID
SHOCK SNAP SPIN SRD USER-BOCS USER-BROWNIAN USER-CGDNA USER-CGSDK USER-COLVARS
USER-DIFFRACTION USER-DPD USER-DRUDE USER-EFF USER-FEP USER-MEAMC USER-MESODPD
USER-MISC USER-MOFFF USER-OMP USER-PHONON USER-REACTION USER-REAXC USER-SDPD
USER-SMD USER-SPH USER-UEF USER-YAFF VORONOI

List of individual style options included in this LAMMPS executable

  • Atom styles:

angle atomic body bond charge
dipole dpd edpd electron ellipsoid
full hybrid line mdpd molecular
peri smd sph sphere spin
tdpd template tri

  • Integrate styles:

respa respa/omp verlet verlet/split

  • Minimize styles:

cg fire fire/old hftn quickmin
sd spin spin/cg spin/lbfgs

  • Pair styles:

adp adp/omp agni agni/omp airebo
airebo/morse airebo/morse/omp airebo/omp atm
beck beck/gpu beck/omp body/nparticle
body/rounded/polygon body/rounded/polyhedron bop
born born/coul/dsf born/coul/dsf/cs born/coul/long
born/coul/long/cs born/coul/long/cs/gpu
born/coul/long/gpu born/coul/long/omp born/coul/msm
born/coul/msm/omp born/coul/wolf born/coul/wolf/cs
born/coul/wolf/cs/gpu born/coul/wolf/gpu
born/coul/wolf/omp born/gpu born/omp brownian
brownian/omp brownian/poly brownian/poly/omp buck
buck6d/coul/gauss/dsf buck6d/coul/gauss/long buck/coul/cut
buck/coul/cut/gpu buck/coul/cut/omp buck/coul/long
buck/coul/long/cs buck/coul/long/gpu
buck/coul/long/omp buck/coul/msm buck/coul/msm/omp
buck/gpu buck/long/coul/long buck/long/coul/long/omp
buck/mdf buck/omp colloid colloid/gpu colloid/omp
comb comb3 comb/omp cosine/squared coul/cut
coul/cut/global coul/cut/global/omp coul/cut/gpu coul/cut/omp
coul/cut/soft coul/cut/soft/omp coul/debye coul/debye/gpu
coul/debye/omp coul/diel coul/diel/omp coul/dsf coul/dsf/gpu
coul/dsf/omp coul/long coul/long/cs coul/long/cs/gpu
coul/long/gpu coul/long/omp coul/long/soft coul/long/soft/omp
coul/msm coul/msm/omp coul/shield coul/slater/cut coul/slater/long
coul/streitz coul/tt coul/wolf coul/wolf/cs coul/wolf/omp
reax dpd dpd/ext dpd/ext/tstat dpd/fdt
dpd/fdt/energy dpd/gpu dpd/omp dpd/tstat dpd/tstat/gpu
dpd/tstat/omp drip dsmc e3b eam
eam/alloy eam/alloy/gpu eam/alloy/omp eam/alloy/opt eam/cd
eam/cd/old eam/fs eam/fs/gpu eam/fs/omp eam/fs/opt
eam/gpu eam/he eam/omp eam/opt edip
edip/multi edip/omp edpd eff/cut eim
eim/omp exp6/rx extep gauss gauss/cut
gauss/cut/omp gauss/gpu gauss/omp gayberne gayberne/gpu
gayberne/omp gran/hertz/history gran/hertz/history/omp
gran/hooke gran/hooke/history gran/hooke/history/omp
gran/hooke/omp granular gw gw/zbl
hbond/dreiding/lj hbond/dreiding/lj/omp
hbond/dreiding/morse hbond/dreiding/morse/omp hybrid
hybrid/overlay hybrid/scaled ilp/graphene/hbn
kolmogorov/crespi/full kolmogorov/crespi/z lcbop
lebedeva/z lennard/mdf line/lj list lj96/cut
lj96/cut/gpu lj96/cut/omp lj/charmm/coul/charmm
lj/charmm/coul/charmm/gpu lj/charmm/coul/charmm/implicit
lj/charmm/coul/charmm/implicit/omp lj/charmm/coul/charmm/omp
lj/charmm/coul/long lj/charmm/coul/long/gpu
lj/charmm/coul/long/omp lj/charmm/coul/long/opt
lj/charmm/coul/long/soft lj/charmm/coul/long/soft/omp
lj/charmm/coul/msm lj/charmm/coul/msm/omp
lj/charmmfsw/coul/charmmfsh lj/charmmfsw/coul/long lj/class2
lj/class2/coul/cut lj/class2/coul/cut/omp
lj/class2/coul/cut/soft lj/class2/coul/long
lj/class2/coul/long/cs lj/class2/coul/long/gpu
lj/class2/coul/long/omp lj/class2/coul/long/soft lj/class2/gpu
lj/class2/omp lj/class2/soft lj/cubic lj/cubic/gpu lj/cubic/omp
lj/cut lj/cut/coul/cut lj/cut/coul/cut/gpu
lj/cut/coul/cut/omp lj/cut/coul/cut/soft
lj/cut/coul/cut/soft/omp lj/cut/coul/debye
lj/cut/coul/debye/gpu lj/cut/coul/debye/omp lj/cut/coul/dsf
lj/cut/coul/dsf/gpu lj/cut/coul/dsf/omp lj/cut/coul/long
lj/cut/coul/long/cs lj/cut/coul/long/gpu
lj/cut/coul/long/omp lj/cut/coul/long/opt
lj/cut/coul/long/soft lj/cut/coul/long/soft/omp lj/cut/coul/msm
lj/cut/coul/msm/gpu lj/cut/coul/msm/omp lj/cut/coul/wolf
lj/cut/coul/wolf/omp lj/cut/dipole/cut
lj/cut/dipole/cut/gpu lj/cut/dipole/cut/omp
lj/cut/dipole/long lj/cut/dipole/long/gpu lj/cut/gpu
lj/cut/omp lj/cut/opt lj/cut/soft lj/cut/soft/omp
lj/cut/thole/long lj/cut/thole/long/omp lj/cut/tip4p/cut
lj/cut/tip4p/cut/omp lj/cut/tip4p/long
lj/cut/tip4p/long/gpu lj/cut/tip4p/long/omp
lj/cut/tip4p/long/opt lj/cut/tip4p/long/soft
lj/cut/tip4p/long/soft/omp lj/expand lj/expand/coul/long
lj/expand/coul/long/gpu lj/expand/gpu lj/expand/omp lj/gromacs
lj/gromacs/coul/gromacs lj/gromacs/coul/gromacs/omp lj/gromacs/gpu
lj/gromacs/omp lj/long/coul/long lj/long/coul/long/omp
lj/long/coul/long/opt lj/long/dipole/long
lj/long/tip4p/long lj/long/tip4p/long/omp lj/mdf
lj/relres lj/relres/omp lj/sdk lj/sdk/coul/long
lj/sdk/coul/long/gpu lj/sdk/coul/long/omp lj/sdk/coul/msm
lj/sdk/coul/msm/omp lj/sdk/gpu lj/sdk/omp lj/sf/dipole/sf
lj/sf/dipole/sf/gpu lj/sf/dipole/sf/omp lj/smooth
lj/smooth/gpu lj/smooth/linear lj/sf
lj/smooth/linear/omp lj/sf/omp lj/smooth/omp
lj/switch3/coulgauss/long local/density lubricate lubricateU
lubricateU/poly lubricate/omp lubricate/poly lubricate/poly/omp
mdpd mdpd/rhosum meam/spline meam/spline/omp meam/sw/spline
meam/c meam mie/cut mie/cut/gpu mliap
mm3/switch3/coulgauss/long momb morse morse/gpu
morse/omp morse/opt morse/smooth/linear
morse/smooth/linear/omp morse/soft multi/lucy multi/lucy/rx
nb3b/harmonic nm/cut nm/cut/coul/cut nm/cut/coul/cut/omp
nm/cut/coul/long nm/cut/coul/long/omp nm/cut/omp
oxdna2/coaxstk oxdna2/dh oxdna2/excv oxdna/coaxstk oxrna2/coaxstk
oxdna/excv oxdna/hbond oxdna2/hbond oxdna/stk oxdna2/stk
oxdna/xstk oxdna2/xstk oxrna2/dh oxrna2/excv oxrna2/hbond
oxrna2/stk oxrna2/xstk peri/eps peri/lps peri/lps/omp
peri/pmb peri/pmb/omp peri/ves polymorphic python
reax/c reax/c/omp rebo rebo/omp resquared
resquared/gpu resquared/omp sdpd/taitwater/isothermal smd/hertz
smd/tlsph smd/tri_surface smd/ulsph snap soft
soft/gpu soft/omp sph/heatconduction sph/idealgas
sph/lj sph/rhosum sph/taitwater sph/taitwater/morris
spin/dipole/cut spin/dipole/long spin/dmi spin/exchange
spin/exchange/biquadratic spin/magelec spin/neel srp
sw sw/gpu sw/omp table table/gpu
table/omp table/rx tdpd tersoff tersoff/gpu
tersoff/mod tersoff/mod/c tersoff/mod/c/omp tersoff/mod/gpu
tersoff/mod/omp tersoff/omp tersoff/table tersoff/table/omp
tersoff/zbl tersoff/zbl/gpu tersoff/zbl/omp thole tip4p/cut
tip4p/cut/omp tip4p/long tip4p/long/omp tip4p/long/soft
tip4p/long/soft/omp tri/lj ufm ufm/gpu
ufm/omp ufm/opt vashishta vashishta/gpu vashishta/omp
vashishta/table vashishta/table/omp wf/cut yukawa
yukawa/colloid yukawa/colloid/gpu yukawa/colloid/omp
yukawa/gpu yukawa/omp zbl zbl/gpu zbl/omp
zero

  • Bond styles:

class2 class2/omp fene fene/expand fene/expand/omp
fene/omp gaussian gromos gromos/omp harmonic
harmonic/omp harmonic/shift harmonic/shift/cut
harmonic/shift/cut/omp harmonic/shift/omp hybrid
mm3 morse morse/omp nonlinear nonlinear/omp
oxdna2/fene oxdna/fene oxrna2/fene quartic quartic/omp
special table table/omp zero

  • Angle styles:

charmm charmm/omp class2 class2/omp class2/p6
cosine cosine/buck6d cosine/delta cosine/delta/omp
cosine/omp cosine/periodic cosine/periodic/omp cosine/shift
cosine/shift/exp cosine/shift/exp/omp cosine/shift/omp
cosine/squared cosine/squared/omp cross dipole
dipole/omp fourier fourier/omp fourier/simple
fourier/simple/omp gaussian harmonic harmonic/omp
hybrid mm3 quartic quartic/omp sdk
sdk/omp table table/omp zero

  • Dihedral styles:

charmm charmm/omp charmmfsw class2 class2/omp
cosine/shift/exp cosine/shift/exp/omp fourier
fourier/omp harmonic harmonic/omp helix helix/omp
hybrid multi/harmonic multi/harmonic/omp nharmonic
nharmonic/omp opls opls/omp quadratic quadratic/omp
spherical table table/cut table/omp zero

  • Improper styles:

class2 class2/omp cossq cossq/omp cvff
cvff/omp distance distharm fourier fourier/omp
harmonic harmonic/omp hybrid inversion/harmonic
ring ring/omp sqdistharm umbrella umbrella/omp
zero

  • KSpace styles:

ewald ewald/dipole ewald/dipole/spin ewald/disp
ewald/omp msm msm/cg msm/cg/omp msm/omp
pppm pppm/cg pppm/cg/omp pppm/dipole pppm/dipole/spin
pppm/disp pppm/disp/omp pppm/disp/tip4p pppm/disp/tip4p/omp
pppm/gpu pppm/omp pppm/stagger pppm/tip4p pppm/tip4p/omp

  • Fix styles

accelerate/cos adapt adapt/fep addforce addtorque
append/atoms atom/swap ave/atom ave/chunk ave/correlate
ave/correlate/long ave/histo ave/histo/weight
ave/time aveforce balance bocs bond/break
bond/create bond/create/angle bond/react bond/swap
box/relax brownian brownian/asphere brownian/sphere
charge/regulation cmap colvars controller
deform deposit ave/spatial ave/spatial/sphere
dpd/energy drag drude drude/transform/direct
drude/transform/inverse dt/reset edpd/source efield
ehex electron/stopping electron/stopping/fit
enforce2d eos/cv eos/table eos/table/rx evaporate
external ffl filter/corotate flow/gauss freeze
gcmc gld gle gravity gravity/omp
grem halt heat hyper/global hyper/local
imd indent ipi langevin langevin/drude
langevin/eff langevin/spin lineforce meso/move momentum
momentum/chunk move msst mvv/dpd mvv/edpd
mvv/tdpd neb neb/spin nph nph/asphere
nph/asphere/omp nph/body nph/eff nph/omp nph/sphere
nph/sphere/omp nphug npt npt/asphere npt/asphere/omp
npt/body npt/cauchy npt/eff npt/gpu npt/omp
npt/sphere npt/sphere/omp npt/uef numdiff nve
nve/asphere nve/asphere/gpu nve/asphere/noforce nve/body
nve/dot nve/dotc/langevin nve/eff nve/gpu
nve/limit nve/line nve/noforce nve/omp nve/sphere
nve/sphere/omp nve/spin nve/tri nvk nvt
nvt/asphere nvt/asphere/omp nvt/body nvt/eff nvt/gpu
nvt/omp nvt/sllod nvt/sllod/eff nvt/sllod/omp nvt/sphere
nvt/sphere/omp nvt/uef oneway orient/bcc orient/eco
orient/fcc pafi phonon pimd planeforce
poems pour precession/spin press/berendsen print
propel/self property/atom python/invoke python python/move
qeq/comb qeq/comb/omp qeq/dynamic qeq/dynamic qeq/fire
qeq/fire qeq/point qeq/point qeq/reax qeq/reax/omp
qeq/shielded qeq/shielded qeq/slater qeq/slater rattle
reax/c/bonds reax/c/species recenter restrain rhok
rigid rigid/meso rigid/nph rigid/nph/omp rigid/nph/small
rigid/npt rigid/npt/omp rigid/npt/small rigid/nve rigid/nve/omp
rigid/nve/small rigid/nvt rigid/nvt/omp rigid/nvt/small rigid/omp
rigid/small rigid/small/omp rx saed/vtk setforce
setforce/spin shake shardlow smd smd/adjust_dt
smd/integrate_tlsph smd/integrate_ulsph
smd/move_tri_surf smd/setvel smd/wall_surface
sph sph/stationary spring spring/chunk spring/rg
spring/self srd store/force store/state tdpd/source
temp/berendsen temp/csld temp/csvr temp/rescale temp/rescale/eff
tfmc tgnpt/drude tgnvt/drude thermal/conductivity
ti/spring tmd ttm ttm/mod tune/kspace
vector viscosity viscous wall/body/polygon
wall/body/polyhedron wall/colloid wall/ees wall/gran
wall/gran/region wall/harmonic wall/lj1043 wall/lj126
wall/lj93 wall/morse wall/piston wall/reflect
wall/reflect/stochastic wall/region wall/region/ees wall/srd
widom

  • Compute styles:

ackland/atom adf aggregate/atom angle angle/local
angmom/chunk basal/atom body/local bond bond/local
centro/atom centroid/stress/atom chunk/atom
chunk/spread/atom cluster/atom cna/atom cnp/atom
com com/chunk contact/atom coord/atom damage/atom
dihedral dihedral/local dilatation/atom dipole/chunk displace/atom
dpd dpd/atom edpd/temp/atom entropy/atom erotate/asphere
erotate/rigid erotate/sphere erotate/sphere/atom event/displace
fep fragment/atom global/atom group/group gyration
gyration/chunk gyration/shape gyration/shape/chunk heat/flux
hexorder/atom hma improper improper/local inertia/chunk
ke ke/atom ke/atom/eff ke/eff ke/rigid
mliap momentum msd msd/chunk msd/nongauss
omega/chunk orientorder/atom pair pair/local
pe pe/atom plasticity/atom pressure
pressure/cylinder pressure/uef property/atom property/chunk
property/local rdf reduce reduce/chunk reduce/region
rigid/local saed slice smd/contact/radius
smd/damage smd/hourglass/error smd/internal/energy
smd/plastic/strain smd/plastic/strain/rate smd/rho
smd/tlsph/defgrad smd/tlsph/dt smd/tlsph/num/neighs
smd/tlsph/shape smd/tlsph/strain smd/tlsph/strain/rate
smd/tlsph/stress smd/triangle/vertices smd/ulsph/effm
smd/ulsph/num/neighs smd/ulsph/strain
smd/ulsph/strain/rate smd/ulsph/stress smd/vol
sna/atom snad/atom snap snav/atom sph/e/atom
sph/rho/atom sph/t/atom spin stress/atom stress/mop
stress/mop/profile tdpd/cc/atom temp temp/asphere
temp/body temp/chunk temp/com temp/cs temp/deform
temp/deform/eff temp/drude temp/eff temp/partial temp/profile
temp/ramp temp/region temp/region/eff temp/rotate temp/sphere
temp/uef ti torque/chunk vacf vcm/chunk
viscosity/cos voronoi/atom xrd

  • Region styles:

block cone cylinder intersect plane
prism sphere union

  • Dump styles:

atom atom/gz cfg cfg/gz cfg/uef
custom custom/gz dcd image local
local/gz movie xtc xyz xyz/gz

  • Command styles

balance change_box create_atoms create_bonds create_box
delete_atoms delete_bonds reset_ids kim_init kim_interactions
kim_param kim_property kim_query displace_atoms dynamical_matrix
group2ndx hyper info minimize ndx2group
neb neb/spin plugin prd read_data
read_dump read_restart replicate rerun reset_atom_ids
reset_mol_ids run set tad temper
temper/grem temper/npt third_order velocity write_coeff
write_data write_dump write_restart"

Thanks. This all looks fine.
Have you tried using CUDA instead of OpenCL when compiling the GPU package?

No, I have not tried that yet. I will recompile with CUDA tomorrow when I have time and let you know whether it works.

Hi,

Apologies for taking so long to respond. I was working on several other projects, and I ran into some problems while compiling LAMMPS with CUDA (which I thankfully managed to solve earlier today). Unfortunately, my internship ends early next week, so I will most likely not be working on this program any further.

What I did learn about the issue in the meantime, however, is that it may be caused by the cluster I was running the program on. A node failure occurred on the cluster a bit over a week ago. Before the failure, the OpenCL error occurred on the same loop every time, regardless of the seed. In tests I ran after the failure, the program ran for much longer than it did before, and the loop on which the error occurred changed when the seed was changed.

Thank you again for your help!

There is no new information here and no way to reproduce your failures elsewhere, which means there is no way to provide help beyond what was already given.