I am running amset on a multinode machine, and it core dumps even when run on a single node. The code works fine on a normal workstation with 56 cores. What might be the reason?
Thank You!
It’s hard to know exactly. Are you setting nworkers and OMP_NUM_THREADS?
I tried with 24 and 48 nworkers. The job script looks like this…
#SBATCH --tasks-per-node=48
#SBATCH --job-name=AMS
#SBATCH --error=error
#SBATCH --partition=highmemory
#SBATCH --time=1-00:00:00
amset run --no-separate-mobility -z prefer
The process stops at…
1.00e+21 700.0 9.74e-03 2.00e+21
1.00e+21 725.0 9.67e-03 2.00e+21
1.00e+21 750.0 9.61e-03 2.00e+21
1.00e+21 775.0 9.55e-03 2.00e+21
1.00e+21 800.0 9.48e-03 2.00e+21
Initializing POP scattering
- average N_po: 27.4548
- w_po: 2.576 2pi THz
- hbar.omega: 0.0017 eV
Thank you!
Can you try with a smaller number of processors? For example, start with 2 and then increase the number slowly. You could be running out of memory.
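As a rough way to pick the next worker count to try, here is a minimal Python sketch. It assumes memory use grows roughly linearly with the number of workers, which may not hold for amset, and the per-worker figure is something you would have to measure yourself from a small run (for example with 2 workers). The function name `workers_that_fit` and all numbers are hypothetical, not part of amset.

```python
def workers_that_fit(available_gb, per_worker_gb, max_workers):
    """Return the largest worker count whose estimated memory fits.

    available_gb:  memory you can use on the node (assumed, check your partition)
    per_worker_gb: memory per worker, measured from a small test run (assumed)
    max_workers:   upper bound, e.g. the number of cores requested from Slurm
    """
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    fit = int(available_gb // per_worker_gb)
    # Always allow at least one worker, and never exceed the requested cores.
    return max(1, min(fit, max_workers))


if __name__ == "__main__":
    # Hypothetical numbers: 192 GB node, 6 GB per worker, 48 cores requested.
    print(workers_that_fit(192, 6, 48))
```

If the count this suggests is well below the 24 or 48 workers you tried, running out of memory becomes a more plausible explanation for the crash.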
You can also try setting this in your settings.yaml:
cache_wavefunction: false
If this doesn’t work, then there is likely an issue with your numba installation. You can try uninstalling and reinstalling this package using conda.
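Before reinstalling, you could run a quick sanity check on the cluster node itself. The sketch below is not an amset tool; it only checks that numba can be imported and can compile and run a trivial parallel kernel. The function name `check_numba` is made up for illustration, and passing this check does not guarantee that the kernels amset uses will work.

```python
def check_numba():
    """Return (ok, message) describing whether numba compiles and runs a kernel."""
    try:
        import numba
        import numpy as np
    except ImportError as exc:
        return False, f"import failed: {exc}"

    # A trivial parallel reduction; prange lets numba split the loop over threads.
    @numba.njit(parallel=True)
    def total(values):
        acc = 0.0
        for i in numba.prange(values.shape[0]):
            acc += values[i]
        return acc

    try:
        result = total(np.ones(1000))
    except Exception as exc:  # compilation or runtime error raised by numba
        return False, f"numba {numba.__version__} failed: {exc}"

    if result != 1000.0:
        return False, f"numba {numba.__version__} returned {result}, expected 1000.0"
    return True, f"numba {numba.__version__} compiled and ran a parallel kernel"


if __name__ == "__main__":
    print(check_numba())
```

Run it inside the same Slurm job environment as amset. Note that Python cannot catch a segmentation fault, so if this script itself core dumps, that also points at the numba installation.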