I am trying to run a structure (already calculated using QE) in exciting to calculate spin-orbit coupling. However, when I modify the provided GaAs input file and run the commands, I receive the error message copied below. Any help would be appreciated.
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x153ce21ecd21 in ???
#1 0x153ce21ebef5 in ???
#2 0x153ce1e4620f in ???
#3 0x562bd94a97ce in ???
#4 0x562bd90e6d39 in ???
#5 0x562bd93a2287 in ???
#6 0x562bd904d96b in ???
#7 0x562bd8edb3a4 in ???
#8 0x562bd8ec7083 in ???
#9 0x153ce1e270b2 in ???
#10 0x562bd8ec70bd in ???
#11 0xffffffffffffffff in ???
Segmentation fault (core dumped)
Elapsed time = 0m0s
Hi Hemanth, it’s impossible to tell from this backtrace.
Please:
a) Rebuild the code with make debugmpiandsmp or make debug and run again. This will fill in the symbols in the backtrace.
b) Provide input.xml and species files used.
Cheers,
Alex
Hello Alex,
Thanks for the reply. I was able to rectify the issue with the input file, and was able to get the code to start up. However, I observed a few anomalous behaviors while trying the code out:
- My system has 14 atoms and belongs to a very high symmetry point group. I had estimated that it should not take much time to run an SCF calculation on the system, but it has been running since yesterday evening, and is still stuck at the line
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
- SCF iteration number : 1 +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
and has not proceeded beyond this state. This is on a serial calculation.
- When I tried running the same calculation on multiple cores using the pure MPI build, the process stopped with the message “mpiexec has detected the process has been stopped on node04”. I do not have a stack trace for this error, and when I tried to build a debugmpi target, make reported that no such target is available. I am currently running the same calculation with the executable generated by debugmpiandsmp, and will update you with the stack trace. Any advice in the meantime would be much appreciated.
Thanks and Warm Regards
Hemanth
Alex,
- I am also attaching the input file that I used for the test. I could not create a stack trace, since the calculation has been running since yesterday and has not moved past the SCF iteration 1 step. The TOTENERGY file, which should be updated after each SCF cycle, is empty, which means that no SCF cycle has completed in 12 hours.
I have attached the input files as a github gist, since I cannot attach files here.
- The MPI calculation failed with the error “mpiexec noticed that process rank 1 with PID 0 on node ssm exited on signal 9 (Killed).”
Thanks,
Hemanth
link to the input file
Hi Hemanth,
Uploading
Are you sure you can’t upload? I see an option to do so (the upward arrow). Otherwise, paste the file as text and convert it to a snippet using the </> button. Links die, so they’re no good for people in the future. If you can upload the corrected input file, I can check it out.
Debug builds
Indeed, you can’t make debugmpi; there’s no option for it in the makefile. make debugmpiandsmp will include MPI and OpenMP parallelism (MPIANDSMP should be the preferred build type).
INFO.OUT
Is the SCF iteration line from INFO.OUT? It does indeed sound like something’s hanging.
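One way to confirm that nothing has moved is to scan INFO.OUT for the iteration banner. The following is a minimal sketch in Python, assuming the banner looks like the excerpt quoted earlier in this thread ("SCF iteration number : 1"); the function name is hypothetical and not part of exciting.

```python
import re

# Banner pattern taken from the INFO.OUT excerpt quoted in this thread.
# The exact spacing in the real file may differ, so whitespace is matched loosely.
SCF_BANNER = re.compile(r"SCF iteration number\s*:\s*(\d+)")


def last_started_scf_iteration(info_text: str) -> int:
    """Return the highest SCF iteration number announced in the text, or 0 if none.

    The banner is printed when an iteration starts, so a value that stays at 1
    means that the first iteration has not finished yet.
    """
    numbers = [int(match.group(1)) for match in SCF_BANNER.finditer(info_text)]
    return max(numbers, default=0)


# Typical use (assumes INFO.OUT is in the current directory):
#     with open("INFO.OUT") as handle:
#         print(last_started_scf_iteration(handle.read()))
```

Running it a few minutes apart shows whether the counter advances at all.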
Second point, Pasquale noticed that your 3rd lattice vector is:
<basevect> -5.677366878677726E+001 -0.000000000000000E+000 -0.000000000000000E+000 </basevect>
but your k-grid is:
ngridk="30 30 1"
It looks like the direction perpendicular to the slab is x, not z, so you should use ngridk="1 30 30".
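To double-check which base vector spans the vacuum and which Cartesian axis it points along, a short script can list the length and dominant axis of each basevect. This is only a sketch in Python: the third vector is copied from this thread, while the first two vectors and the function name are made-up placeholders, so substitute the values from your own input.xml.

```python
import math
import xml.etree.ElementTree as ET

# Only the third <basevect> comes from this thread. The first two are
# hypothetical placeholders so that the snippet runs on its own.
CRYSTAL_XML = """
<crystal>
  <basevect> 0.0 7.0 0.0 </basevect>
  <basevect> 0.0 0.0 7.0 </basevect>
  <basevect> -5.677366878677726E+001 -0.000000000000000E+000 -0.000000000000000E+000 </basevect>
</crystal>
"""


def describe_basevects(xml_text):
    """Return (index, length, dominant Cartesian axis) for each <basevect>."""
    root = ET.fromstring(xml_text)
    rows = []
    for index, node in enumerate(root.iter("basevect"), start=1):
        vec = [float(component) for component in node.text.split()]
        length = math.sqrt(sum(c * c for c in vec))
        # Pick the Cartesian component with the largest magnitude.
        axis = "xyz"[max(range(3), key=lambda k: abs(vec[k]))]
        rows.append((index, length, axis))
    return rows


for index, length, axis in describe_basevects(CRYSTAL_XML):
    print(f"basevect {index}: |a| = {length:.4f}, mainly along {axis}")
```

The vector that is much longer than the others is the one that carries the vacuum, and its entry in ngridk is the one that can be set to 1.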
Apologies for not linking the input file previously. I was not granted the privileges to upload files yesterday when I made the post. I have uploaded the input file with this post.
input.xml (2.3 KB)
Yes, and when I ran the calculation for just the Gamma point, it took nearly 4 hours (3 hours and 52 minutes) to complete one SCF iteration. This is what convinced me that I had messed something up while building the code.
Any pointers in this matter would be much appreciated.
Thanks and Warm Regards,
Hemanth