Bus Error with LAMMPS

Dear All,

I am running a thermal conductance calculation with LAMMPS. I have attached my input file (in.heat), the output files (3545706.out and 3545706.err), and the log file.

Could anyone please tell me why I get the 'Bus error' messages in the 3545706.err file? Is the problem caused by my input file?
in.heat (5.2 KB)
log.lammps (11.2 KB)
3545706.out (16.8 KB)
3545706.err (4.7 KB)

This is the content of the structure file:

 # LAMMPS data file written by OVITO Basic 3.10.4
 
 87000 atoms
 12 atom types
 
 0.0 24.7284 xlo xhi
 0.0 23.795 ylo yhi
 0.0 1258.07568 zlo zhi
 
 Masses
 
  1 15.9994  # O
  2 26.981538  # Al
  3 12.0107  # C
  4 12.0107  # C
  5 12.0107  # C
  6 12.0107  # C
  7 12.0107  # C
  8 12.0107  # C
  9 12.0107  # C
 10 12.0107  # C
 11 12.0107  # C
 12 12.0107  # C
 
 Atoms  # charge
 
           1           1   0.0000000000000000        0.0000000000000000        1.4562539999999999        607.09070976240002     
           2           1   0.0000000000000000        4.1214082428000003        3.8357540000000001        607.09070976240002     
           3           1   0.0000000000000000        1.3738357188000001        3.8357540000000001        602.76041327179996     
           4           1   0.0000000000000000        5.4952192332000003        1.4562539999999999        602.76041327179996     
           5           1   0.0000000000000000        2.7475725240000002        1.4562539999999999        598.43011678129994     
           6           1   0.0000000000000000        6.8689807668000000        3.8357540000000001        598.43011678129994     
           7           1   0.0000000000000000        1.2611483999999999        4.0308729999999997        607.09070976240002
......
......
......
       86993          12   0.0000000000000000        18.546299999999999        22.605250000000002        644.09448973819997     
       86994          12   0.0000000000000000        22.667708242800000        20.225750000000001        644.09448973819997     
       86995          12   0.0000000000000000        19.233180766800000        19.036000000000001        644.09448973819997     
       86996          12   0.0000000000000000        23.354564281199998        21.415500000000002        644.09448973819997     
       86997          12   0.0000000000000000        18.546299999999999        20.225750000000001        644.09448973819997     
       86998          12   0.0000000000000000        22.667708242800000        22.605250000000002        644.09448973819997     
       86999          12   0.0000000000000000        19.233180766800000        21.415500000000002        644.09448973819997     
       87000          12   0.0000000000000000        23.354564281199998        19.036000000000001        644.09448973819997

Any suggestions appreciated.

Kieran

Is this error reproducible? Does it happen with fewer processors?

Without more information, and without knowing the machine the job is running on and how well it is set up and operated, it is difficult to determine the cause of the error. Here are some speculations:

  • It crashes during writing of a dump file. Do you have sufficient disk space? Many HPC clusters have quotas which can lead to writes to files stalling when the quota is exhausted.
  • The other possible cause would be a connectivity problem with the InfiniBand network. This can be a configuration problem (unlikely), a hardware problem (more likely), or a software/firmware issue (also possible). However, only a local system manager can confirm whether any of that is the case.
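
To rule out the disk space possibility before contacting the administrators, a quick check along these lines may help. This is only a minimal sketch: the function names and the 60 bytes per atom line are assumptions made for illustration, not values taken from the attached files. Note that `shutil.disk_usage` reports free space on the whole filesystem, not a per-user quota, so a quota limit still has to be checked with the tools your cluster provides.

```python
import shutil


def free_space_report(path="."):
    """Return (free space in GiB, fraction of the filesystem that is free)."""
    usage = shutil.disk_usage(path)
    free_gib = usage.free / 2**30
    # Guard against pseudo-filesystems that report a total size of zero.
    fraction_free = usage.free / usage.total if usage.total else 0.0
    return free_gib, fraction_free


def estimate_dump_bytes(n_atoms, n_frames, bytes_per_atom_line=60):
    """Rough size of a text dump file; bytes_per_atom_line is an assumed value."""
    return n_atoms * n_frames * bytes_per_atom_line


def enough_space(path, needed_bytes):
    """True if the filesystem containing path has at least needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    free_gib, frac = free_space_report(".")
    print(f"{free_gib:.1f} GiB free ({frac:.1%} of the filesystem)")

    # 87000 atoms as in the data file; 1000 frames is a hypothetical count.
    needed = estimate_dump_bytes(87000, 1000)
    print(f"Estimated dump size: {needed / 2**30:.2f} GiB")
    print("Enough space:", enough_space(".", needed))
```

Run it from the directory where the dump files are written, and replace the frame count with the number of frames your run actually produces.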

As for whether the problem comes from your input file: very unlikely.

Thank you for the reply.

I contacted the administrator of my cluster. They were performing some updates on the cluster, which changed the memory configuration. That is what caused the problem.

Thank you again.