Bad allocation of large system

Dear LAMMPS users,

I encounter a problem when I run a large system with thousands of molecules (about 10 million atoms). The following message appears while reading the data file:

terminate called after throwing an instance of 'std::bad_alloc'
  what(): std::bad_alloc

I'm sure this is caused by the system size, because a small system (200 molecules) works fine.
I have used the ulimit command to enlarge the stack size, but it doesn't help.

I know the limit on the number of atoms in LAMMPS is about 20 billion, so I'm confused about what causes this problem.

Thanks,
Lynn

that limit is 2 billion atoms per processor, and it comes from indexing
with 32-bit integers.
however, the exception you are showing happens because you are
running out of memory or address space.
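
For context, std::bad_alloc is the exception that C++'s operator new throws when an allocation request cannot be satisfied; the following minimal standalone sketch (illustrative only, not LAMMPS code) produces the same uncaught-exception message on a typical 64-bit Linux machine:

// bad_alloc_demo.cpp -- illustrative only, not LAMMPS code.
#include <cstddef>
#include <cstdio>
#include <new>

int main() {
    // Ask for ~256 TiB in a single request: far more than RAM + swap on
    // any normal machine, so operator new fails and throws std::bad_alloc.
    // Left uncaught, the C++ runtime terminates with the same
    // "terminate called after throwing an instance of 'std::bad_alloc'"
    // message quoted above.
    void* p = ::operator new(static_cast<std::size_t>(1) << 48);
    std::printf("allocation unexpectedly succeeded at %p\n", p);
    ::operator delete(p);
    return 0;
}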

what platform are you running on? also, how much RAM does your machine have?
you are possibly running out of memory when allocating neighbor lists.
you need storage for several hundred atom indices per atom.
that may put you over the top in a 32-bit environment.
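
A rough back-of-the-envelope sketch of that estimate (my own numbers: ~300 neighbors per atom and 4-byte indices are assumptions, not figures taken from LAMMPS):

// neighbor_memory_estimate.cpp -- back-of-the-envelope only; the assumed
// 300 neighbors/atom and 4-byte index size are illustrative guesses.
#include <cstdio>

int main() {
    const double atoms           = 1.0e7;  // ~10 million atoms (from the post)
    const double neighbors       = 300.0;  // "several hundred" indices per atom
    const double bytes_per_index = 4.0;    // one 32-bit atom index

    const double total_gb = atoms * neighbors * bytes_per_index / 1.0e9;
    std::printf("neighbor lists alone: ~%.0f GB in aggregate\n", total_gb);
    // ~12 GB spread across all MPI ranks is modest on a cluster, but a
    // single 32-bit process can only address 2-4 GB, so what matters is
    // how much of that (plus coordinates, bonds, angles, buffers) lands
    // in any one process.
    return 0;
}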

axel.

Hi Axel,

Many thanks for your answer. I ran the program on Stampede2, and the details are as follows:

Model: Intel Xeon Platinum 8160 (“Skylake”)
Total cores per SKX node: 48 cores on two sockets (24 cores/socket)
Hardware threads per core: 2
Hardware threads per node: 48 x 2 = 96
Clock rate: 2.1GHz nominal (1.4-3.7GHz depending on instruction set and number of active cores)
RAM: 192GB (2.67GHz)
Cache: 32KB L1 data cache per core; 1MB L2 per core; 33MB L3 per socket. Each socket can cache up to 57MB (sum of L2 and L3 capacity).
Local storage: 144GB /tmp partition on a 200GB SSD. Size of /tmp partition as of 14 Nov 2017.

As for my run, I used 12 nodes for the system with 10 million atoms. The atoms are read from a data file in molecular style, including bonds, angles, etc.
Given this platform, what do you think the problem is?

Thanks,
Lynn
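
Dividing the stated problem over the stated hardware gives a rough idea of the per-process budget (a sketch under my own assumption that each node runs one MPI rank per core; the thread does not say how many ranks were actually used):

// per_rank_budget.cpp -- rough division only; 48 MPI ranks per node is an
// assumption, the original post does not state the rank count.
#include <cstdio>

int main() {
    const long long atoms        = 10000000LL; // ~10 million atoms
    const int nodes              = 12;         // from the post
    const int ranks_per_node     = 48;         // assumption: one rank per core
    const double ram_per_node_gb = 192.0;      // Stampede2 SKX node RAM

    const long long ranks = static_cast<long long>(nodes) * ranks_per_node;
    std::printf("MPI ranks      : %lld\n", ranks);
    std::printf("atoms per rank : ~%lld\n", atoms / ranks);
    std::printf("RAM per rank   : ~%.1f GB\n", ram_per_node_gb / ranks_per_node);
    return 0;
}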

i don't know for certain what the problem is, but it is most likely
somewhere between your chair and your screen.
there are *so* many ways to mess things up, and without a crystal ball it is
difficult to tell remotely what is wrong.
perhaps you can get some help from the user support folks at TACC?

axel