I would like to speed up the simulations of polymer dynamics.
So, in our institute we own several GPU- and CPU-based computers. I installed the CPU and GPU accelerator packages and ran several tests. It turned out that the CPU machines were considerably faster than the GPU units we own. For this test I chose a simple Lennard-Jones melt and a polymer system.
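When comparing such benchmark runs it helps to normalize all wall-clock times to a single baseline, so CPU and GPU runs of the same input can be compared fairly. A minimal sketch is below; the timings are made-up placeholders, not real measurements, and you would substitute the "Loop time" values that LAMMPS prints for your own runs:

```python
def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate run is than the baseline run."""
    return baseline_s / candidate_s

# hypothetical wall-clock timings (seconds) for identical runs of the
# same input deck on different hardware -- replace with your own numbers
timings = {
    "cpu-16-cores": 120.0,
    "cpu-32-cores": 66.0,
    "gpu-1-card": 150.0,
}

baseline = timings["cpu-16-cores"]
for label, t in sorted(timings.items()):
    print(f"{label}: {speedup(baseline, t):.2f}x relative to cpu-16-cores")
```

With numbers like these, a value below 1.0x (as for the GPU run here) means that configuration was slower than the baseline for this particular system size and force field.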
However, among the CPU units the simulation speed for the LJ and polymer systems depends strongly on the number of CPU cores, the amount of memory, etc.
I know this information is maybe not sufficient, but could you give me some advice/hints/experience on what kind of computer systems would work best for polymer systems, and which accelerator package to use in LAMMPS? I would need some advice on state-of-the-art machines, e.g. for $100k, please…?
Should I go with multiple CPUs/GPUs? Should I get as much memory as possible? Does LAMMPS prefer a particular architecture of CPU/GPU processor?
it is near impossible to give a generic recommendation for a given budget without knowing more details about deployment, facilities, etc., and without investing significant effort in figuring out for which specific objectives/targets and under which boundary conditions the machine will have to be used and operated.
for a budget of about $100k you first have to make a decision on whether you want to get a big workstation/desktop style single machine (easy to deploy and manage) or a multi-node cluster (most cost efficient if it can be attached to an existing infrastructure, so you don’t have to invest in extra storage, network, front end etc.).
beyond that, some general remarks.
how much you can speed up a calculation with additional hardware strongly depends on the size of the system, the force field, and what other features you are using at the same time. this is independent of whether you use a GPU or a CPU based solution.
there is a limit (atoms per CPU core or GPU) to how far you can parallelize. with GPUs you need a much larger number of atoms per device because a) there is overhead associated with launching GPU kernels, and b) GPUs will not be fully utilized if the number of work units is limited.
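this limit can be made concrete with a toy strong-scaling calculation: as a fixed-size system is spread over more devices, the atoms per device shrink and the parallel efficiency drops. all numbers below are invented for illustration, not measured data:

```python
def parallel_efficiency(t_one: float, t_n: float, n: int) -> float:
    """Strong-scaling efficiency: 1.0 means perfect speedup on n devices."""
    return t_one / (n * t_n)

def atoms_per_device(total_atoms: int, n_devices: int) -> float:
    """Work units available to each device for a fixed-size system."""
    return total_atoms / n_devices

# illustrative: a 32,000-atom melt spread over more and more devices,
# with hypothetical wall-clock times (seconds) -- in reality, measured
total_atoms = 32_000
measured = {1: 100.0, 2: 52.0, 4: 30.0, 8: 21.0}

for n, t in sorted(measured.items()):
    eff = parallel_efficiency(measured[1], t, n)
    print(f"{n} devices: {atoms_per_device(total_atoms, n):8.0f} "
          f"atoms/device, efficiency {eff:.2f}")
```

with made-up numbers like these, efficiency falls from 1.00 on one device toward 0.60 on eight, and the drop-off sets in at a much larger atoms-per-device count for GPUs than for CPU cores, for the two reasons given above.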
not all functionality is fully accelerated for GPUs. even if the force computation can be well accelerated, your parallel performance may be negatively impacted by doing time consuming analysis computations.
the bottom line:
for a single workstation, adding 2-4 compute GPUs is almost always useful (but may be expensive): you cannot add more CPUs, so GPUs give you an additional performance boost without having to get a cluster (although you can by now have quite a few CPU cores on a single 4-way main board, so this is not as much of a speedup as it used to be 10 years ago).
for fixed size systems and with no budget limit, the absolute best performance is still achieved with CPUs, since they have better strong scaling.
everything else is a tradeoff. a high-throughput computing requirement will usually benefit more from GPUs, while an all-CPU machine will give you more balanced performance and flexibility. or in other words: you may not get the same peak performance that GPUs offer for well-suited calculations, but a CPU-only machine will give you better performance for anything that is not so well accelerated on GPUs.
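the throughput-versus-time-to-solution distinction comes down to back-of-the-envelope arithmetic. the rates below are invented for illustration: on a hypothetical 2-GPU box, running one job across both GPUs finishes a single trajectory sooner, but running two independent jobs, one per GPU, simulates more total nanoseconds per day because each GPU stays fully utilized:

```python
def aggregate_throughput(ns_per_day_per_job: float, n_jobs: int) -> float:
    """Total simulated nanoseconds per day across independent jobs."""
    return ns_per_day_per_job * n_jobs

# hypothetical rates on a 2-GPU machine:
# one job spanning both GPUs scales imperfectly...
one_job_two_gpus = 70.0
# ...while two independent one-GPU jobs each run at the single-GPU rate
two_jobs_one_gpu_each = aggregate_throughput(50.0, 2)

print(f"time-to-solution mode: {one_job_two_gpus:.0f} ns/day on one trajectory")
print(f"throughput mode: {two_jobs_one_gpu_each:.0f} ns/day over two trajectories")
```

so if your workload is many independent trajectories (e.g. replicas or parameter scans), packing one job per GPU usually wins; if you need one long trajectory as fast as possible, the strong-scaling argument above favors CPUs.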
with GPUs you also need to keep in mind that professional grade GPUs are significantly more expensive than consumer grade GPUs, but you will be hard pressed to find ways to put consumer grade GPUs into rackmounted cases (except perhaps when looking for gear intended for cryptocurrency mining).
thank you Axel, this was quite helpful… I will discuss with our IT department in more detail now and will take your advice strongly into account…