Dear LAMMPS users,
I am fairly new to accelerating scientific calculations on powerful hardware, and I need some guidance from you. I have recently built a powerful single-CPU desktop machine on which I also want to run LAMMPS (it was partially built for that), and of course I want to get the most out of the configuration.
I have spent quite some time searching for information, but most of what I found is not specific enough for my case. Besides your general point of view, I would also be very grateful if you could point me to any relevant threads on this mailing list that I might have missed.
So here is the configuration: i7 5820k (six cores, hyper-threading available) @ 3.3 GHz, GeForce GTX 980, 16 GB DDR4 memory, running Ubuntu 14.04.
I have Open MPI 1.10.1, OpenMP, and the latest CUDA and LAMMPS (7Dec15).
My systems: For now I need to run simulations of 1) dense LJ particles (no more than several thousand particles per system, but many such systems) and 2) simple fluids (water and small alkanes) between walls of LJ particles.
Later I will simulate larger molecules and ions in water.
My questions:
1) MPI or OpenMP? Before my search I thought I should go for OpenMP, as my machine has only one CPU. Then I found out that it is not that straightforward and that MPI can actually do a lot even on a single processor. Is this true?
2) Furthermore, I am most interested in finding out which libraries I should use to get the best performance. I understand that there is no single rule applicable to all configurations (and systems), but I would like to know some basic rules of thumb. In fact, I am planning to run tests to find the best combination of these acceleration tools.
-I read somewhere that hyper-threading is only efficient with CPU acceleration, not in combination with GPUs.
-I understand that the gpu library might be better suited than user-cuda because my systems are small, but that this also depends on what I want to compute on the fly. Is this correct?
-Should I try KOKKOS (it seems to me that it is suited for both multi-threaded CPUs and GPU acceleration), or is it better to use user-omp in combination with the gpu library?
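For the tests mentioned in 2), the candidate runs could be enumerated roughly as in the sketch below. It is only an illustration under assumptions: the executable name lmp_mpi and the input file in.lj are placeholders, the helper names are made up for this example, and the script only prints command lines built from the standard -in, -sf and -pk switches rather than running anything. Which of these combinations is actually fastest is exactly what the tests would have to show.

```python
# Sketch: enumerate MPI-rank x OpenMP-thread splits for a six-core CPU
# and print the matching LAMMPS command lines (nothing is executed).
# "lmp_mpi" and "in.lj" are placeholders, not names from a real setup.

def rank_thread_splits(cores):
    """Return all (ranks, threads) pairs with ranks * threads == cores."""
    return [(r, cores // r) for r in range(1, cores + 1) if cores % r == 0]

def build_command(ranks, threads, suffix=None, exe="lmp_mpi", infile="in.lj"):
    """Build one mpirun command line as a list of arguments."""
    cmd = ["mpirun", "-np", str(ranks), exe, "-in", infile]
    if suffix == "omp":
        # USER-OMP styles with the given number of threads per MPI rank
        cmd += ["-sf", "omp", "-pk", "omp", str(threads)]
    elif suffix == "gpu":
        # GPU package styles, all MPI ranks sharing one GPU
        cmd += ["-sf", "gpu", "-pk", "gpu", "1"]
    return cmd

if __name__ == "__main__":
    # CPU-only runs: every rank/thread split of the six physical cores
    for ranks, threads in rank_thread_splits(6):
        print(" ".join(build_command(ranks, threads, "omp")))
    # GPU runs: vary only the number of MPI ranks feeding the single GPU
    for ranks, _ in rank_thread_splits(6):
        print(" ".join(build_command(ranks, 1, "gpu")))
```

Calling rank_thread_splits(12) instead would cover the hyper-threaded case with twelve hardware threads, so the same loop could be used to check the hyper-threading point above.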
I know these are all rather vague questions, but I would highly appreciate any information that clarifies the points above, helps me head in the right direction, and helps me avoid big mistakes.
Thanks a lot in advance.
Gyorgy