sorry not to let this thread die down, but i think it is in the
general interest of a few more users to demonstrate why the questions
you were asking are not good questions.
I'm Caterina, a researcher in the Mechanical Engineering Department of
the University of Calabria, Italy. My Department has to buy complete
equipment (machine, post-processors...) in
order to run MD simulations with LAMMPS. Can you give me any suggestions?
LAMMPS runs on a wide variety of machines. the most common are linux
clusters. since LAMMPS is a tightly coupled parallel application, it
is important to have a low latency interconnect (like infiniband) in
addition to a regular tcp/ip network. for any more specific details,
one needs more specific boundary conditions.
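why latency matters so much for a tightly coupled code can be seen
from a simple message cost model. the sketch below uses made-up (but
order-of-magnitude plausible) numbers, not measurements of any real
network:

```python
def message_time_us(size_bytes, latency_us, bandwidth_gb_per_s):
    """time to send one message: a fixed per-message latency plus
    the time to push the payload through the wire (in microseconds)."""
    return latency_us + size_bytes / (bandwidth_gb_per_s * 1000.0)

# MD halo exchanges send many small messages every timestep.
# hypothetical numbers: ~1 us latency for a low-latency interconnect,
# ~50 us for plain tcp/ip ethernet, both with 10 GB/s bandwidth.
small = 4096  # bytes per neighbor message (made up for illustration)
print(message_time_us(small, 1.0, 10.0))   # low-latency interconnect
print(message_time_us(small, 50.0, 10.0))  # tcp/ip: dominated by latency
```

note that the bandwidth term here is only ~0.4 us, i.e. for small
messages the fixed latency dominates completely, which is why raw
bandwidth alone tells you little about MD performance.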
What is the most powerful machine on the market today?
now that is a *very* difficult question to answer. what is "powerful"?
there is no uniform measure. the most commonly used benchmark is the
high-performance linpack (HPL) benchmark that is used to rank machines
in the top500 list. this benchmark has been criticized repeatedly over
time, since it favors (like any specific scientific application) a
very specific workload and one can "game" the system. the HPL
benchmark is a dense linear algebra problem and the test is
effectively a so-called "weak scaling" benchmark, i.e. the larger the
machine, the larger the problem to be solved. also, the problem has a
high arithmetic intensity, meaning it does a lot of math per byte of
data, but has predictable and well ordered data access with good
chances to exploit CPU caches. being a dense linear algebra problem
also makes it easy to fully exploit the vector instructions in modern
CPUs. as a consequence, you often see that the difference between the
maximum theoretical performance (in FLOPS) and the actually achieved
performance is relatively small, typically in the range of 75% to 95%
of the peak. for many scientific workloads this drops to 5% or less
when run on a very, very large machine. to get a good ranking in the
top500 you need to increase the memory per node, you benefit a lot
from accelerators and many CPU cores per node, you benefit from large
CPU caches, and you are not that much limited by lower memory and
communication bandwidth.
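the role of arithmetic intensity can be sketched with a simple
roofline-style model. all machine numbers below are made up for
illustration and do not describe any real system:

```python
def attainable_gflops(flops_per_byte, peak_gflops, mem_bw_gb_per_s):
    """roofline model: performance is capped either by the CPU's peak
    FLOP rate or by how fast memory can feed the processor."""
    return min(peak_gflops, flops_per_byte * mem_bw_gb_per_s)

# made-up machine: 1000 GFLOPS peak, 100 GB/s memory bandwidth
peak, bw = 1000.0, 100.0

# HPL-like dense linear algebra: high intensity -> compute bound, at peak
print(attainable_gflops(30.0, peak, bw))   # 1000.0, i.e. 100% of peak

# MD-like short-range kernel: low intensity -> memory bound, far below peak
print(attainable_gflops(0.5, peak, bw))    # 50.0, i.e. 5% of peak
```

the same machine thus looks "powerful" for one workload and mediocre
for another, which is exactly why a single ranking cannot answer the
question.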
what does that mean for using LAMMPS? are the top500 ranked machines
good to run LAMMPS on? yes
can you run LAMMPS fast? yes
and for large problems? yes
does the top500 ranking indicate how the same machines would rank, if
the benchmark were a LAMMPS input? no.
would the LAMMPS performance be the same with different LAMMPS inputs? no.
does LAMMPS run faster when adding GPUs to a node? most of the time.
does the machine that would run a given LAMMPS input the fastest have
GPUs? probably not.
the last few points are particularly important, since there are some
fundamental properties to consider.
LAMMPS uses spatial decomposition for MPI parallelization. for
short-ranged potentials and homogeneous dense systems, that results in
almost perfect parallel scaling. the basic approach is linear scaling,
but at some point technical limitations and communication overhead
will become dominant (amdahl's law!!) and scaling stops. with a well
designed CPU based machine with a powerful network and multi-core CPUs
with multi-threading you can often scale down to a few 10s of atoms
per CPU core. less so with GPUs, since their architecture requires
overlapping computation and data access and thus they become less
effective at low numbers of atoms per GPU.
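amdahl's law, as mentioned above, fits in a few lines of python. the
1% serial fraction below is a made-up number for illustration only:

```python
def amdahl_speedup(serial_fraction, nprocs):
    """amdahl's law: the non-parallelizable fraction of the work caps
    the speedup, no matter how many processors you add."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nprocs)

# with just 1% serial work (e.g. communication overhead), the speedup
# saturates near 1/0.01 = 100x no matter how large the machine is
for p in (10, 100, 1000, 10000):
    print(p, round(amdahl_speedup(0.01, p), 1))
```

this is why simply buying more nodes stops helping: past some point
you pay for hardware that mostly waits.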
however, these factors differ based on the potential styles and
whatever other features in LAMMPS you are using. if you do a lot of
compute/reduce operations, your parallel scaling will suffer, since
global reductions become more expensive the more nodes you have, i.e.
the larger your machine is and the less work per processor you have.
similar problems affect using fix rigid (hence the implementation of
fix rigid/small to get better parallel scaling for the case of many
small rigid objects). similarly, using long-range electrostatics is a
problem because of the required parallel 3d-FFTs, which can only be
parallelized in 2d (and not 3d) and thus, particularly for large
systems, have far fewer work units to parallelize over than a
short-range pairwise additive potential. also, the FFTs need
all-to-all data transfers and a redistribution of the data that
differs from the spatial decomposition used for the real-space part,
which introduces an overhead that grows with the number of processors
used and - since the amount of compute work per processor shrinks -
ends in a "scaling catastrophe" at some point.
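a toy cost model shows how the all-to-all overhead eventually wins.
the function and all constants below are made up for illustration and
do not model any real FFT library or network:

```python
import math

def fft_step_time(n_grid, nprocs, t_flop=1e-9, t_msg=1e-5):
    """toy model: the O(N log N) FFT work is shared over nprocs ranks,
    while the all-to-all exchange cost grows with the number of ranks."""
    compute = n_grid * math.log2(n_grid) * t_flop / nprocs
    alltoall = nprocs * t_msg  # every rank exchanges data with every other
    return compute + alltoall

# adding processors first helps, then the communication term dominates
for p in (1, 16, 256, 4096):
    print(p, fft_step_time(128**3, p))
```

past the sweet spot, every additional processor makes the step slower,
which is the "scaling catastrophe" described above.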
now LAMMPS has some features that can be used to delay when this
becomes a problem, but it cannot be completely avoided unless you use
a completely different approach to long-range interactions, which in
turn is much slower.
What kind of machine can simulate systems of very high dimensions?
pretty much any machine that is large enough.
What features should it have?
lots of fast CPUs, a fast interconnect, a large, parallel storage. in
short: a supercomputer.
My University would like to buy the best equipment on the market.
there is no "best". there is only "best for a purpose" or "best for a
budget" and even then there is the question of what is "best". there
is best value, best absolute performance for a single calculation,
best throughput, best usability, best power usage (i.e. lowest), best
to manage and operate. most of these overlap, some are opposites;
there is no single solution.
I'm no expert on computers and I would be happy to receive your advice!
this is why i don't understand why you get upset when you actually do
get advice, just not the kind of advice you expected. but if you are
the proverbial blind person asking somebody to describe colors to you,
what *can* you expect?