[lammps-users] Modification to lammps restart reads

Duncan - I just released a 16 Sept patch that
does the parallel I/O for read_restart. I generalized
your ideas to work with any number of procs, i.e. the
number of procs that wrote the file doesn't have to match the
number that reads it. This required some extra communication
at the end to get atoms where they should be for a different
decomposition.
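
For readers following the thread, here is a rough, serial sketch of the
idea described above. It is not the LAMMPS code. It assumes a round-robin
assignment of restart files to reading procs and a 1-D slab decomposition,
and every name in it (assign_files, owner_of, read_and_redistribute) is
hypothetical. The "communication" step is simulated with plain lists
rather than MPI.

```python
from collections import defaultdict


def assign_files(n_files, n_readers):
    """Assumed round-robin mapping: file i is read by rank i % n_readers."""
    files_for = defaultdict(list)
    for f in range(n_files):
        files_for[f % n_readers].append(f)
    return dict(files_for)


def owner_of(x, box_lo, box_hi, n_readers):
    """Rank that owns coordinate x in an assumed 1-D slab decomposition."""
    frac = (x - box_lo) / (box_hi - box_lo)
    # Clamp so an atom sitting exactly on the upper boundary stays in range.
    return min(int(frac * n_readers), n_readers - 1)


def read_and_redistribute(files, n_readers, box_lo, box_hi):
    """files: one list of atom x-coordinates per written restart file.

    Returns (owned, moved): the atoms each rank ends up owning, and how
    many atoms had to be sent to a different rank than the one that read
    them.
    """
    files_for = assign_files(len(files), n_readers)

    # Phase 1: every rank reads only its own subset of the files.
    held = {
        r: [x for f in files_for.get(r, []) for x in files[f]]
        for r in range(n_readers)
    }

    # Phase 2: send each atom to the rank whose subdomain contains it.
    owned = {r: [] for r in range(n_readers)}
    moved = 0
    for r, atoms in held.items():
        for x in atoms:
            dest = owner_of(x, box_lo, box_hi, n_readers)
            if dest != r:
                moved += 1
            owned[dest].append(x)
    return owned, moved


# Four files written by four procs on a box spanning [0, 8).
files = [[0.5, 1.5], [2.5, 3.5], [4.5, 5.5], [6.5, 7.5]]
same_owned, same_moved = read_and_redistribute(files, 4, 0.0, 8.0)
diff_owned, diff_moved = read_and_redistribute(files, 2, 0.0, 8.0)
```

In this toy setup, reading with the same proc count that wrote the files
moves no atoms, while reading on fewer procs forces some atoms to change
hands after the read.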

So please try it out and see if you still get a 2 minute read time
(down from 5 hours!) on your big problem.

Thanks,
Steve

Hi Steve.
Apologies for taking so long to get back to you on the testing; our
machine is a bit packed out at the moment and getting 2048 cores has
proved to be tricky.

My original testing was on a Cray XT3 and took 5 hours with the
original read_restart and about 2 minutes with my modified version.
We've since bought a new machine, not a Cray, but with Lustre still.
I've rerun the original tests and also the 18Sep10 version and get
these read times:

Lustre, 16 OSTs, 2048 PEs, 10 billion atoms (EAM potential).

04Sep10 version - Job ran out of time after 10 hours trying to read the file.
My modification - 119 seconds - 1 file per processor
18Sep10 version - 150 seconds - 1 file per processor

So my specialised modification still comes out slightly quicker, but
your more general version is pretty much there too. In fact the roughly
30 second difference could just be noise in the file system.

Cheers.
Duncan

ok - thanks for testing - the more general version
has to do some checking for possible communication
(and then do it if applicable), so that could easily be
the extra overhead.

Steve