Topic of the Month: What are your favorite pre- and postprocessing tools

akohlmey · January 26, 2022, 10:24pm

Since we had some success with starting an open discussion and keeping it pinned to the top of the category for a while, I am now starting an experiment: I will try to regularly post a question about a topic that has no simple “do this, not that” answer, but where there are different recommendations for different kinds of simulations and where people have different preferences. So please add your opinions and suggestions. Thanks, Axel.

I recently noticed that the list of tools on the https://www.lammps.org/prepost.html page has not been updated in a while, and with a simple Google search I found a few tools that should be mentioned, so here is my question for this time:

What are your favorite pre- and postprocessing tools to prepare LAMMPS input data and to analyze results?

So please let us know what tools you use and when and what they are good at and where they are tricky to use. This can be tools already listed on the LAMMPS website or tools that could (or should?) be added.
If you write your own tools, then please describe what programming language you use and why, and which modules, libraries, or packages you use to avoid having to write everything from scratch. If you have a small, standalone tool to do just one or a few particular kinds of analysis, or some representative examples for using some particular toolkit, you could also quote or attach them here, so that others can use them and comment. If feedback is good, we can also add the best examples to the LAMMPS distribution or add tools to the website. So this may be an opportunity to cash in on the 15 minutes of fame that everybody is entitled to (according to Andy Warhol, IIRC).

giacomo.fiorin · January 27, 2022, 3:45pm

I’m going to first suck up a bit, then sneak in some promotion.

I frequently use VMD, which hosts various plugins for pre- and post-processing including (and here’s the sucking up) TopoTools. Specifically, I personally found TopoTools useful for preparing or editing polymer or lipid topologies for simulations done with LAMMPS (here, here and here).

More recently (promotion of a tool), we are using the new Colvars Dashboard tool (pre-print here) to prepare and analyze Colvars configurations/inputs using a GUI.

Giacomo

srtee · January 28, 2022, 6:08pm

To prepare systems, I find that the in-built LAMMPS tools are often sufficient for my purposes – using some combination of replicate, create_atoms, change_box and appending data files will often get me where I need to go. If I needed to do something really complicated I would reach for Moltemplate + Packmol.

Like Giacomo, I use VMD and Topotools for visualisations. For post-processing, I use MDAnalysis – it’s a fantastic Python package for this purpose. You can:

- Read multiple dump files (like from a glob) into a single trajectory
- Iterate over trajectory frames using list comprehension to do trajectory analysis
- Select groups of atoms as a list and sub-select atoms using slices
- Read out (per-frame) coordinates as a NumPy array to calculate any observable you like

It’s also got the usual analyses inbuilt (like RMSD/F and density profiles), and IIRC you can even do visualisation if you use it in a Jupyter notebook.
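The workflow described above (read frames, select atoms, compute a per-frame observable) can be sketched without any third-party package. This is not the MDAnalysis API: it is a plain-Python stand-in that assumes a text-format dump with an orthogonal or triclinic box (three box lines) and `type` and `z` columns. The names `read_dump_frames`, `mean_coordinate` and `SAMPLE_DUMP` are hypothetical.

```python
import io


def read_dump_frames(stream):
    """Yield (timestep, atoms) for each frame of a text-format LAMMPS dump.

    Each atom is a dict mapping the column names from the "ITEM: ATOMS" line
    to float values. Assumes the standard per-frame header layout.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.startswith("ITEM: TIMESTEP"):
            continue
        timestep = int(stream.readline())
        stream.readline()  # "ITEM: NUMBER OF ATOMS"
        natoms = int(stream.readline())
        stream.readline()  # "ITEM: BOX BOUNDS ..."
        for _ in range(3):
            stream.readline()  # three box lines, skipped in this sketch
        columns = stream.readline().split()[2:]  # names after "ITEM: ATOMS"
        atoms = []
        for _ in range(natoms):
            values = stream.readline().split()
            atoms.append({name: float(v) for name, v in zip(columns, values)})
        yield timestep, atoms


def mean_coordinate(atoms, atom_type, axis):
    """Average one coordinate over all atoms of the given type."""
    selected = [a[axis] for a in atoms if int(a["type"]) == atom_type]
    return sum(selected) / len(selected)


# Made-up two-frame example, only to show the expected layout.
SAMPLE_DUMP = """\
ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
3
ITEM: BOX BOUNDS pp pp pp
0 10
0 10
0 10
ITEM: ATOMS id type x y z
1 1 0.0 0.0 0.0
2 2 1.0 1.0 2.0
3 2 1.0 1.0 4.0
ITEM: TIMESTEP
100
ITEM: NUMBER OF ATOMS
3
ITEM: BOX BOUNDS pp pp pp
0 10
0 10
0 10
ITEM: ATOMS id type x y z
1 1 0.0 0.0 0.5
2 2 1.0 1.0 3.0
3 2 1.0 1.0 5.0
"""

for step, atoms in read_dump_frames(io.StringIO(SAMPLE_DUMP)):
    print(step, mean_coordinate(atoms, atom_type=2, axis="z"))
```

For real trajectories, a package like MDAnalysis takes care of the parsing and hands back NumPy arrays, so a hand-written reader like this is mainly useful for quick checks.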

Germain · January 31, 2022, 2:39pm

My 2 cents here. I would split that in 3 categories: preprocessing, visualizing and postprocessing.

Preprocessing:
- For atomic configurations, LAMMPS has a lot of perfectly fine features. I recently prepared a small cubic FCC crystal on top of a diamond surface like a breeze. It takes a bit of reading through the manual, but it clearly is worth it. I now build most of my crystalline configurations in pure LAMMPS scripts.
- For small-molecule configurations: I tend to also rely on LAMMPS features for homogeneous configurations. For example, you can use molecule files from LAMMPS, place them on a large crystalline lattice and heat up the system, or insert them randomly and minimize carefully. A typical use case that comes to mind is when I used the LigParGen server from Prof. Jorgensen’s team (careful though, I would double-check the force field parameter conversion to LAMMPS format). I could still use it to derive molecular templates.
  For more complicated systems, a combination of Moltemplate and Packmol usually does the trick.
- A particular case is polymeric materials: up to now, the only method I know is to grow chains through a random walk, using homemade code or a simulation, and then minimize carefully, either with nve/limit or using soft potentials, to get relatively good starting points. As of today, no tool I have used could easily give me what I expected for these kinds of systems.
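The random-walk chain growth mentioned in the last item can be sketched in a few lines. This is a generic freely jointed chain, not the poster's homemade code: bond length is fixed, bond directions are uniformly random, and nothing prevents beads from overlapping, which is exactly why the careful minimization step is needed afterwards. The function names and the cubic box of side `box_length` are assumptions made for the illustration.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly distributed on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def grow_chain(n_beads, bond_length, box_length, rng):
    """Return unwrapped bead positions of one freely jointed chain.

    The first bead is placed at random inside the box; each following bead
    sits one bond length away from the previous one in a random direction.
    Overlaps are NOT checked.
    """
    positions = [tuple(rng.uniform(0.0, box_length) for _ in range(3))]
    for _ in range(n_beads - 1):
        direction = random_unit_vector(rng)
        previous = positions[-1]
        positions.append(
            tuple(p + bond_length * d for p, d in zip(previous, direction))
        )
    return positions


def wrap(position, box_length):
    """Fold a position back into a periodic cubic box starting at the origin."""
    return tuple(c % box_length for c in position)


rng = random.Random(12345)  # fixed seed so the sketch is reproducible
chain = grow_chain(n_beads=20, bond_length=0.97, box_length=10.0, rng=rng)
wrapped = [wrap(p, 10.0) for p in chain]
print(len(chain), "beads generated")
```

Keeping the unwrapped positions around is convenient when writing a data file, since image flags can be derived from the difference between the unwrapped and wrapped coordinates.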
Visualizing:
- Well, VMD is a must-have here, since TopoTools makes it very convenient to load data files and visualize trajectories (Thanks @akohlmey). The command line interface is a very convenient alternative to the point-and-click interface (which I hate), but the big downside is that it is Tcl/Tk. Investing in learning it becomes more and more tedious, and I tend to script exclusively in Python. Also, I really didn’t like some changes in the way the .vmdrc file is now handled (since I am too lazy to translate it into the new automatic format). The POV-Ray interface is still amazing and makes it possible to produce great images/films.
- From my previous position I learned about Ovito, which is actually a great tool. However, the code is very dense (so I am not really able to modify it or script it) and there are full-time developers dedicated to it who need to eat. So I understand that they sell a license to get all the options of the software, and I may only have seen the tip of the iceberg (big shout-out to Dr. Stukowski and Dr. Kalcher, the tool is great, just not my cup of tea). I am also not a fan of point-and-click interfaces, as I feel I am constrained by what the code is able to do. But for most users, I would call it a must-try.
- Last but not least, I stumbled upon this slideshow (again, you’ll see who to thank for it) and tried producing images directly from LAMMPS. The lib is great, works when rerunning trajectories, and allows coloring atoms with on-the-fly properties like temperature or velocity. An example I did was from a 2d Rayleigh rolls simulation (sorry for the resolution, YouTube did a terrible job here, the video is actually way better). This example still needed minimal use of ffmpeg to make a single video out of 3, but the 3 videos were produced during the simulation, at the same time.
Postprocessing:
Depending on what I want, I tend to come up with different ideas every time and end up either coding my own script, if it is easy enough, or using a combination of LAMMPS and other software, if available. It is a very good thing IMO to learn what can already be computed accurately by rerunning trajectories with LAMMPS (like RDF) and what can be computed with better statistics in postprocessing tools (like MSD). I would then use Python. The thing is, if heavy maths are involved, I’d rather make my own script to know what is going on (or rely on libs I can relatively trust, like numpy); if it is simple, I am faster making a quick and dirty thing. Recent simple examples I kept on GitHub are these scripts, which I can detail a bit more here:
- plot_forces.py, as the name implies, is intended to plot forces and energy (with comparison to kBT in real units) from the tabulated output format. I used to work with tabulated forces in a previous project and this was quite handy to compare them to one another. It can plot all forces appended to a single file and compute the energy through finite differences.
- lmp_ave_post.py computes the average AND the standard error (which LAMMPS cannot do for now) from ave/* output files (maybe not that generic, but it worked for ave/time and ave/chunk as far as I remember). Quite nice to have an error bar on those pressure profiles.
- Finally velplot.py is the one I used to plot Rayleigh_vel.pdf which is the velocity profile of the example just above with color coding according to the temperature (there is a circle variable in the script, don’t bother, it was here to check the ratio of the figure). It is an easy way to read a 2d chunk output.
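To give an idea of what the average-plus-standard-error step involves, here is a minimal sketch. It is not the lmp_ave_post.py script itself: it assumes a scalar-mode fix ave/time file whose last `#` comment line names the columns, and it uses the naive standard error s/sqrt(n), which is only appropriate if successive samples are uncorrelated (block averaging would be the safer choice otherwise). The names `read_ave_time`, `mean_and_stderr` and the sample data are made up.

```python
import io
import math
import statistics


def read_ave_time(stream):
    """Read a scalar-mode fix ave/time file into {column name: [values]}.

    Assumes the last '#' comment line before the data lists the column names.
    """
    header = []
    rows = []
    for line in stream:
        if line.startswith("#"):
            header = line[1:].split()  # keep the most recent comment line
        elif line.strip():
            rows.append([float(x) for x in line.split()])
    return {name: [row[i] for row in rows] for i, name in enumerate(header)}


def mean_and_stderr(values):
    """Return (mean, standard error), assuming uncorrelated samples."""
    mean = statistics.mean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr


# Made-up data in the assumed layout.
SAMPLE_AVE_TIME = """\
# Time-averaged data for fix myavg
# TimeStep v_press
100 1.5
200 2.5
300 2.0
400 2.0
"""

data = read_ave_time(io.StringIO(SAMPLE_AVE_TIME))
mean, err = mean_and_stderr(data["v_press"])
print(f"v_press = {mean:.3f} +/- {err:.3f}")
```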
The point of these scripts here is to show that you can already do things with a few lines of Python (or any scripting language you like). I am actually working on a small lib that would let me directly call some simple objects, like a data object that extracts box and atoms given just the format, or a dump class with an iterator to loop over configurations, but that is (another) work in progress. The idea is to stick to the KISS philosophy and try not to be redundant with the possibilities of the IPyLammps wrappers.
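As one more illustration of the "few lines of Python" point, and of the earlier remark that MSD benefits from better statistics in postprocessing, here is a minimal sketch of an MSD that averages over all available time origins. It assumes unwrapped coordinates, equally spaced frames, and a trajectory already loaded as nested lists; the function name and the toy trajectory are hypothetical.

```python
def mean_squared_displacement(trajectory, max_lag):
    """MSD for lags 0..max_lag, averaged over atoms and time origins.

    trajectory[t][i] is the unwrapped (x, y, z) position of atom i at frame t.
    max_lag must be smaller than the number of frames.
    """
    n_frames = len(trajectory)
    msd = []
    for lag in range(max_lag + 1):
        total = 0.0
        count = 0
        # Every frame t0 with a partner frame t0 + lag serves as an origin.
        for t0 in range(n_frames - lag):
            for r0, r1 in zip(trajectory[t0], trajectory[t0 + lag]):
                total += sum((b - a) ** 2 for a, b in zip(r0, r1))
                count += 1
        msd.append(total / count)
    return msd


# Toy trajectory: two atoms moving along x with constant speeds 1 and 2.
toy = [[(1.0 * t, 0.0, 0.0), (2.0 * t, 0.0, 0.0)] for t in range(5)]
print(mean_squared_displacement(toy, max_lag=3))
```

The double loop scales with the number of frames times the number of lags, so for long trajectories a NumPy or FFT-based version would be the practical choice.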

Wow, I must admit that this comment was longer than expected. If you did read this far, thank you and I hope you got something out of it.

akohlmey · January 31, 2022, 3:31pm

Two comments on your post-processing approach:

- There is Pizza.py, with functions to process LAMMPS output and files. It may need some updates for Python 3 compatibility here and there.
- There is a how-to in the manual showing how to print thermodynamic data in a structured format like YAML or JSON, which makes importing even easier.
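To illustrate why structured thermo output is easier to import, here is a minimal sketch using only the standard library. The block layout in `SAMPLE_LOG` (a `keywords:` list followed by `- [...]` data rows between `---` and `...` markers) is an assumption about what such output looks like and should be checked against an actual log file; a real YAML parser would be the more robust choice. The function name `read_yaml_thermo` is hypothetical.

```python
import ast
import io


def read_yaml_thermo(stream):
    """Collect YAML-like thermo blocks from a log into a list of runs.

    Each run is a dict {keyword: [values]}. Lines outside the assumed
    block layout are ignored.
    """
    runs = []
    keywords = None
    for line in stream:
        stripped = line.strip()
        if stripped.startswith("keywords:"):
            # The bracketed list is also a valid Python literal.
            keywords = ast.literal_eval(stripped[len("keywords:"):].strip())
            runs.append({name: [] for name in keywords})
        elif stripped.startswith("- [") and keywords is not None:
            values = ast.literal_eval(stripped[2:])
            for name, value in zip(keywords, values):
                runs[-1][name].append(value)
        elif stripped == "...":
            keywords = None  # end of the current block
    return runs


# Made-up log excerpt in the ASSUMED layout.
SAMPLE_LOG = """\
Some unrelated log text
---
keywords: ['Step', 'Temp', 'PotEng', ]
data:
  - [0, 1.44, -6.77, ]
  - [100, 0.75, -5.76, ]
...
More unrelated log text
"""

runs = read_yaml_thermo(io.StringIO(SAMPLE_LOG))
print(runs[0]["Step"], runs[0]["Temp"])
```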

Germain · January 31, 2022, 7:41pm

Hi @akohlmey just answering your comment here:

I took a shot at modernizing Pizza.py last summer, since I had some time and wanted to give it a look, so I thought it would be a good exercise. I think it would be a neat Python package with some updates to Python 3, but the updates in question were more substantial than I initially thought (including moving the graphical library to Pillow and maybe changing a lot of the way things are done). I think @rberger was the last person who sent commits to the git repo, more than one year ago, and some pull requests have been pending review for even longer, so I have no idea about the status of the project. That is one of the reasons I started something from scratch on my own. If anyone is interested, I can put some order in what I did and start a pull request on GitHub so this discussion can continue there.
I did not know about this section, thanks for pointing it out!

kvn.chu · February 2, 2022, 7:01pm

Preprocessing:

A mix of Atomsk and the built-in LAMMPS commands is sufficient for most of my needs. I mainly study dislocations in metals though, which are relatively simple configurations.

Visualization/Post-processing:

Primarily OVITO and its Python interface, due to its dislocation extraction algorithm. What @Germain mentioned about the paywall is true of the GUI, but you can access all the “Pro” features through the Python package.

akohlmey · February 3, 2022, 3:17am

That is a question for @sjplimp
It is a bit in limbo and some components of it have been outdated or non-functional for a very long time.

However, there is a more portable subset, in particular the parts usable for processing LAMMPS files, in the tools/python/pizza folder.

One of the ideas that we were bouncing around at Temple when talking about Pizza.py was to import that commonly used subset as a submodule into the LAMMPS python module, and to turn the most reusable pieces from the “take whatever piece you need and adjust it for your purposes” library that it seems to have been conceived as into a “first class” member of the LAMMPS software environment (with proper unit tests to boot). Having some more high-level analysis and post-processing available might be useful as well.