How to improve communication between LAMMPS users, how to better support LAMMPS?

akohlmey · November 17, 2021, 9:19pm

Dear LAMMPS users and developers,

As you may have noticed, the communication in this forum is rather one directional: many people ask questions, very few provide answers, and there are next to no “open” discussions.

We had started the LAMMPS category in the MatSci discourse forum with the expectation that a forum might encourage participation of a larger number of people better than the lammps-users mailing list. Sadly, that has not happened.

Thus I am writing this message to start a discussion in order to collect some feedback on what could be done to improve the situation. Below are some discussion points:

1. Operating and moderating the mailing list and the forum at the same time is not going to work forever, as it requires duplication of effort. What speaks for closing the mailing list, and what for making the LAMMPS categories in the forum read-only and using them only as a mailing list archive?
2. We need more participation from a larger number of LAMMPS users. Users with little experience should guide those new to LAMMPS, those with more experience should help those with less, and experts should focus on complicated topics or assist others to correct mistakes or provide additional insight. What could be done to encourage this? What are the reasons that people do not participate in discussions, even if they would know (part of) the answer?
3. LAMMPS has a large manual with lots of explanations and technical details, yet it seems that problems often arise because people don’t read enough of it, or have difficulties identifying the most relevant parts, or need more practical examples with explanations. What could be done by the LAMMPS community to improve this?
4. As a LAMMPS user, what are the problems that bother you the most? Where do you feel more effort should be spent? What would be a (scientific) software and community that could serve as an example for how the situation around LAMMPS could be improved?
5. As a developer working on software to be added to LAMMPS, writing software that uses LAMMPS as a library, or modifying LAMMPS itself, where do you see the biggest deficiencies? What would need to be done to make your development work easier?
6. As a long-time LAMMPS user or developer, which of the many changes that we have made over the last 5+ years have worked well for you? Where did LAMMPS go backwards and should revert to how things were done previously?

Please share your thoughts.

Many thanks in advance.

ahochwallner · November 18, 2021, 11:15am

Ad #2:
Having the questions grouped into beginner, intermediate and expert might help lower the entry barrier. One would no longer have to browse through posts that are directed to a different target group.

akohlmey · November 18, 2021, 1:24pm

FYI, there are already sub-categories (e.g. LAMMPS Beginners) set up. So it is possible to follow only one or multiple of them instead of the entire LAMMPS category (which includes the mailing list mirror/archive with sometimes a lot of messages).

Frade · November 18, 2021, 9:35pm

Hello dear akohlmey, I believe that LAMMPS still has a small user base, and consequently few people are really fluent in using the software. As the questions on the forum are usually quite specific, intermediate users or beginners do not have the necessary knowledge to answer.

I firmly believe that the way to improve forum engagement would be to increase the number of LAMMPS users. Over time, these users would become more and more fluent in the software and would begin to feel more confident about answering the forum’s questions.

Currently the LAMMPS user manual is extremely detailed and well done. However, I believe that many beginners do not understand some terms and methodologies that are contained in the manual. I believe that this causes users to give up on the program, which directly affects forum membership, as the number of users does not grow as it should (because LAMMPS is an amazing program and should have a larger user base).

One thing that I believe could make it easier for beginners is the creation of tutorial videos of basic simulations in the software, with the scripts developed in the videos made available. Once users have a basic notion of the software, they will be able to use the LAMMPS manual with more dexterity, and with time they will become fluent users who are more confident in answering the questions on the forum.

Extremely relevant subject! I hope I have given opinions that help developers and users. Thanks!

akohlmey · November 18, 2021, 10:07pm

Please check out https://lammpstutorials.github.io/ (mentioned on the LAMMPS homepage under tutorials).

We also have the recordings from the tutorial sessions of the last LAMMPS users workshops:

And please also check out (if you have not): GitHub - mrkllntschpp/lammps-tutorials: LAMMPS tutorials for Beginners

While there is room for improvement, it is a start.

The numbers we see speak against it. There are almost 1500 people subscribed to the mailing list, we had nearly 750 registrations for the LAMMPS workshop in Summer (up from about 150 people at the previous in-person meeting), and there are 1300 forks of the LAMMPS repository on GitHub. In all these cases, we have to assume that this is just the tip of the iceberg and that there are very likely tens of thousands of users, if not more. Another indicator is the constant rise of citations of the (original) LAMMPS paper: View article

Furthermore, while the questions may look specific, most are not. One has to separate the (unfamiliar) research topic from the (often similar) technical problems. If this were not the case, there would be many questions that I would not have been able to answer, because my MD training is rather limited and I have not done real research in many years. The trick is to answer only the parts to which you can figure out the answer yourself. It is more like a puzzle (or a competition, i.e. can you figure out the answer from the documentation, when somebody else cannot) and I always felt that answering questions from others is very good training for improving my own understanding.
Please also consider that nobody has to write answers like “What you do is wrong, and here is why, and this is what you should do”. Instead one can formulate questions like “have you tried to set XXX to YYY?”, or “what does it look like when you visualize the simulation?” or “are you sure that ZZZ is correct?”. All of these kinds of formulations can provide useful information and nobody will expect a perfect answer.

Well, this is touching a topic that is a bit of a pet peeve of mine: The LAMMPS manual/documentation is a bit like the owner’s manual of a car; it can show you how to operate it and where all the knobs and levers are, but it is not supposed to teach you how to drive. However, exactly that seems to be an expectation that a growing number of beginners have. In my observation that is to a large degree facilitated by PIs doing a much worse job in advising and particularly tutoring. I also understand that there is far more pressure on PIs these days to be productive and a school of thinking that leads to just putting more pressure on their students and hoping that they will find some way to manage. My personal experience (from many, many years ago) was very different and I desperately wish that students could now have the same kind of experience of support and tutoring on how to do MD simulations that I had. Most importantly, I was given much more time and freedom in how to do my projects and which projects to do than what is common now.

Now, it will be next to impossible to change the politics of research, especially not with a grassroots approach, but I strongly believe that a more open exchange of experiences and discoveries between beginners and people who have just mastered enough skills that they would no longer consider themselves beginners would be a far more effective way of learning than looking for more pre-produced and easier to consume tutorial material. For the most part, what it needs is the courage to make mistakes in public.

srtee · November 19, 2021, 12:26am

“How to build a thriving academic community” is not a problem that has been easily solved outside LAMMPS (otherwise academia wouldn’t be the garbage fire it is right now), so we shouldn’t be surprised that we haven’t cracked it here either.

I like the analogy between the LAMMPS manual and a car owner’s manual – to take it further, the best way to grow a community of people “driving LAMMPS well” would be to open a “driving school”. One model of that would be the Software Carpentry community (disclosure: I’m certified to instruct Python using their material), which hosts short courses designed to get people into using Python from scratch.

Putting an emphasis on building MD knowledge in the LAMMPS community (beyond frantic manual-searching) would involve moves like:

emphasising tutorial links and the MatSci forum link on the LAMMPS homepage
emphasising “learning MD” material on the LAMMPS User Guide (such as Learning MDAnalysis · MDAnalysis for MDAnalysis)
building standard short courses about using LAMMPS to do X-Y-Z – the tutorials posted here are excellent moves in that direction
building a LAMMPS social media presence and community [so I have heard – citation needed]
[stretch goal] building ultra-accessible web notebooks like Pablo Arantes’s Making it rain for OpenMM

In other words, it would require some serious reallocation of resources from improving the functionality and speed of LAMMPS as a code to building and growing LAMMPS as a community, and it would involve embracing (or at least accepting) the inevitable trade-offs that would involve.

And to be very realistic (some would say cynical), the best way to grow a healthy, thriving LAMMPS community would be to hire a “LAMMPS community organiser” and make that their paid job. Expecting a healthy community to grow and thrive without that is like expecting entropy to wind backwards – and as molecular modellers we know what the likelihood of that is!

srtee · November 22, 2021, 12:55am

The LAMMPS manual/documentation is a bit like the owner’s manual of a car; it can show you how to operate it and where all the knobs and levers are, but it is not supposed to teach you how to drive.

Another analogy would be the difference between a Microsoft Excel forum and a C++ forum. Most Excel users have a very narrow range of intended workflows, and so their questions will really be questions about how Excel does things and not what their workflow should be and whether it is appropriate to their question. Somebody who asks “what is the difference between SUM(A3:A14) and SUM($A$3:$A$14)” probably knows why they’re asking and what they’re trying to achieve.

On the other hand, you can write just about any program in C++ (it may even compile!). Given the very wide range of possible workflows and intended outcomes, people will come to C++ with highly varied backgrounds in programming, and therefore many questions on a C++ forum will really also be questions about programming. Somebody who asks “what is the difference between delete and delete []” will probably have wobbly knowledge about destructors in particular and OOP in general.

Is that a question about programming, or a question about C++? On the one hand, somebody who can’t look up the difference between delete and delete [] will definitely not be asking about how C++ in particular implements it, so under that framework it is almost certainly a question to be answered in a computer science course rather than on a C++ forum in particular, in theory. However, in practice, there will be lots of people asking that question on a C++ forum, and nobody is going to ask this on a Python forum because Python doesn’t really expose memory management to the average Python user – that is, a question about programming becomes de facto a question asked to C++ practitioners because C++ exposes those programming controls in a way that other popular languages don’t.

I think the same is true for LAMMPS and its community. LAMMPS is really an MD scripting language in a way that few other implementations are (OpenMM being the notable exception). LAMMPS will absolutely let you set your house on fire (because someone long ago wrote code for fix combust/home and compute home/combustibility for their PhD project on climate-resilient construction options) where other codes won’t even let you go near a box of matches. Therefore LAMMPS is always going to attract “why is my house burning down” questions in a way that other codes probably don’t, and in one sense these really are MD questions and not LAMMPS questions. But they are the kinds of questions that will uniquely populate [lammps-users] and/or the MatSci LAMMPS forum for as long as LAMMPS aims to be the kind of software that will let you set your house on fire (which it should).

What would really help is if the LAMMPS community consciously decides what the response to these questions should be, as a best allocation of everybody’s time and resources, and then is super crystal clear on what will happen. Right now it seems like the community norm is “well, you should have learned more about fire” and to ask questioners to go read Allen and Tildesley or other foundational texts closely. And that’s okay! (I certainly think it’s the best use of limited time and resources right now.) But we should be conscious about how that will affect the community we build, and be open to revising our engagement model as the times change.

akohlmey · November 22, 2021, 1:41am

The reason why I am having this discussion now is that “the LAMMPS community” is very ill defined and, while there is obviously a growing number of LAMMPS users, there is little of a “LAMMPS community” visible. What sometimes worries me are requests or sometimes even demands that “the LAMMPS developers” would take care of many of the open problems that were so far mentioned by multiple people participating in the discussion. This seems to assume that there is a near infinite number of people doing LAMMPS development with infinite amounts of time that would be sufficient to give all people the kind of support and answers they are looking for.

I think with applications like LAMMPS the situation is even more complicated and I would liken LAMMPS more to Excel but with people expecting answers to problems that they are not likely to ask in an Excel forum. The situation with LAMMPS is that there can be problems at a very technical and introductory level (how to compile or run LAMMPS correctly, how to use the correct syntax, how to understand different semantics etc.) but you can also build rather complex and abstract high-level applications, say equivalent to building a business plan with Excel for a multi-million dollar business. Now, people that would need help with such a problem would not be likely to ask for it in a forum, but would hire experts (if they can afford them), but with an application like LAMMPS the expectations seem to be more along the lines of:

experts already know the answers to all kinds of problems so it is not much effort for them to respond
as a user it is not necessary to have sufficient training in running a business (or rather doing some complex MD simulation workflow) since the software will take care of it, and one now expects (magic?) commands that can resolve any unexpected outcome
every problem has an easily identifiable reason and thus a simple solution, so it is mostly a matter of getting to know that solution and enabling it

This is all very different from my experience and expectations when I was learning MD. There was more of a sense of community (although it was mostly made up of those writing their own in-house MD codes, plus those around a few community codes) and there was an understanding that you needed to first have acquired some skills (to program, to use unfamiliar operating systems, to understand your craft from reading source code and text books).

So from my perspective the challenge is two-fold:

what needs to be done to facilitate that we have more of a community and that communication is carried by many more shoulders?
what needs to be done to adjust people’s expectations? This is particularly a problem with mailing lists and forums, since the most vocal participants asking for help tend to be those with limited common sense, which leads to giving people a wrong impression about how harshly questions will be treated

I think you are hitting an important issue right on the head here. My way of responding is certainly motivated by trying to be effective, i.e. to achieve the deepest learning effect with the limited effort available.

But I would also like to caution that a more “inclusive” approach to tutoring and training may not be very effective. It is good and useful for the very beginning, but beyond that I would be too concerned that people get the wrong impression, i.e. they gain an undeserved confidence in their abilities because the training examples were engineered so they are successful. However, progress in research requires making mistakes and learning from them, so more advanced training has to challenge people and facilitate people making mistakes, but mistakes that they can detect and resolve with previously acquired skills. This is why I so strongly believe that good transfer of MD simulation skills is only possible with in-person tutoring.

akohlmey · November 22, 2021, 1:49am

Let me put a little more emphasis on this point. People that respond in forums and on mailing lists LOVE well posed good questions. One of those makes up for tens of trivial or unnecessary questions. I sometimes wish there was a way to demonstrate to people how frustrating it can be to read a post of the kind “my input doesn’t work” and then (perhaps) include (part of) their input without any specific questions or explanations.

srtee · November 22, 2021, 2:07am

Absolutely agreed!

My feeling is that the lowest hanging fruit is to update the LAMMPS presentation of examples and resources. To pull a random example from MatSci – the Exciting tutorial page (Tutorials for exciting Oxygen - exciting) organizes the tutorials in a logical sequence and goes through step by step. If at some point the LAMMPS team is doing QC and reorganizing of the examples folder, it would hopefully be an efficient use of extra work to simultaneously create a “tutorial” / “walkthrough”.

Even if the walkthrough can’t possibly teach people MD (that does absolutely require person-to-person instruction), people might at least be in the position where they “know what they don’t know” (as opposed to the Dunning-Kruger ground state of “don’t know what they don’t know”). We could then template a reply to terrible questions as “Step through the walkthrough, stop the moment you don’t understand something, and ask your supervisor. If you can’t get through the walkthrough, we will be of very little extra help to you.” A lot of this work has already been done in preparing the Beginners Course from the LAMMPS symposium; we would just need to translate this into webpages.

akohlmey · November 22, 2021, 7:45pm

I think the examples are not that good a point to start with because they are by their very nature “complete” examples. If we want to do a better “onboarding”, it would have to start even earlier by actually “building” an example from scratch. If you are curious, I developed a set of such examples with some matching talk slides for an “introduction to LAMMPS” tutorial many years ago, but never found the time to have the explanations recorded into a video or the whole thing edited into Jupyter notebooks or similar for self-study. This would not only demonstrate how to write an input successfully, but also try to demonstrate typical errors (I think that is one thing that is often missing from tutorials: they tend to be streamlined to always work, so people don’t learn what can go wrong and how to deal with it). This is following the philosophy of tutorials like “Learn Python The Hard Way”. If you are interested, I can share the archives with the inputs I have and the PDFs with talk slides. They all would need updating.

rkingsbury · November 23, 2021, 7:42pm

Thanks @akohlmey for raising this important topic. Having just gotten somewhat “up to speed” with LAMMPS over the last few months, I have several thoughts on this.

LAMMPS has a large manual with lots of explanations and technical details, yet it seems that problems often arise because people don’t read enough of it, or have difficulties identifying the most relevant parts, or need more practical examples with explanations. What could be done by the LAMMPS community to improve this?

The LAMMPS manual is indeed comprehensive and detailed. Almost too much so for a beginner. It has taken me quite some time to actually understand what the manual pages are telling me. Personally I see two primary reasons for this:

I had great difficulty finding a LAMMPS tutorial oriented at a true newcomer to LAMMPS. (Note: the tutorials you shared look excellent and I was unaware of them.) As someone familiar with other computational codes but not LAMMPS, the first things I want to know are “how are the input files formatted?”, “what files are required?”, “what minimal set of commands is needed to define a simple simulation?”, “where are the outputs saved?”, etc. Without this basic understanding in place, the information in the manual is just too specific to be very helpful.
I suggest featuring the tutorials you referenced more prominently in the documentation. In addition, I actually began writing a sort of “LAMMPS 101” tutorial as I went about learning it. It’s incomplete, but I’m linking it here as an example of what I think could be helpful to a complete LAMMPS newcomer. Please feel free to use or build upon it. I would be willing to help expand on this too, if there is a desire to add a “LAMMPS 101” page to the docs.
It is possible to do almost everything (maybe everything) in LAMMPS either by placing commands in the input file OR by specifying information in an external data file. However, the manual seems oriented almost exclusively towards the first approach and gives little description of the data file approach. Many tutorials, however, use a mixture of the two methods, which makes the tutorials confusing to follow and even harder to adapt to one’s specific needs. Making the documentation more explicit about the fact that 1) you CAN set commands in either place and 2) HOW to set equivalent commands in the data file could help.
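
To make the equivalence concrete, here is a minimal sketch in Python (not part of LAMMPS) that reads the `Masses` and `Pair Coeffs` sections of a data file and prints the input-script commands that would set the same values. The helper names `read_sections` and `to_input_commands` and the sample numbers are made up for illustration, and only these two sections are handled.

```python
# Hypothetical helper, not part of LAMMPS: maps two data file sections
# onto the input script commands that set the same values.

SECTION_TO_COMMAND = {
    "Masses": "mass",
    "Pair Coeffs": "pair_coeff",
}


def read_sections(text):
    """Collect the rows of the sections listed in SECTION_TO_COMMAND."""
    sections = {}
    current = None
    # The first line of a data file is a free-form title, so it is skipped.
    for raw in text.splitlines()[1:]:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line in SECTION_TO_COMMAND:
            current = line
            sections[current] = []
        elif line[0].isalpha():
            current = None  # some other section that is not translated here
        elif current is not None:
            sections[current].append(line.split())
    return sections


def to_input_commands(text):
    """Return input script commands equivalent to the translated sections."""
    commands = []
    for name, rows in read_sections(text).items():
        command = SECTION_TO_COMMAND[name]
        for fields in rows:
            if command == "pair_coeff":
                # A "Pair Coeffs" row names one atom type; the input script
                # command spells out the type pair (I I).
                fields = [fields[0], fields[0]] + fields[1:]
            commands.append(" ".join([command] + fields))
    return commands


# Made-up sample data, for illustration only.
SAMPLE_DATA = """\
LAMMPS data file (made-up example)

2 atom types

Masses

1 15.9994
2 1.008

Pair Coeffs # lj/cut

1 0.1553 3.166
2 0.0 0.0
"""

for command in to_input_commands(SAMPLE_DATA):
    print(command)
```

Run on the sample, this prints two `mass` lines and two `pair_coeff` lines, which is the kind of side-by-side mapping that would help when a tutorial mixes both approaches.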

As a LAMMPS user, what are the problems that bother you the most? Where do you feel more effort should be spent? What would be a (scientific) software and community that could serve as an example for how the situation around LAMMPS could be improved?

From a troubleshooting standpoint, I find it frustrating that LAMMPS errors usually do not provide the line number of the input file that causes a failure. For example, this error

ERROR: Illegal pair_coeff command (src/KSPACE/pair_lj_charmm_coul_long.cpp:646)
Last command: read_data       md.data

tells me that I have a problem with my pair_coeff commands, but it does not tell me which command is triggering the error. It does tell me the last command executed in the input file, but since I have defined a lot of information in md.data here, that information is of limited use. Adding more information to the exceptions would really speed troubleshooting and might reduce the number of requests for help.
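
As a user-side workaround for this particular case, the following Python sketch lists the rows of the `Pair Coeffs` section of a data file together with their line numbers and flags rows with an unexpected number of coefficients. It is only an illustration: the function names are hypothetical, the sample file is made up, and the set of allowed coefficient counts (`{2, 4}` here) is an assumption that has to be checked against the documentation page of the pair style in use.

```python
# Hypothetical troubleshooting helper, not part of LAMMPS.


def find_pair_coeff_rows(text):
    """Return (line_number, fields) for each row of the "Pair Coeffs" section."""
    rows = []
    in_section = False
    for number, raw in enumerate(text.splitlines(), start=1):
        if number == 1:
            continue  # the first line of a data file is a free-form title
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line == "Pair Coeffs":
            in_section = True
        elif line[0].isalpha():
            in_section = False  # a different section header
        elif in_section:
            rows.append((number, line.split()))
    return rows


def suspicious_rows(text, allowed_counts):
    """Rows whose coefficient count (fields minus the type ID) is not allowed."""
    return [
        (number, fields)
        for number, fields in find_pair_coeff_rows(text)
        if len(fields) - 1 not in allowed_counts
    ]


# Made-up sample data, for illustration only.
SAMPLE_DATA = """\
made-up data file

Masses

1 12.011

Pair Coeffs # lj/charmm/coul/long

1 0.11 3.5
2 0.07
3 0.15 3.2 0.15 3.2

Bond Coeffs

1 300.0 1.5
"""

# Assumption: this pair style accepts either 2 or 4 coefficients per type.
for number, fields in suspicious_rows(SAMPLE_DATA, allowed_counts={2, 4}):
    print(f"line {number}: {' '.join(fields)}")
```

This does not replace better error messages from LAMMPS itself, but it narrows a "Last command: read_data" error down to specific lines of the data file.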

CDenniston · November 23, 2021, 8:54pm

On the discussion points raised:

Mailing lists/forum: It does not make sense to have both the mailing lists and the forum. I am ambivalent about which one is kept, but you should probably decide based on what is easiest to maintain.

Getting more participation in the forum: The question is how, when everyone is busy, to get more people to share their expertise. One problem is that browsing the forum/mailing list is done less and less as you gain more expertise, so experts are less likely to see a question they could easily answer. One option is to have a list of topics and ask people to “sign up” to get an email when a question comes up on that topic. However, getting a list of potential future topics would probably be difficult. Another option is to try and incentivize graduate students and post-docs to answer questions by having them get an email once or twice a year summarizing their participation (they could potentially put a line on their CV or something). A rating system that rewarded more useful participation might help here (similar to Stack Overflow) as that would be something tangible to point to on the CV.

Manual: The current manual is one of the real strengths of LAMMPS. Few other packages have anything remotely as useful. The only thing I think might be helpful for newer users would be a short section at the end of each command/fix page pointing to the already included LAMMPS examples that use that command/fix. I have to emphasize to my new students to look at the included examples and run/analyze them. This is the best way for new users to get up and running (once they have successfully installed/compiled LAMMPS).

I also have to say I have appreciated the improved developer documentation that has started to appear in the last few years. This has saved some time.

hothello · November 30, 2021, 4:29pm

Hi Axel,

You have raised many good points. You are doing a disproportionate amount of work on managing the code development and answering almost every question on the mailing list, forum, and countless other places. I really appreciate your monumental role in expanding the field of molecular dynamics and educating/guiding new users. Shame you don’t like the Berendsen thermostat too!
Here are my 5 cents.

I find the mailing list way too messy to be valid anymore. The problem is the sheer number of messages covering all sorts of topics and expertise levels. On the other side, the forum offers a neat way to categorise topics and engage in orderly conversations. The problem is that it is currently not the preferred way of offering/requesting support… which leads to the next point:
Many questions revolve around the same problems: missing bonds/angles/dihedrals during simulations, lost atoms, PPPM errors. The manual offers a minimal technical explanation but does not help understand the physical reason for these errors. A forum section where these problems are categorised could tackle this issue by grouping the mass of good advice already given, which is scattered across the mailing list. Of course, this service is no replacement for proper training in the field, but it could streamline the search for help.
Also, the examples cover specific cases but often hide precious clues on how LAMMPS works. It would be beneficial to prepare a minimal set of examples (e.g. an NPT simulation of a water box; NVT simulation of a slab, and so on) using a consistent style and verbose comments. I am fixated on this, but, as with good code, an input script too should be organised logically, defining first the units, atom_style, boundary, and variables. Then the force field style, parameters, sample (data or restart), and finally computes and fixes.
To me, the most appealing feature of LAMMPS is also the most troubling: the ability to control every physical aspect of a simulation. When I started using LAMMPS, this point was not clear to me: the advice I got from my group was a series of input files describing a particular system. It took me a while to get accustomed to LAMMPS’ philosophy and, from many of the questions being asked on the mailing list, I see that this is a recurring issue.
To me, #moltemplate was the key to unlocking LAMMPS’ potential. Sure, there are a lot of built-in functions to create and manipulate groups of atoms, but these capabilities are not suited for soft materials. At the last workshop, I learned how to run LAMMPS as a library inside #ovito, which is great for visualising and debugging input and data files interactively. Since both programs are open-source, it is worth pursuing a closer integration of the respective user bases.
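
The ordering suggested in the third point above (settings first, then force field and system, then computes and fixes) could be checked mechanically. The Python sketch below is one possible reading of that suggestion, not an official LAMMPS rule: the `STAGES` table and the function names are hypothetical, force field styles, coefficients and the data/restart file are lumped into one stage because, as far as I know, coefficient commands are only accepted once the simulation box exists, and line continuations with `&` are ignored for brevity.

```python
# Hypothetical style checker, not part of LAMMPS. The stage assignment is an
# assumption based on the ordering suggested in the post above.

STAGES = [
    ("settings", {"units", "atom_style", "boundary", "variable"}),
    ("force field and system", {
        "pair_style", "bond_style", "angle_style",
        "read_data", "read_restart",
        "pair_coeff", "bond_coeff", "angle_coeff",
    }),
    ("computes and fixes", {"compute", "fix"}),
]


def stage_of(command):
    """Index of the stage a command belongs to, or None if it is not tracked."""
    for index, (_, commands) in enumerate(STAGES):
        if command in commands:
            return index
    return None


def out_of_order(script):
    """Return (line_number, command, stage_name) for commands that show up
    after a later stage has already started."""
    problems = []
    highest = 0
    for number, raw in enumerate(script.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        command = line.split()[0]
        stage = stage_of(command)
        if stage is None:
            continue  # commands such as run or thermo are not tracked
        if stage < highest:
            problems.append((number, command, STAGES[stage][0]))
        else:
            highest = stage
    return problems


# Made-up input script, for illustration only; water.data is a placeholder.
SAMPLE_SCRIPT = """\
units real
atom_style full
boundary p p p
pair_style lj/cut 10.0
read_data water.data
fix integrate all nvt temp 300.0 300.0 100.0
variable T equal 300.0   # defined late, so it gets flagged
run 1000
"""

for number, command, stage in out_of_order(SAMPLE_SCRIPT):
    print(f"line {number}: '{command}' belongs to the '{stage}' stage")
```

A late `variable` is perfectly legal in LAMMPS, so this is a style hint rather than an error check; the point is only that a consistent layout is simple enough to state and verify.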

I hope this conversation gains momentum.

A side note to Axel: it is always a pleasure to read your witty responses (especially to crap questions).
Otello

Germain · December 1, 2021, 6:03pm

Hi @akohlmey,
as one of the silent majority of users, here is my shot at some of the issues you are bringing up:

I registered my former institutional mail address to the mailing-list but did not consider registering my new one. The reason is that I find more relevant answers by digging through the archived mails than by keeping up with the current question flow. On the other hand, my mailbox got full of questions, sometimes relevant but most of the time uninteresting to me. A forum might be a better way of both getting in touch and looking for previous answers. The matsci site interface and its tag system are a radical improvement in reading quality and in seeing who got into which topic. I guess that the possibility to edit answers might also help in the common back and forth discussions of “please provide more of your input file” when people just provided the error output or irrelevant elements. Finally, the fact that there are several ways to dig into the archives of the mailing list (at least a SourceForge link, a Google search box and now the forum mirror) might make it confusing “where to look” for information and to understand what is going on in a discussion. IMHO centralising the archives on the forum and redirecting to it while preparing to archive the mailing-list for good might be a good shot. This might also help in referring to related questions that were already answered, both in new topics and the archived mails, the same way some forums like the Arch Linux forum do.
A forum might help people to aggregate as a community more than a mailing-list. The thing is, in my case, I finally subscribed to it today as I am indeed looking for something specific concerning another software package (OpenKim) closely related to LAMMPS. As you often had to answer “This is not a problem related to LAMMPS”, people might now just be redirected to a nearest-neighbour forum like Ovito or OpenKim and just transfer the question. It is also a good thing that people directly involved in several software packages speak in a common place (Dr. Stukowski from Ovito or Dr. Elliott from OK are also here to help, at least in relieving the questions).
The forum can also be a platform through which you could organize events or advertise events related to molecular simulation. Maybe also turn the mailing-list into a newsletter and provide needed information, with redirection to the forum for more information (just a thought).

Concerning the other questions

You are right in pointing out the lack of supervision of some new MD users. Part of the problem is also that MD is mostly considered a tool in some research fields and its subtlety is often overlooked. I think that master students or maybe even undergrads can easily end up in research projects with molecular modelling involved. Except for particularly gifted people, acquiring the statistical physics basis, the minimal coding skills and the software understanding is nearly impossible in 6-month-long projects. So you have choices to make, and I wouldn’t be surprised if interns end up running scripts that they do not understand, with teachers that have absolutely no time to explain to them the how and why. This is where part of research is going and we have to deal with it.

I actually think that the LAMMPS devs did a fairly good job in keeping the manual the way it is. For the most part it tells you what something does with a minimal example, some related commands and that’s it. This is a bit dry compared to other software (compare it to GROMACS or BOSS, which go into long explanations of the physics and modelling involved) but if you know what you’re looking for and what you are doing it is the best way to go.

However, I think the website is terrible at directing people to the tutorials you mentioned and it should be reworked. Yet I know how big of a project that is and how limited your time is. But some things are unclear: there are several links to “features/non-features” that actually link to the manual (independent pages with key points might be clearer), one goes to the former SourceForge repository with an outdated version from 2020 (which is still regularly downloaded by people), a link to Pizza.py (is it still developed? Python2 support has been dropped; I tried to update the code to Python3 this summer but porting some of the libraries is a true pain), and references to other software/codes, some of which seem not to be developed any more (that might be one of the reasons people are asking unrelated questions about other software). Also, the presentation in a horizontal table makes it unclear for a newcomer where to start. These might be details and are, again, a lot of things to think about (and a lot of work in order to change them in a coherent way) but they are elements that, I think, are very confusing at first, especially when your workflow is not established.

As a “developer” I mostly made my own lib to read I/O from LAMMPS. I can’t say much about the code itself (I have stuff I’d like to submit at some point but it is not ready yet). But I think that overall, it is fairly easy to “do stuff” once you understand how things are organised. I did it by reading other files as advised, but this might be tedious for people who prefer to see the big picture before digging in. For this, I don’t have much of a solution. I am currently thinking about outside tools that could be combined with a text editor to help in catching some errors before runtime, but they are embryonic projects at the moment.

Thanks again for all the work you put into this project, and for the insightful advice you gave over the years. I hope to read more from you.

Germain.

akohlmey · December 1, 2021, 8:13pm

Thanks to everybody that has shared their thoughts and given some feedback. This is much appreciated. Below are some conclusions and ideas for moving forward. Again, comments to those are highly welcome.

There needs to be a delicate balance between keeping what has been traditionally useful and attractive about LAMMPS, especially for more experienced users, and how information needs to be presented to inexperienced LAMMPS users. The development of LAMMPS reaches back to a time when most people using MD also knew well how to program and how to read source code when documentation was lacking or to understand details of the implementation. On top of that, the complexity of systems and models and methods has significantly increased, which adds to the burden on new users to find their way to using it correctly and well. So we need to re-think how to make people familiar with LAMMPS. My current thinking is that the manual, because of its level of detail, is not a good place for it, but rather some dedicated “new user” document might be a better way (alternatively it could be an additional section in the manual). This could be maintained as a separate git repository, which would make it easier to have people contribute content. In contrast to the LAMMPS manual, which is only supposed to explain how to use LAMMPS (cf. “driver’s manual”), this could be more focused on also learning MD basics, sort of a “Learning MD with LAMMPS” guide.
We need to try and better organize the “historic knowledge” and make it more accessible. As has been acknowledged multiple times, the mailing list archive is disorganized and using Google searches is not always very effective. However, many errors and mistakes are recurring, and beginners who do not do their due diligence are likely to repeat errors that many others made and ask questions that have been answered many times before. My current thinking is that this could be in a format somewhere in the middle between what stackoverflow has, a regular FAQ, and a tutorial. One way to organize this would be not by features or keywords (like the manual) but rather by error messages. For each error message this could then provide a (minimal) example demonstrating how to create the error message. Unlike the examples bundled with LAMMPS, which only show the correct input, it would demonstrate the failure and then - step by step - explain the source of the error(s) and resolve the issue. Since some error messages can be triggered by multiple different problems, there may be multiple different examples associated with the same error message. Suitable examples could be “harvested” from the mailing list archives, and if this “document” were also organized as a git repository, it would be easier to contribute and particularly easy for the LAMMPS developers to integrate contributions. Given how pervasive using git has become in any form of computational research, people should understand that learning to use it - at least at a basic level - is an extremely valuable skill.
One unsolved problem is how to acquire funding to spend time on maintaining LAMMPS and especially supporting users. Personally (but that may just be my upbringing and socialization speaking), I view volunteering time to help others as a way to pay back for having access to the work of those before me. We have already discussed at length that this has changed. But my thinking is that we might try following the path some other projects have taken.
- For example, there could be a LAMMPS non-profit organization which could collect donations or contributions from individuals, research groups (as part of “service contracts” in the research grants), or companies as sponsors.
- Under that umbrella also “for a fee” consulting could be offered or “for a fee” custom training and other services.
- People that are in need of particular feature(s) but are not able to implement it themselves (for which there could be multiple possible reasons) could offer a “bounty” for those feature(s).
- People applying for grants could consider adding a LAMMPS developer to their projects to both strengthen their proposal (if it would require code customization(s)) and divert some funding to LAMMPS maintenance. This could work on a small scale, but it may also be possible for larger “research infrastructure” projects that some funding agencies occasionally solicit. Again, there could be a mutual benefit: what makes people good developers often interferes with being effective in writing grants and doing research projects. It is rather simple to join a team for a proposal, but it is a much bigger effort to set it up, not to mention that software maintenance on its own is rarely a topic for a successful proposal. Those usually have to be more “high-concept”.
We need some more “community building” activities. So far the major event in that respect is the LAMMPS workshop, but - because of the associated cost and effort - this is too infrequent and thus not very effective. This year’s “virtual” Workshop version, however, has been rather inspiring in that regard. It worked much better than we expected, the technical challenges were rather straightforward to meet, and there was a much larger participation (~750 actual participants from over 1000 registration requests, versus ~150 registrations at the last in-person event), and in the Slack channel communication there were some signs of exchange and communication beyond what usually happens on the mailing list and forum. So my thinking is that we could have these “virtual” events more regularly, but with a much reduced agenda. Say, one presentation on changing topics and with changing focus (research results, development, tutorial) and some period of discussion and exchange. The biggest problem from the side of the LAMMPS developers is the effort required to organize this and particularly to maintain the Slack channel (with the mailing list and forum, there is too much to handle already, and people asking interactively usually require much more effort to respond to and tend to be more demanding, not to mention that - unlike with the forum or e-mail - one has to respond immediately and cannot let it sit or look something up first). That is why we purged the Slack channel after the workshop. Doing the actual Zoom hosting and being available for a specific time and preparing the occasional talk, especially when summarizing recent developments or discussing ongoing plans and projects, is not so difficult.

Matt_Bone · December 7, 2021, 3:13pm

I would definitely agree with your last point. As a PhD student outside of the US I would never be able to attend an in-person LAMMPS workshop, but certainly got a lot of value out of the virtual workshop. A regular, focused seminar series would be a great addition to the LAMMPS community and I’ve personally learnt a lot from similar series on https://nanohub.org/.

Equally, I feel like I’ve learnt much of my LAMMPS knowledge from reading old Sandia presentations that have been put online afterwards. A recording to go along with these kinds of presentations would be much appreciated.

akohlmey · December 7, 2021, 9:30pm

We won’t have much of a “community” unless more people get actively involved. Unlike projects like NanoHub, there is no funding, no sponsoring, and most of the developers can only work part time on LAMMPS (and sometimes that part is very small). Organizing a series of talks doesn’t require any programming skills or a deep understanding of the software, same would go for starting a “knowledge base” kind of website where good questions/problems and their answers from the mailing list/forum are collected into a digested and easy to find format. Or perhaps running a “study group” like chat where people with limited experience first try to solve problems on their own and then can ask much more concise and well thought out questions on the mailing list or forum. Like I was stating when I started this discussion, the most exhausting and frustrating part is that currently so much only happens if it is done by one of the (core) LAMMPS developers.

If recordings existed, they would be available online. The workshop this year was the exception because the recording option comes with little additional overhead in Zoom meetings.

Let me point out that anybody who started following LAMMPS on the mailing list or the forum or on GitHub within the last couple of years has seen a very lopsided situation. LAMMPS development has benefited massively from the pandemic and the associated lock-downs and restrictions. That has freed a lot of time and capacity for improving/refactoring the code and the documentation, with the high point being the virtual workshop. Development will slow down significantly and response times on the forum and mailing list will also increase.