Solvation Free Energy of methane in TIP4P/2005f

Hello everyone.
I hope you are all doing great.
Recently, I was doing a test check on FEP in LAMMPS. In doing so, I started from solvation of methane in TIP4P-ew, an available example on Github from Agilio Padua (compute_fep/examples/USER/fep/CH4hyd/fdti01/fdti01.lmp at 7adab86f0ec19539d6a526666e1452b9bceb3c63 · agiliopadua/compute_fep · GitHub)
The first thing that got my attention was how to handle the dummy atom in TIP4P when there is a need to use soft-core potential. In other words, if I want to have efficiency when it comes to simulating 4-point water models, it is better to use the implicit approach, so basically I use long-range corrections with the style of tip4p (lj/cut/tip4p/long) with a Kspace as pppm/tip4p.
Thus, when I use this pair style and combine it with a different pair style for the solute, let’s say lj/cut/coul/long, I get the WARNING as it says the computation of coulomb interactions could be inconsistent. I read several topics here and studied the manual as well. Axel a couple of times mentioned if tip4p style is used, it needs to be used for all, otherwise in the case of soft/non-soft mix pair styles, the explicit approach should be taken.
I did two simulations for methane (OPLS-AA) in water (TIP4P-ew) using LAMMPS and GROMACS, with identical system geometry and particle numbers, tho the time step was bigger in GROMACS (2 fs) than LAMMPS (1 fs).
when I do finite difference thermodynamic integration on the original output from github I get this value 2.377 kcal/mol. keep in mind, they are combining tip4 style with a non tip4p style as below:
pair_style hybrid &
lj/cut/coul/long 10.0 10.0 &
lj/cut/tip4p/long/soft 3 4 2 2 0.125 1 0.5 10.0 10.0 10.0

Like I said, this brings on the warning.
So, I did tip4p style for all in this job script, I did the same simulation with the same time spent as the original example from github, and I got around 2.67 kcal/mol. To have a solid result, I increased the equlibration time and FEP time so that I equilibrated it for 10 ns, got FEP averages for 10 ns. By doing so I made sure, there is no poor sampling, and I got this value 2.303 kcal/mol.
Also, in the case of the GROMACS, I got the following results from TI, BAR, and MBAR respectively as follows
2.308 kcal/mol, 2.305 kcal/mol, 2.309 kcal/mol
I did the simulation in GROMACS for 10 ns as equilibration, and 10 ns in each window state to get those derivatives.
Now, it seems reasonable results to me. But I am not sure if I did correct in LAMMPS when I was doing FEP, I mean tip4p style. Here are my LAMMPS input commands. Could you please give me a hint if that is the way or not because I did not understand quite well when Axel said it is gotta be all tip4p style.

data.lmp (139.1 KB)
fdti01.lmp (2.8 KB)
in.sim (1.9 KB)

Other than this, if even this is the way, I do want to do it using the explicit approach for TIP4P as well. I did search a lot around here, and did not figure out how one person can model TIP4P/2005f using the explicit approach. I found an example here (WaterModelComparison/NPT simulations at main · VisualizationAndModelProduction/WaterModelComparison · GitHub) where they were doing TIP4P/2005f as a rigid model in the NPT / NVT ensembles. What is the point in having TIP4P/2005f if I am going to treat it as a rigid model? Could you please direct me to the right resource?

Same as Comparing LAMMPS and GROMACS, this is a discussion primarily about the science behind your research and how to validate your input and not about LAMMPS itself. I will thus re-categorize this post to Science Talk.

Overall, you need to be discussing this with people that have a vested interest in your research. A bunch or random people on the internet do not qualify and will not want to invest the necessary effort.

Thank you so much Axel,
I am just studying, thought someone might be able to give me a hint.
the manual is the key I believe.
Thanks again.

No, the manual doesn’t teach you how to do good simulations. It merely explains the features included in LAMMPS or how to add new features.

This is more like with driving a car. The manual of a car shows you where all the knobs, buttons, and levers are and how they work. It does not teach you how to drive. That you usually have to learn from a practitioner, e.g. a driving instructor or somebody with a lot of experience in driving.

You are absolutely right. That is why I want to learn better and better.
Thank you again.

This alone is a large difference between simulation protocols – papers are starting to report that a 2fs timestep (or even 1fs) can leave a meaningful (but minor) mis-integration water’s fast rotational dynamics, visible in equipartitioning violations.

More importantly, why are you using TIP4P? If a 4-point water model causes significant difficulties in the package you want to use, three-point water models are always available. Logically, you are in one of three possible situations:

  1. TIP4P/methane thermodynamics line up with experiments as confirmed by prior literature better than any other water model. In which case, that is a good reason to use TIP4P.
  2. You think TIP4P/methane thermodynamics might line up with experiments better than any other water model and this is your research task. In which case, you can either just use GROMACS, or you will have to really familiarise yourself properly with LAMMPS including possibly modifying source code (see below)
  3. You don’t know if TIP4P/methane thermodynamics might line up with experiments better than any other water model, and specifically studying 4-point water isn’t your research task. In which case – why not use SPC/E? Or OPC3? Or even TIP3P? 4-point water is a different water model and is not guaranteed to be a better water model across all observables. This is not like physics where (for example) quantum mechanics is always more accurate than Newtonian physics and Schrodinger’s equation is always more fundamental than a density functional theory. Almost all water models are primarily benchmarked against either modelling bulk water phases, or encouraging proper hydration of peptide residues to model protein behaviour correctly – every other use of them is provisional and there is no guarantee that a more intricate model is “better” in any given scenario.

I feel your pain, honestly. In 2018 when I first started trying to do things in LAMMPS, based on my GROMACS experience, it was the most natural thing in the world for me to start with

pair_style hybrid lj/cut/coul/long ... lj/cut/tip4p/long ...

and then get completely lost when my simulations wouldn’t run. So, I’ve been there. What helped me a lot was that my subsequent work involved diving deep into LAMMPS’s source code so I could see for myself what the *tip4p/long styles actually are – they really are just wrappers for *coul/long, with the only difference being that every interaction with an O’s negative charge gets calculated with the virtual site’s position offset. That’s it. That’s all the tip4p pair styles do. Which is why you can’t add a coul style on to it via pair hybrid – the tip4p style is supposed to do everything a coul style does.

Hi Shern,
Thank you so much for your suggestions and hints. As you said, I am gonna go through source code to get more familiar with what LAMMPS performs.