Coarse-grained simulation of RNA using LAMMPS

friedesc · February 16, 2023, 7:23pm

Hello everyone,

I want to analyse a coarse-grained system of 2 RNA strands using LAMMPS.
I was working with Gromacs and Martini coarse graining before, but since I did not get sufficient results I am looking for a program that allows for long (microsecond and longer) simulation times of coarse grained RNA systems. The paper of Coarse-grained (cg) simulation of DNA using LAMMPS by Heinrich et al. (https://link.springer.com/article/10.1140/epje/i2018-11669-8) encouraged me to use Lammps. Lammps has the advantage that oxRNA coarse graining is implemented.

My goal is to understand conformational changes upon binding of this two RNA strands. They have a length of 350 nt and 150 nt and are in a simulation box. Since the system is kind of big and I want to see formation of bonds between this two strands, coarse graining seems to be the best method to get long simulations. I used Martini and oxRNA since they allow for RNA cg. For simulations using Martini caorse graining I added Water as solvent and NA,CL ions. OxRNA uses an implicit solvent.

Now, I mainly have too questions:

1. How can I improve my Lammps script and input file so that the two RNA strands interact?

I got oxRNA cg files using tacoxDNA (http://tacoxdna.sissa.it) by converting from .pdb to .gro and .gro to .lammps.
I was using the oxRNA2 tutorial files as an input script for my simulation. You can find the file attached. I run the simulation for 1000000 steps with a tilmestep of 1e-2 fs.
However, when I run the simulation using a higher number of time steps, I get the error that bonds between two atoms are missing.
Also, when I look at the results from my simulation, the RNA strands are nicely moving, but they don’t interact. Do you have an idea, what I could change? I decreased their distance, but then the simulation does not run through.
Another thing that I observe is, that the structure is just collapsing. I suppose that this is due to missing solvent molecules. Is they a way to introduce a force that acts like a solvent and supports the system to reach an energy minimum of the whole system and not just the RNA structure?

2. How can I get a lamps data file from a .gro file with the atom coordinates: Atom-ID, type, position, molecule-ID, ellipsoid flag, density, nx, ny, nz?

Or, how can I use different formats. I used VMD to generate my file for the Martini simulations, but I Geta different output (like atom ID, atom number, atom type, something, position). I used:

topo retypebonds
topo guessangles
topo guessdihedrals
topo writelammpsdata data.data full ← do I need to change full to something else?

Thank you so much for your feedback. I am looking forward to improve my lamps skills. Also, if you have any suggestions on Lammps and RNA, I am happy to hear them

Have a nice day!

---------------------------------------- input ----------------------------------------------------------
variable number equal 2
variable ofreq equal 1000
variable efreq equal 1000
variable T equal 0.1
variable rhos equal 0.5

units lj

dimension 3

newton on

boundary p p p

atom_style hybrid bond ellipsoid oxdna
atom_modify sort 0 1.0

Pair interactions require lists of neighbours to be calculated

neighbor 2.0 bin
neigh_modify every 1 delay 0 check yes

read_data RNA_AgrB_complex_close_pdb_oxdna.lammps

set atom * mass 3.1575

group all type 1 4

oxRNA2 bond interactions - FENE backbone

bond_style oxrna2/fene
bond_coeff * 2.0 0.25 0.761070781051
special_bonds lj 0 1 1

oxRNA2 pair interactions

pair_style hybrid/overlay oxrna2/excv oxrna2/stk oxrna2/hbond oxrna2/xstk oxrna2/coaxstk oxrna2/dh
pair_coeff * * oxrna2/excv 2.0 0.7 0.675 2.0 0.515 0.5 2.0 0.33 0.32
pair_coeff * * oxrna2/stk seqdep {T} 1.40206 2.77 6.0 0.43 0.93 0.35 0.78 0.9 0 0.95 0.9 0 0.95 1.3 0 0.8 1.3 0 0.8 2.0 0.65 2.0 0.65 pair_coeff * * oxrna2/hbond seqdep 0.0 8.0 0.4 0.75 0.34 0.7 1.5 0 0.7 1.5 0 0.7 1.5 0 0.7 0.46 3.141592653589793 0.7 4.0 1.5707963267948966 0.45 4.0 1.5707963267948966 0.45 pair_coeff 1 4 oxrna2/hbond seqdep 0.870439 8.0 0.4 0.75 0.34 0.7 1.5 0 0.7 1.5 0 0.7 1.5 0 0.7 0.46 3.141592653589793 0.7 4.0 1.5707963267948966 0.45 4.0 1.5707963267948966 0.45 pair_coeff 2 3 oxrna2/hbond seqdep 0.870439 8.0 0.4 0.75 0.34 0.7 1.5 0 0.7 1.5 0 0.7 1.5 0 0.7 0.46 3.141592653589793 0.7 4.0 1.5707963267948966 0.45 4.0 1.5707963267948966 0.45 pair_coeff 3 4 oxrna2/hbond seqdep 0.870439 8.0 0.4 0.75 0.34 0.7 1.5 0 0.7 1.5 0 0.7 1.5 0 0.7 0.46 3.141592653589793 0.7 4.0 1.5707963267948966 0.45 4.0 1.5707963267948966 0.45 pair_coeff * * oxrna2/xstk 59.9626 0.5 0.6 0.42 0.58 2.25 0.505 0.58 1.7 1.266 0.68 1.7 1.266 0.68 1.7 0.309 0.68 1.7 0.309 0.68 pair_coeff * * oxrna2/coaxstk 80 0.5 0.6 0.42 0.58 2.0 2.592 0.65 1.3 0.151 0.8 0.9 0.685 0.95 0.9 0.685 0.95 2.0 -0.65 2.0 -0.65 pair_coeff * * oxrna2/dh {T} ${rhos} 1.02455

NVE ensemble

fix 1 all nve/asphere
#fix 2 all langevin {T} {T} 2.5 457145 angmom 10

timestep 1e-2

#comm_style tiled
fix 3 all balance 1000 1.03 shift xyz 10 1.03
comm_modify cutoff 3.8

compute quat all property/atom quatw quati quatj quatk

compute erot all erotate/asphere
compute ekin all ke
compute epot all pe
variable erot equal c_erot
variable ekin equal c_ekin
variable epot equal c_epot
variable etot equal c_erot+c_ekin+c_epot
fix 5 all print {efreq} "(step) ekin = {ekin} | erot = {erot} | epot = {epot} | etot = {etot}" screen yes

dump out all custom {ofreq} out.{number}.lammpstrj id mol type x y z ix iy iz vx vy vz c_quat[1] c_quat[2] c_quat[3] c_quat[4] angmomx angmomy angmomz
dump_modify out sort id
dump_modify out format line “%d %d %d %22.15le %22.15le %22.15le %d %d %d %22.15le %22.15le %22.15le %22.15le %22.15le %22.15le %22.15le %22.15le %22.15le %22.15le”

run 1000000

write_data last_config.{number}.* nocoeff #write_restart last_config.{number}.*

akohlmey · February 16, 2023, 8:48pm

What you are asking about is very specific to the model and the corresponding science.
I suggest you contact the author of the CG-DNA package directly. It is not very likely that people here can give you such specific advice.

As the developer of TopoTools, I can also tell you that there are currently no provisions to generate data files compatible with the requirements of the CG-DNA package.

friedesc · February 16, 2023, 9:05pm

Thank you so much for your answer. Yes, it is really specific. I will take this into account!

srtee · February 17, 2023, 7:35am

Ah, fond memories – I did my PhD using oxDNA.

The ox[DR]NA models are primarily polymer models with interactions that are a bit more sensitive to blowing up with bad overlaps (especially FENE). So you have to be very careful with potential contacts.

Generating initial structures was indeed a pain. I found that using the oxDNA original software’s own tools was the easiest way to proceed. You should use those tools to generate the topology, and then you can substitute your own coordinates in.

I have not personally worked with oxRNA but if it is at all realistic then you will also have to face the “RNA folding problem”, which is that RNA (unlike DNA and like proteins) has a diverse range of possible secondary structures (for example, that’s how RNAzymes work). So unless you want to throw some fancy metadynamics at it, you run the risk of running an entire simulation without chancing upon your conformation of interest.

All the best!