I tried to upgrade to the newest version of LAMMPS (actually I am going to try LAMMPS-ICMS) but kept getting seg faults–I finally realized that pair_style soft has changed. I am confused about the syntax. I just want to have a soft potential that increases from 0 to a fairly large number in order to push the atoms in my initial configuration off of each other before I start my simulation.
Previously, I had:
pair_style soft 1.12
pair_coeff * * 0.0 60.0
The new syntax of pair_style soft will take the second number, 60.0, to be the cutoff distance, which is huge and is presumably what causes the seg fault. So I looked over the new doc page for pair_style soft and copied the lines from there to get:
pair_style soft 1.12
variable prefactor equal 60.0*elapsed/5000
fix 1 all adapt 1 pair soft a * * prefactor
pair_coeff * * $a
However, I get errors like this:
MPI_ABORT invoked on rank 8 in communicator MPI_COMM_WORLD with errorcode 1
ERROR on proc 10: Substitution for illegal variable
I imagine the problem is something simple, but I’ve looked over the doc pages of fix adapt, variable, and pair_style soft, and tried a few variations of the above, and none of them worked. That’s a lot of new things to read just to use the same old command again (although I’m sure it will be very nice for the people who needed more flexibility)!
For now I put in a few separate run commands, redefining the pair_coeff to be slightly larger before each one, which works just fine. But it would be nice to know how it’s supposed to be done.
Lisa
> pair_style soft 1.12
> variable prefactor equal 60.0*elapsed/5000
> fix 1 all adapt 1 pair soft a * * prefactor
> pair_coeff * * $a
> However, I get errors like this:
> MPI_ABORT invoked on rank 8 in communicator MPI_COMM_WORLD with errorcode 1
> ERROR on proc 10: Substitution for illegal variable
please try:
pair_coeff * * 0.0
> I imagine the problem is something simple, but I’ve looked over the
> doc pages of fix adapt, variable, and pair_style soft, and tried a few
> variations of the above, and none of them worked. That’s a lot of new
> things to read just to use the same old command again (although I’m
> sure it will be very nice for the people who needed more flexibility)!
i agree. it is quite confusing.
cheers,
axel.
I must still be doing something wrong. Trying ‘pair_coeff * * 0.0’ changed the error to a bunch of these:
[s9:37656] *** Process received signal ***
[s9:37656] Signal: Bus error (10)
[s9:37656] Signal code: (2)
[s9:37656] Failing at address: 0x0
Followed by a bunch of these:
[ 1] [0xbfffe6f8, 0x00000000] (-P-)
[ 2] (_ZN9LAMMPS_NS8FixAdapt9pre_forceEi + 0x1cb) [0xbfffe768, 0x00084225]
[ 3] (_ZN9LAMMPS_NS8FixAdapt4initEv + 0x2c5) [0xbfffe7d8, 0x00084051]
[ 4] (_ZN9LAMMPS_NS6Modify4initEv + 0x295) [0xbfffe868, 0x0010aa89]
[ 5] (_ZN9LAMMPS_NS6LAMMPS4initEv + 0x41) [0xbfffe888, 0x000f8f55]
[ 6] (_ZN9LAMMPS_NS3Run7commandEiPPc + 0x7d8) [0xbfffe978, 0x001bc516]
[ 7] (_ZN9LAMMPS_NS5Input15execute_commandEv + 0x167f) [0xbfffea78, 0x000f609d]
[ 8] (_ZN9LAMMPS_NS5Input4fileEv + 0x2f6) [0xbffff2f8, 0x000f6dd8]
[ 9] (main + 0x60) [0xbffff328, 0x000fbb06]
[10] (start + 0x36) [0xbffff344, 0x00001d16]
[11] [0x00000000, 0x00000001] (FP-)
[s9:37656] *** End of error message ***
BTW, for a small test system of a coarse grained polymer in which 1/2 the ‘atoms’ were charged, your pppm/cg option in lammps-icms sped it up by ~7%. Using ‘atom_modify first charged’ sped it up by only 1% more. Will try with a less charged system next!
Lisa
> I must still be doing something wrong. Trying ‘pair_coeff * * 0.0’ changed
> the error to a bunch of these:
> [s9:37656] *** Process received signal ***
> [s9:37656] Signal: Bus error (10)
> [s9:37656] Signal code: (2)
> [s9:37656] Failing at address: 0x0
ok. that looks like a NULL pointer dereference.
> Followed by a bunch of these:
> [ 1] [0xbfffe6f8, 0x00000000] (-P-)
> [ 2] (_ZN9LAMMPS_NS8FixAdapt9pre_forceEi + 0x1cb) [0xbfffe768, 0x00084225]
...and it is in fix adapt. (if you pipe the text through c++filt, it might look
more readable).
could be some missing sanity check or an uninitialized variable.
hard to say what is going on w/o being able to reproduce it.
> [ 3] (_ZN9LAMMPS_NS8FixAdapt4initEv + 0x2c5) [0xbfffe7d8, 0x00084051]
> [ 4] (_ZN9LAMMPS_NS6Modify4initEv + 0x295) [0xbfffe868, 0x0010aa89]
> [ 5] (_ZN9LAMMPS_NS6LAMMPS4initEv + 0x41) [0xbfffe888, 0x000f8f55]
> [ 6] (_ZN9LAMMPS_NS3Run7commandEiPPc + 0x7d8) [0xbfffe978, 0x001bc516]
> [ 7] (_ZN9LAMMPS_NS5Input15execute_commandEv + 0x167f) [0xbfffea78, 0x000f609d]
> [ 8] (_ZN9LAMMPS_NS5Input4fileEv + 0x2f6) [0xbffff2f8, 0x000f6dd8]
> [ 9] (main + 0x60) [0xbffff328, 0x000fbb06]
> [10] (start + 0x36) [0xbffff344, 0x00001d16]
> [11] [0x00000000, 0x00000001] (FP-)
> [s9:37656] *** End of error message ***
> BTW, for a small test system of a coarse grained polymer in which 1/2 the
> ‘atoms’ were charged, your pppm/cg option in lammps-icms sped it up by ~7%.
> Using ‘atom_modify first charged’ sped it up by only 1% more. Will try
> with a less charged system next!
yeah, about 10% is what i get with 5% charged particles.
unfortunately that speedup is quickly eaten up by the
not so attractive scaling behavior of the 3d FFTs.
this is where using OpenMP+MPI comes into play,
where you run kspace only across the MPI tasks.
sounds crazy, that you get faster by not
using processors, but it works in this case.
cheers,
axel.
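The mangled frame names in the trace above (for example _ZN9LAMMPS_NS8FixAdapt9pre_forceEi) can also be decoded from within a program. The following is a minimal sketch, assuming a GCC or Clang toolchain that ships abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is made up for illustration and is not part of LAMMPS.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle an Itanium-ABI C++ symbol name.
// Falls back to returning the input unchanged if demangling fails.
std::string demangle(const char *mangled)
{
  assert(mangled != nullptr);
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result
  // with malloc, so the caller has to release it with free.
  char *raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || raw == nullptr) return std::string(mangled);
  std::string result(raw);
  std::free(raw);
  return result;
}
```

Passing frame [ 2] from the trace to this helper yields LAMMPS_NS::FixAdapt::pre_force(int), which is the function the crash is reported in.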
Thanks Axel. If Steve doesn’t chime in, I’ll eventually get around to making a small test script to better show the issue.
Lisa
examples/micelle/in.micelle uses the pair_style soft
command with fix adapt. Your error below is due
to using $a in the pair_coeff command. There is no
variable named "a". Just set pair_coeff to 0.0 if
you are going to override it with fix adapt.
pair_style soft 1.12246
pair_coeff * * 0.0
variable prefactor equal 1.0+elapsed*(20.0-1.0)/1000
fix 3 all adapt 1 pair soft a * * prefactor
The advantage of the one new command (fix adapt) is it
can drive changes in any pair style's coeffs, and the formula
for time dependence can now be arbitrary.
Steve
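To see which values the equal-style variable in the snippet above takes during a run, the same linear ramp can be written out as a small function. This is only a sketch: the function name and its parameters are hypothetical, and elapsed stands for the number of timesteps since the start of the run, as in the LAMMPS variable.

```cpp
#include <cassert>

// Hypothetical illustration of the ramp start + elapsed*(stop - start)/ramp_steps.
// With start = 1.0, stop = 20.0, ramp_steps = 1000 this matches the
// expression 1.0+elapsed*(20.0-1.0)/1000 from the micelle example.
double soft_prefactor(double start, double stop, long elapsed, long ramp_steps)
{
  assert(ramp_steps > 0);
  return start + static_cast<double>(elapsed) * (stop - start) /
                     static_cast<double>(ramp_steps);
}
```

For the micelle values the prefactor is 1.0 at step 0 and 20.0 at step 1000. The ramp 60.0*elapsed/5000 from the original question corresponds to start = 0.0, stop = 60.0, ramp_steps = 5000.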
Well that saves me the trouble of making a test script for the problem—I can’t use the example in.micelle either. This is getting interesting–it works fine for 1 processor, but I typically run 14 when testing on my computer, for which it gives me the errors I mentioned earlier which are a bunch of these:
[s9:38616] *** Process received signal ***
[s9:38616] Signal: Bus error (10)
[s9:38616] Signal code: (2)
Followed by a bunch like these:
[ 1] [0xbfffe708, 0x00000000] (-P-)
[ 2] (_ZN9LAMMPS_NS8FixAdapt9pre_forceEi + 0x1cb) [0xbfffe778, 0x0007cf15]
[ 3] (_ZN9LAMMPS_NS8FixAdapt4initEv + 0x2c5) [0xbfffe7e8, 0x0007cd41]
[ 4] (_ZN9LAMMPS_NS6Modify4initEv + 0x295) [0xbfffe878, 0x00102e49]
[ 5] (_ZN9LAMMPS_NS6LAMMPS4initEv + 0x41) [0xbfffe898, 0x000f189d]
[ 6]
[ 6] (_ZN9LAMMPS_NS3Run7commandEiPPc + 0x7d8) [0xbfffe988, 0x00191c06]
[ 7] (_ZN9LAMMPS_NS5Input15execute_commandEv + 0x163c) [0xbfffea88, 0x000ee9e6]
[ 8] (_ZN9LAMMPS_NS5Input4fileEv + 0x2f6) [0xbffff308, 0x000ef720]
[ 9] (main + 0x60) [0xbffff338, 0x000f444e]
[10] (start + 0x36) [0xbffff338, 0x000f444e]
[10] (start + 0x36) [0xbffff354, 0x00001956]
[11] [0x00000000, 0x00000001] (FP-)
Even more interesting is that it also runs fine on 2-8 processors! Possibly relevant is that I use the following to run lammps:
mpirun -np 14 ~/lmp_mac_mpi_Jul13 <in.micelle
And that I have only 8 processors; however, running with -np 14 is much faster for my system than -np 8 because I can have 2 threads per processor. I do not get an error for in.micelle using -np 1 through 8, but I do get it for -np 9 and above.
Lisa
> Well that saves me the trouble of making a test script for the problem
> —I can’t use the example in.micelle either. This is getting
> interesting--it works fine for 1 processor, but I typically run 14
> when testing on my computer, for which it gives me the errors I
> mentioned earlier which are a bunch of these:
well, i cannot reproduce them on my machine.
but that can have a lot of reasons. it could be due to
an uninitialized variable (which is quite possible,
if it depends on the number of processors used).
i am using more than the number of physical processors
frequently, too, on my desktop. i get up to 20% extra
from hyper-threading.
> [s9:38616] *** Process received signal ***
> [s9:38616] Signal: Bus error (10)
> [s9:38616] Signal code: (2)
> Followed by a bunch like these:
> [ 1] [0xbfffe708, 0x00000000] (-P-)
> [ 2] (_ZN9LAMMPS_NS8FixAdapt9pre_forceEi + 0x1cb) [0xbfffe778, 0x0007cf15]
can you recompile your executable with the '-g' flag added
to both compiler and linker flags? this would then give us
the exact line number where the failure happens at this point
(i hope). otherwise you'd need to enable coredumps and generate
a stack trace to get this information.
i have tried running with valgrind and found the following
issue with this input. FixAdapt::pre_force() _does_ call
modify->clearstep_compute() and is called in an "unusual"
location, i.e. from FixAdapt::init() instead of FixAdapt::setup()
as the comment indicates. and this is exactly the next level
entry on your stack frame. i'm committing this to lammps-icms
ASAP and will push it so you can pull, recompile and test.
diff --git a/src/modify.cpp b/src/modify.cpp
index 2655473..2ef5399 100644
--- a/src/modify.cpp
+++ b/src/modify.cpp
@@ -67,6 +67,7 @@ Modify::Modify(LAMMPS *lmp) : Pointers(lmp)
   n_initial_integrate_respa = n_post_integrate_respa = 0;
   n_pre_force_respa = n_post_force_respa = n_final_integrate_respa = 0;
   n_min_pre_exchange = n_min_post_force = n_min_energy = 0;
+  n_timeflag = 0;
   fix = NULL;
   fmask = NULL;
cheers,
axel.
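As a generic illustration of why a one-line initializer like the one in the patch matters, here is a minimal sketch. It is not the LAMMPS source: ModifySketch, ComputeSketch, set_list and clearstep are hypothetical stand-ins, and the real Modify class may use its counter differently. The point is only that a counter used as a loop bound must have a defined value before any code path can reach the loop; otherwise the behavior depends on whatever happens to be in memory, which can differ from process to process.

```cpp
#include <cassert>

// Hypothetical stand-in for an object whose flag gets cleared each step.
struct ComputeSketch {
  bool invoked = true;
};

// Hypothetical stand-in for a manager class holding a counted list.
class ModifySketch {
 public:
  // Give the counter and the pointer defined values up front, so an
  // early call to clearstep() sees an empty list instead of garbage.
  ModifySketch() : n_timeflag(0), list_timeflag(nullptr) {}

  void set_list(ComputeSketch **list, int n)
  {
    assert(n >= 0);
    list_timeflag = list;
    n_timeflag = n;
  }

  // Clears the flag on every listed object and returns how many were
  // touched. With n_timeflag == 0 the loop body never runs, so the
  // null list pointer is never dereferenced.
  int clearstep()
  {
    int cleared = 0;
    for (int i = 0; i < n_timeflag; i++) {
      list_timeflag[i]->invoked = false;
      cleared++;
    }
    return cleared;
  }

 private:
  int n_timeflag;
  ComputeSketch **list_timeflag;
};
```

Calling clearstep() on a freshly constructed ModifySketch is a no-op. Without the initializer for n_timeflag, the same call would read an indeterminate loop bound and could index through a null or stale pointer.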
See the 15Jul10 patch - it fixes an initialization bug in fix adapt, which
likely will solve your problem.
Steve
Thanks Axel and Steve! The 15Jul10 patch does work for me. Axel’s suggestion of adding n_timeflag=0; to modify.cpp also works for me. Incidentally, adding -g to the compiler and linker flags did not give me any additional info in my error output in this case.
Lisa