Using LAMMPS from Kotlin, some fields look to be interpreted as integers instead of floats

Long story short, I’m porting Mala to Kotlin

my pre-process pipeline is the following:

Python Mala has the very same args and input file

but when it’s time to call lammps_extract_compute, all the fields that are 0 are fine (i.e. the same number of them at the same indices), but the other ones are not…

If I open the logs I can spot some differences:

  • python triclinic box = (0 0 0) to (2.2598259 1.9570657 3.5698786) with tilt (-1.1299129 0 0)

  • kotlin triclinic box = (0 0 0) to (2 1 3) with tilt (-1 0 0)

  • python compute bgrid all sna/grid grid 18 18 27 4.67637 0.99363 10 0.5 1 rmin0 0 bzeroflag 0 quadraticflag 0 switchflag 1

  • kotlin compute bgrid all sna/grid grid 18 18 27 4.67637 0 10 0 1 rmin0 0 bzeroflag 0 quadraticflag 0 switchflag 1

  • python variable rcutneigh equal 2.0*4.67637*0.5

  • kotlin variable rcutneigh equal 2.0*4.67637*0

  • python

  master list distance cutoff = 6.67637
  ghost atom cutoff = 6.67637
  binsize = 3.338185, bins = 2 1 2
  • kotlin
  master list distance cutoff = 0
  ghost atom cutoff = 0
  binsize = 3, bins = 1 1 1

For the life of me, I can’t find the culprit. Does anybody see where the problem might be?

Thanks in advance

PS: in the same gist you can find the input file, the compute file, and the Python and Kotlin logs

On this I can only comment with a quote from a movie: Smokey, my friend, you are entering a world of pain.

These differences can only happen, if there are differences in the commands (or files) that are sent to LAMMPS, or if some index variables are set to different values (or the interface does not pass them properly). I cannot give any more specific suggestions, since I know neither Kotlin nor Mala.

:grinning:

I had my bumps along the road, but it’s been a nice trip so far. This is the first real blocker for me

That’s what I thought as well, and this is why I’m doing everything to replicate exactly the same inputs/variables:

  • the args are the same list of strings, encoded as UTF-8, which I presume the Python wrapper does as well
  • the input file is exactly the same file
  • the command file is a copy of the same file

Let’s try to debug the easiest step

Reading data file …
triclinic box = (0 0 0) to (2 1 3) with tilt (-1 0 0)

which should correspond to the input file

0.0 2.2598258677677969 xlo xhi
0.0 1.9570656971336065 ylo yhi
0.0 3.5698785608749377 zlo zhi
-1.1299129338838985 0 0 xy xz yz

Now, I do understand your point of view, but here we are talking about something very curious and particular: some fields being cast or interpreted as integers

I tried to modify the input file on purpose to

0.0 21.2598258677677969 xlo xhi

to see if Lammps detects the difference, and it does

Reading data file …
triclinic box = (0 0 0) to (21 1 3) with tilt (-1 0 0)

So, what else can differ, other than the args for lammps_open_no_mpi, the input file and the compute file, and cause LAMMPS to do that?

Ps: I’m available to modify the source code to find the issue, but I’d need some hints

So, I’ve been playing a little with Lammps code and I found where the code diverges

read_data.cpp, ReadData::header, here:

    } else if (utils::strmatch(line, "^\\s*\\f+\\s+\\f+\\s+xlo\\s+xhi\\s")) {
      boxlo[0] = utils::numeric(FLERR, words[0], false, lmp);
      boxhi[0] = utils::numeric(FLERR, words[1], false, lmp, 1);

The second line parses the upper box coordinate, which should be 2.2598258677677969 but is 2 instead

If I go into utils::numeric (I added a defaulted int parameter a so that the log is printed only when it is set to 1), here:

  if (!is_double(buf)) {
    ..
  } else if (a == 1) utils::logmesg(lmp, "is_double {} {:.5f}...\n", buf.c_str(), atof(buf.c_str()));

which prints, from Python:

is_double 2.2598258677677969 2.25983…

and from Kotlin:

is_double 2.2598258677677969 2.00000…

So the question now is why atof is parsing the same damn input as a float in one case and as an int in the other…

wait, could be a locale issue maybe? Comma/point

Where is this set, and how can I check it?

This looks related

Edit: if I switch the point to the comma, I do get this instead

ERROR: Unknown identifier in data file: 0.0 2,2598258677677969 xlo xhi (src/read_data.cpp:1365)
Last command: read_data ${atom_config_fname}

but if I do

  } else if (a == 1) //utils::logmesg(lmp, "is_double {} {:.5f}...\n", buf.c_str(), atof(buf.c_str()));
    return 2.2598258677677969;

then I do see it properly in the logs

triclinic box = (0 0 0) to (2.2598259 1 3) with tilt (-1 0 0)

These are not input file commands but part of the data file, which is read by the read_data command, and thus it should not make any difference unless the two files are different. I suggest inserting the command line:

shell head -20 ${atom_config_fname}

And compare the output. This should echo what the box dimension information from the data file looks like for each case.

Doing computational science with any other locale setting than the C locale (i.e. plain ASCII) is asking for trouble. Native language support is something for fancy GUI apps, but not science.

You can turn off any NLS with export LC_ALL=C

LAMMPS is the same, and so are the files

I already added the following to compare

    } else if (utils::strmatch(line, "^\\s*\\f+\\s+\\f+\\s+xlo\\s+xhi\\s")) {
      boxlo[0] = utils::numeric(FLERR, words[0], false, lmp);
      boxhi[0] = utils::numeric(FLERR, words[1], false, lmp, 1);
      utils::logmesg(lmp, "line {} ...\n", line);

and I get

line 0.0 2.2598258677677969 xlo xhi

If I copy that without the leading “line” and search for it in the lammps_input.tmp file, I find a match, so there are no stray blank or invisible characters that differ

Where/how shall I execute that in order to be seen from within my IDE when I run the program?

Also, is this normal?

elect@5800x:~/PycharmProjects/lammps/build$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=it_IT.UTF-8
LC_TIME=it_IT.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=it_IT.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=it_IT.UTF-8
LC_NAME=it_IT.UTF-8
LC_ADDRESS=it_IT.UTF-8
LC_TELEPHONE=it_IT.UTF-8
LC_MEASUREMENT=it_IT.UTF-8
LC_IDENTIFICATION=it_IT.UTF-8
LC_ALL=
elect@5800x:~/PycharmProjects/lammps/build$ export LC_ALL=C
elect@5800x:~/PycharmProjects/lammps/build$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C

Also, localeconv()->decimal_point is indeed a comma (,) on my system…
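
For anyone who wants to repeat that check from inside the running process, here is a minimal sketch. It is not part of LAMMPS; current_decimal_point and format_sample are hypothetical helper names. It only uses the standard localeconv() and snprintf() calls, and what it reports depends on whichever locale the host program has installed at that moment.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Decimal separator that the C library currently uses for strtod, atof,
// printf and friends (taken from the active LC_NUMERIC category).
std::string current_decimal_point()
{
  return std::localeconv()->decimal_point;
}

// Format a number with printf-style formatting so the separator that is
// actually in effect becomes visible in the result.
std::string format_sample(double value)
{
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.3f", value);
  return buf;
}
```

A standalone C or C++ program starts in the "C" locale, so calling these from a small test executable will report a point even on a machine whose environment says it_IT.UTF-8. The check is only meaningful when it runs inside the process that embeds the library.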

This will not compile unless you have modified additional parts of LAMMPS.
If you make these kinds of changes to the LAMMPS sources, then you are on your own.

In general, LAMMPS will execute in the POSIX or C locale by default unless some application changes the global locale. So the LAMMPS executable will always expect numbers with a decimal point and no comma. However, if you embed the LAMMPS library into an executable that does set the locale, then LAMMPS will adhere to that.
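
To make that mechanism concrete, here is a small sketch (not LAMMPS code; parse_with_locale is a hypothetical helper). strtod and atof accept only the decimal separator of the active LC_NUMERIC locale and stop at the first character they do not recognize. Under the "C" locale, the comma in "2,2598258677677969" ends the number after the "2". That is the mirror image of what the logs in this thread suggest happens to "2.2598258677677969" when the active locale uses a comma as its separator.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Parse the leading number of `text` with strtod while LC_NUMERIC is
// temporarily switched to `locale_name`, then restore the previous setting.
// `consumed` receives the number of characters strtod accepted.
// If `locale_name` is not installed, setlocale() returns nullptr and the
// previously active locale stays in effect for the parse.
double parse_with_locale(const char *locale_name, const std::string &text,
                         std::size_t &consumed)
{
  const char *current = std::setlocale(LC_NUMERIC, nullptr);
  const std::string saved = current ? current : "C";

  std::setlocale(LC_NUMERIC, locale_name);
  char *end = nullptr;
  const double value = std::strtod(text.c_str(), &end);
  consumed = static_cast<std::size_t>(end - text.c_str());

  std::setlocale(LC_NUMERIC, saved.c_str());
  return value;
}
```

With "C" as the locale, parsing "2,2598258677677969" yields 2.0 with one character consumed, while "2.2598258677677969" is consumed in full. Passing a locale such as it_IT.UTF-8 (assuming it is installed on the machine) should swap the two outcomes.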

I don’t use IDEs (unless you would call the emacs editor an IDE), but you can set it either inside the IDE, or on the command line before you launch it, or globally in your environment.

At this point, this is no longer a LAMMPS issue, but an issue with how you set up and use your machine and what you do in the code outside of LAMMPS. These are issues that are off-topic for a LAMMPS forum. You’ll have to search the web or talk to experts in those matters.

it looks like I have it already set to C

      auto name = std::locale::global(std::locale("en_DK.utf8")).name();
      utils::logmesg(lmp, "old locale {}", name);

old locale C

or am I wrong?
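
A possible reason why that check prints C even when number parsing uses a comma: std::locale::global() returns the previous C++ global locale object, and that object is not updated when other code changes the C library locale through setlocale(). atof and strtod follow the C library locale. The call shown above also installs en_DK.utf8 as the new global locale as a side effect (and throws if that locale is not available). The sketch below queries both settings without changing either; LocaleReport and report_locales are hypothetical names.

```cpp
#include <clocale>
#include <locale>
#include <string>

// Names of the two locale settings that can disagree inside one process.
struct LocaleReport {
  std::string cxx_global;  // C++ global locale (used by default by iostreams)
  std::string c_numeric;   // C library LC_NUMERIC (used by atof/strtod/printf)
};

// Read both settings without modifying them.
LocaleReport report_locales()
{
  LocaleReport report;
  report.cxx_global = std::locale().name();  // copy of the current C++ global locale
  const char *c_name = std::setlocale(LC_NUMERIC, nullptr);  // query only
  report.c_numeric = c_name ? c_name : "(unknown)";
  return report;
}
```

If c_numeric comes back as something other than "C" while cxx_global is "C", the host process has presumably called setlocale() on its own before LAMMPS was loaded.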

I already commented on this and any other thoughts I have on this matter would violate the forum policies and thus I cannot post them.

If you are not afraid of changing the LAMMPS library, you could add #include <clocale> and add a call:

setlocale(LC_ALL, "C");

to the LAMMPS class constructor in lammps.cpp and thus reset the locale setting from within LAMMPS.
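
If patching the library is not desirable, a variation on the same idea can live on the wrapper side. The sketch below is an assumption about how one might do it, not something LAMMPS provides: NumericLocaleGuard is a hypothetical RAII helper that forces LC_NUMERIC to "C" for the duration of a scope and restores the previous value afterwards.

```cpp
#include <clocale>
#include <string>

// Force LC_NUMERIC to "C" while an instance is alive, then restore the
// previous setting. Only the numeric category is touched, so message and
// character-set settings of the host program are left alone.
class NumericLocaleGuard {
 public:
  NumericLocaleGuard()
  {
    const char *current = std::setlocale(LC_NUMERIC, nullptr);
    saved_ = current ? current : "C";
    std::setlocale(LC_NUMERIC, "C");
  }
  ~NumericLocaleGuard() { std::setlocale(LC_NUMERIC, saved_.c_str()); }

  NumericLocaleGuard(const NumericLocaleGuard &) = delete;
  NumericLocaleGuard &operator=(const NumericLocaleGuard &) = delete;

 private:
  std::string saved_;  // locale name that was active before the guard
};
```

The guard would be created right before the calls into the library and destroyed after they return. Note that setlocale() affects the whole process and is not safe to call while other threads are using locale-dependent functions, so this is only a sketch for a single-threaded call path.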

But as mentioned before, this is now “off-territory” and you are on your own with any further problems along those lines.

I can’t say and I don’t really care since the problems have to come from either your environment or your LAMMPS wrapper. Neither is a LAMMPS issue.

Sorry, just to be sure, do you mean here?

PS: have I been silenced? It seems I can’t reply to you anymore. I’m sorry if I bothered you and couldn’t fully grasp what you told me, but I lack your background and expertise in the matter

You have hit the spam detection by being a new user and linking to the same website in multiple posts in close succession. I have restored the flagged posts from the moderator interface.

There is the saying “If you cannot stand the heat, get out of the kitchen”. Your problems are self-inflicted and I am not interested in spending any more time on this after I have determined that this is not a LAMMPS issue or something that I am interested in in general.

This was indeed right. I just had to find the right spot for it to work: in IDEA, in the run configuration, under environment variables