Lmp.numpy.extract_fix() returns a sugle float when used with LMP_STYLE_GLOBAL and LMP_TYPE_ARRAY

eltonfc · March 20, 2024, 1:54am

When I have an ave/histo fix and try to read it in python with lmp.numpy.extract_fix('histogram', LMP_STYLE_GLOBAL, LMP_TYPE_ARRAY), it returns the value stored in row 0, column 0, which are the default values given in numpy_wrapper.py line 220, but not what the documentation says: “Return type: integer or double value, pointer to 1d or 2d double array or None”.

Reading the documentation, I would expect that a global fix which is an array would make lmp.numpy.extract_fix() return a 2-D Numpy Array. Instead, it returns the (0,0) element of that array.

I believe there’s a bug in the numpy wrapper implementation which does not call self.darray() to convert the data if the style is LMP_STYLE_GLOBAL.

Maybe the documentation is also imprecise when telling me it returns a pointer to a numpy array? I read that as meaning a numpy array, just like other methods do, that’s why I also suspect a documentation bug.

lmp.numpy.extract_compute() seems to adequately treat global computes, so I may be able to provide a patch for that tomorrow.

I’m running version 2 Aug 2023 - Update 2, but the code in the develop branch is the same.

Please let me know if I missed something.

Thank you!

akohlmey · March 20, 2024, 2:07am

Please provide a minimal working example file (ideally based on an existing LAMMPS example that runs very fast) so that we don’t have to reverse engineer what you did and spend a lot of time just to figure out how to reproduce what you are seeing.

akohlmey · March 20, 2024, 2:32am

I just remembered…
This is not going to be as simple as you would expect. That is due to a design choice in the LAMMPS C++ code that goes back a very, very long time (long before the python module and the C-library interface).
When accessing global values from a fix only a single scalar may be looked up and will be returned. Please see the first “Note” block for the documentation of the underlying C-library interface function: 1.1.5. Compute, fixes, variables — LAMMPS documentation
That is why the arguments “nrow” and “ncol” exist.

Computes are different in this regard.

TL;DR: this is not a bug but the expected behavior. The only problem is that the documentation of the python module functions does not mention it (only the C-library interface docs). This will be fixed in the next version of LAMMPS.

eltonfc · March 20, 2024, 3:38pm

Sorry for the delay. Now properly rested and caffeinated I can make a proper report.

The Minimal Working Example is in this Gist, and the python code is also copied below.

#!/usr/bin/env python

import numpy as np

from lammps import lammps, PyLammps, constants

lmp = lammps()
L = PyLammps(ptr=lmp)

L.units('real')
L.atom_style('full')
L.timestep(1)

L.pair_style('zero', 10)
L.bond_style('zero')

L.read_data('MWE.data')
L.pair_coeff('* *')
L.bond_coeff('* 1.5')
L.read_dump('MWE.lammpstrj', 0, 'x y z', 'box yes')
L.reset_timestep(0)

L.compute('bonds', 'all', 'bond/local', 'dist')
L.fix('bond_histo',
      'all',
      'ave/histo',
      1, 5, 5,
      1.48, 1.62, 5,
      'c_bonds',
      'mode vector',
      'file bond_histo.dat')

L.rerun('MWE.lammpstrj',
        'dump x y z',
        'box yes')

histo = lmp.numpy.extract_fix('bond_histo',
                              constants.LMP_STYLE_GLOBAL,
                              constants.LMP_TYPE_ARRAY)

print(f'({histo=}, {type(histo)=})')

Per the documentation, I would expect histo to be either a numpy array or a pointer to be interpreted as gibberish if not properly cast. Instead I get a Float which is exactly the (0, 0) term of the array.

eltonfc · March 20, 2024, 3:48pm

Ah, ok, good point. I was only reading the python documentation. So to obtain the histogram as a numpy array I should loop through the values in range(0,nbins), for each column in the array. I can do that in my code, no problem, that’s what I was about to do in order to not depend on a development version of LAMMPS.

I wonder if it would be more beneficial to implement this loop in the numpy wrapper itself, because if the data is internally an array, a regular user would expect that a numpy wrapper for an API would return the array, not an element, silently defaulting to (0,0). On the other hand, I can see the argument that the numpy wrapper, the python API and the C API should be consistent. Either way, the documentation should be fixed indeed.

Thanks for the quick response, Axel!

akohlmey · March 20, 2024, 4:15pm

I was thinking the same thing. But it is not for me alone to make that decision. If the change would be made, it would best be done in the C-library interface and then propagated throughout. The code already needs to malloc() space for a single double value and then pass it to the calling code as a pointer which is then required to free the pointer. Switching this to a malloc() for an entire vector or array would not make that much of a difference.

I would suggest that you describe the situation and submit this as a feature request issue on GitHub Issues · lammps/lammps · GitHub and I can bring this up at the next meeting of the LAMMPS developers in early April.

eltonfc · March 20, 2024, 11:26pm

Thank you! I’ll make a proper request there.

Thanks for updating the docs too! Cheers!