Numpy arrays error

Hi.
I’m writing a parser, and I get the error “nomad.metainfo.metainfo.MetainfoError: Only numpy arrays and dtypes can be used for higher dimensional quantities”. My Python script looks something like this:

import numpy as np
from nomad.datamodel.metainfo.simulation.run import Run
from nomad.datamodel.metainfo.simulation.calculation import Calculation
from .metainfo.battery_parser import Experiments

sec_run = archive.m_create(Run)
sec_calc = sec_run.m_create(Calculation)
sec_experiments = sec_calc.m_create(Experiments)
sec_experiments.value = np.zeros((1, 2))

The file battery_parser is the schema and looks something like this:

from nomad.metainfo import MSection, Quantity

class Experiments(MSection):
    value = Quantity(type=float, shape=[1, 2])

In other words, I created a custom quantity called value, which is a 1 by 2 array of floats, and I initialize it with zeros.
But now I get the above-mentioned error message that NOMAD needs numpy arrays for higher-dimensional quantities, but sec_experiments.value IS a numpy array?!
Any help would be greatly appreciated!
Best
Fabian

Hello Fabian,

I think you should try giving the shape as follows:

shape=[2]

So far, NOMAD does not support higher dimensionality for arrays, so the first dimension you gave is implicit.

Let us know if you can sort it out!
Kind regards

Andrea

Thanks for the quick reply. So I changed it to the following:

shape=[2]

Everything else I left untouched. Now the error message is: “TypeError: The value [[0.0, 0.0]] with type <class ‘list’> for quantity … is not of type <class ‘float’>”

I see. I just tried initializing an array as my_array = np.zeros(2) and it looks different; you may want to try doing so and check again.

Also, as a general tip: whenever you don’t want to fix the shape of your array, use shape = ['*'].
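
For reference, here is a small NumPy-only sketch (no NOMAD involved) of how the two initializations differ. The variable names are purely illustrative. The nested list it prints has the same form as the value quoted in the TypeError above, which would be consistent with a (1, 2) array being checked against a one-dimensional shape=[2] definition.

```python
import numpy as np

# Shape (1, 2): a 2-D array holding one row of two floats.
two_d = np.zeros((1, 2))
# Shape (2,): a plain 1-D array of two floats.
one_d = np.zeros(2)

print(two_d.shape, two_d.tolist())  # (1, 2) [[0.0, 0.0]]
print(one_d.shape, one_d.tolist())  # (2,) [0.0, 0.0]

# Dropping the leading length-1 axis turns the first form into the second.
flattened = two_d.reshape(2)
print(flattened.shape)  # (2,)
```

So with shape=[2] in the schema, the assigned array probably needs to be one-dimensional as well.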

Hi @fabian_li,

Thanks for reaching out. Edited so as not to confuse you: type should indeed be set to a numpy dtype class, as @NateD said.

I have a couple of extra suggestions:

  • You can define the shape of your quantities through other integer quantities inside the section. In your case, something like:
class Experiments(MSection):
    dimension1 = Quantity(type=np.int32)
    dimension2 = Quantity(type=np.int32)
    value = Quantity(type=np.float64, shape=['dimension1', 'dimension2'])

This is useful when you want to use a section which can vary its dimensions depending on the problem.

  • You can also define shapes as '*' if the dimensions are not known or you don’t want to keep that information, something like: value = Quantity(type=np.float64, shape=['*', '*']).

  • This might not be relevant, but in your example you are storing Experiments inside Run.Calculation. This section is used for simulations, rather than experiments. I think you can store experimental data under Measurement.
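
To illustrate the idea behind the '*' shapes in the list above, here is a rough NumPy-only sketch. The helper matches_shape is hypothetical and is not part of NOMAD; it only mimics the notion that '*' stands for "any length along this axis" and does not handle named dimensions such as 'dimension1'.

```python
import numpy as np

def matches_shape(array, shape):
    """Hypothetical helper (not a NOMAD function): True if the array fits
    the shape spec, where '*' matches any length along that axis."""
    if array.ndim != len(shape):
        return False
    return all(dim == '*' or dim == size
               for dim, size in zip(shape, array.shape))

data = np.zeros((1, 2))
print(matches_shape(data, ['*', '*']))  # True: two axes, any lengths
print(matches_shape(data, [1, 2]))      # True: exact match
print(matches_shape(data, [2]))         # False: number of axes differs
```

Under this reading, a fixed shape such as [1, 2] is the stricter choice, while ['*', '*'] only pins down the number of dimensions.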

Let us know what happens with the changes.

Thanks @JosePizarro for the support. It looks like I was wrong in claiming we can’t have multidimensional shapes. I think it all boils down to correctly matching the value you initialize with the quantity’s definition in terms of shape.

Dear Fabian,

please be careful with your definition of Quantity.type.
In NOMAD, we use numpy types for numbers, i.e. np.float64, np.int32, np.complex128 (where the trailing numbers denote the number of bits used for storage).
This (along with the len(shape) > 1) is what triggers your error message.
You can check for it yourself in the NOMAD project under nomad.metainfo.metainfo.py/Quantity/__get__.
I’ll see about updating the definition of Quantity to better reflect this.
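
As a quick check of the naming convention described above, the following NumPy-only sketch reads the storage size back from each type. The dictionary name bits is just illustrative.

```python
import numpy as np

# Map each numpy type's name to the number of bits it uses per element.
bits = {
    np.dtype(numpy_type).name: np.dtype(numpy_type).itemsize * 8
    for numpy_type in (np.float64, np.int32, np.complex128)
}
print(bits)  # {'float64': 64, 'int32': 32, 'complex128': 128}
```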

As @JosePizarro pointed out, in case your matrix size follows some kind of complex logic, you can add those dimensions as other Quantities, but this is not necessary. You can, of course, assign a Quantity value directly.
Moreover, as suggested by @Andrea93, there is no need to define a tensor of shape [1, 2]. In that case, you can just stick to an array. Just fix the shape and/or type and your script should work as is.

Lastly, it is great to hear that you are working on a parser!
If you have it on a public git repo, feel free to share the link; this can help us get a more complete understanding in case of more complex errors.

My apologies for the long response chain.

Best,
Nathan

Solved. The problem was not the shape but the dtype. This solves it:

value = Quantity(type=np.float64, shape=....)

I didn’t expect the dtype to be the problem because I used Quantity(type=float), and the default dtype when using np.zeros is also float. To be precise, though, it is float64, and apparently NOMAD needs this info about the bit size as well, just like @NateD said.
Thanks a lot to all of you, @JosePizarro, @Andrea93 and @NateD!
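
A short NumPy-only sketch of why this is easy to miss: NumPy itself treats the Python float as equivalent to float64 when building a dtype, yet float and np.float64 are different classes. A check that asks specifically for a numpy type could therefore tell them apart, which would be consistent with the behaviour seen here. How NOMAD actually performs the check is not shown; the variable names are illustrative.

```python
import numpy as np

default_dtype = np.zeros(2).dtype
same_dtype = np.dtype(float) == np.dtype(np.float64)
same_class = float is np.float64
is_subclass = issubclass(np.float64, float)

print(default_dtype)  # float64
print(same_dtype)     # True: NumPy maps the Python float to float64
print(same_class)     # False: they are still two distinct classes
print(is_subclass)    # True: np.float64 derives from the Python float
```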
