MPRester.summary.search shows the opposite(?) of query for a field if that field is not in the fields parameter

Problem

Passing a field to mpr.summary.search as a search kwarg without also listing it in fields returns SummaryDoc objects whose value for that attribute is the model default (often incorrect, and not None). This leads to odd results: searching for non-theoretical materials returns the correct subset, but every document's theoretical attribute is True.

Example

If I run the following code:

from mp_api import MPRester

with MPRester(api_key='UR_API_KEY') as mpr:
    docs = mpr.summary.search(theoretical=False, fields=["energy_above_hull", "formula", "material_id"])
    docs = [d for d in docs]


print("n materials", len(docs))
print("n theoretical materials:", len([d for d in docs if d.theoretical]))

I get

/Users/ardunn/alex/lbl/projects/common_env/textenv_py310/lib/python3.10/site-packages/mp_api/client.py:138: builtins.UserWarning: Problem loading MPContribs client: Duplicate operationId: download_entries
Retrieving SummaryDoc documents: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49794/49794 [00:09<00:00, 5094.30it/s]
n materials 49794
n theoretical materials: 49794

Perplexingly, my query for non-theoretical materials appeared to return only theoretical materials(!)

After looking at some of these entries and running a few more queries, I realized these 49k entries are in fact not theoretical; their theoretical attribute is just being set incorrectly. Note that I did not include theoretical as a field to be returned, and yet it came back as True for every entry.

If I do the same query, but this time including theoretical in the fields arg, I actually do get the correct theoretical attrs:

from mp_api import MPRester

with MPRester(api_key='UR_API_KEY') as mpr:
    docs = mpr.summary.search(theoretical=False, fields=["energy_above_hull", "formula", "material_id", "theoretical"])
    docs = [d for d in docs]


print("n materials", len(docs))
print("n theoretical materials:", len([d for d in docs if d.theoretical]))

I get

/Users/ardunn/alex/lbl/projects/common_env/textenv_py310/lib/python3.10/site-packages/mp_api/client.py:138: builtins.UserWarning: Problem loading MPContribs client: Duplicate operationId: download_entries
Retrieving SummaryDoc documents: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49794/49794 [00:09<00:00, 5375.82it/s]
n materials 49794
n theoretical materials: 0
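
For affected versions, the workaround above (listing every filtered field in fields as well) can be sketched with a small helper. This is only a sketch: fields_with_filters is a hypothetical name, not part of mp-api, and it assumes each filter kwarg has the same name as the document field it filters on, as is the case for theoretical here.

```python
def fields_with_filters(fields, **filters):
    """Return `fields` plus any filter names not already listed.

    Hypothetical helper, not part of mp-api. Assumes each filter
    kwarg shares its name with the document field it filters on.
    """
    merged = list(fields)
    for name in filters:
        if name not in merged:
            merged.append(name)
    return merged


filters = {"theoretical": False}
fields = fields_with_filters(
    ["energy_above_hull", "formula", "material_id"], **filters
)
print(fields)
# ['energy_above_hull', 'formula', 'material_id', 'theoretical']

# The merged list would then be passed along with the filters, e.g.
# mpr.summary.search(fields=fields, **filters)
```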

The Point

I’m not sure if this is intended behavior for the API (according to the API docs, the default value of theoretical is True), but doesn't it seem a bit confusing? Why not have the default value be None instead?
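
To illustrate the difference, here is a minimal sketch using standard-library dataclasses as a stand-in for the document model. DocDefaultTrue and DocDefaultNone are hypothetical classes, not the client's real SummaryDoc, and the material ID is a made-up placeholder. The point is only that a field missing from the response silently takes the model default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocDefaultTrue:
    """Stand-in model where an unrequested field defaults to True."""
    material_id: str
    theoretical: bool = True


@dataclass
class DocDefaultNone:
    """Stand-in model where an unrequested field defaults to None."""
    material_id: str
    theoretical: Optional[bool] = None


# A response that omits `theoretical` because it was not in `fields`.
# The ID is a placeholder, not a real material.
partial_response = {"material_id": "mp-0000"}

with_true = DocDefaultTrue(**partial_response)
with_none = DocDefaultNone(**partial_response)

print(with_true.theoretical)  # True: looks like real data, but is only the default
print(with_none.theoretical)  # None: clearly signals "not returned"
```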

Apologies if this is the way things are supposed to work; it just seemed unintuitive to me as an end user.

Using version 0.24.4 of mp-api with Python 3.10.1 on macOS Monterey 12.4.

Hi @ardunn, this is absolutely not intended behavior. Thank you for bringing this up, I see where the issue lies with the document model. The change you suggested should have the intended effect. I’ll make the fix, and release a new API client today.

– Jason

I noticed the same issue with the is_stable field, which defaults to False.

-Peter

Thanks @peterschindler!

– Jason

This should be fixed as of mp-api v0.24.5.

– Jason
