Possible duplication of some "task_id" values in Electrolyte Genome Database

Konstantin_Karandash · February 4, 2021, 4:51pm

Dear everyone,

It appears that when I use requests.get to download all Electrolyte Genome data, some distinct molecules come back with the same “task_id” value. The following Python script illustrates the problem (sorry for the awkward formatting):

import requests, sys, os

if sys.version_info[0] == 2:
    from urllib import quote_plus
else:
    from urllib.parse import quote_plus

def MAPI_KEY():
    try:
        return os.environ["MAPI_KEY"]
    except LookupError:
        print("MAPI_KEY environment variable needs to be set.")
        sys.exit(1)

urlpattern = {
    "results": "https://materialsproject.org/molecules/results?query={spec}",
    "mol_json": "https://materialsproject.org/molecules/{mol_id}/json",
    "mol_svg": "https://materialsproject.org/molecules/{mol_id}/svg",
    "mol_xyz": "https://materialsproject.org/molecules/{mol_id}/xyz",
}

def get_results(spec, fields=None):
    """Take a specification document (a dict), and return a list of matching molecules.
    """
    # Stringify spec, ensure the string uses double quotes, and percent-encode it...
    str_spec = quote_plus(str(spec).replace("'", '"'))
    # ...because the spec is the value of a "query" key in the final URL.
    url = urlpattern["results"].format(spec=str_spec)
    return (requests.get(url, headers={'X-API-KEY': MAPI_KEY()})).json()

problematic_ids = ["mol-38777", "mol-38770", "mol-39643", "mol-22363", "mol-25918", "mol-23146",
                   "mol-39001", "mol-39068", "mol-14809", "mol-9187"]

# An empty spec matches every molecule in the database.
results = get_results({})
for cur_id in problematic_ids:
    counter = 0
    MWs = []
    # Count the entries sharing this task_id and collect their molecular weights.
    for molecule in results:
        if molecule["task_id"] == cur_id:
            counter += 1
            MWs.append(molecule["MW"])
    print("task_id:", cur_id, "; times occurring:", counter, "; MWs:", MWs)

Could someone tell me whether this is an issue with the script or with the database? For my application I need the minimum-energy geometry of each database entry I use, so it’s important for me to be sure that “{mol_id}/xyz” corresponds to the correct entry.
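
For anyone who wants to check for duplicates without a hard-coded list of ids, here is a minimal sketch of a more general check. It assumes only that the results are a list of dicts with "task_id" and "MW" keys, as in the script above. The helper name find_duplicate_task_ids and the sample records are hypothetical and stand in for the real output of get_results({}).

```python
from collections import defaultdict

def find_duplicate_task_ids(molecules):
    """Map each task_id that occurs more than once to the entries sharing it."""
    by_id = defaultdict(list)
    for molecule in molecules:
        by_id[molecule["task_id"]].append(molecule)
    # Keep only the ids that are shared by two or more entries.
    return {task_id: entries for task_id, entries in by_id.items() if len(entries) > 1}

# Hypothetical sample records; replace with results = get_results({}).
sample = [
    {"task_id": "mol-1", "MW": 18.02},
    {"task_id": "mol-2", "MW": 44.01},
    {"task_id": "mol-1", "MW": 32.04},
]

duplicates = find_duplicate_task_ids(sample)
for task_id, entries in duplicates.items():
    print("task_id:", task_id, "; MWs:", [entry["MW"] for entry in entries])
```

If the molecular weights listed for one task_id differ, the entries are probably different molecules rather than repeated copies of the same record.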

Sam_Blau · March 3, 2021, 10:54pm

Hi Konstantin,

My apologies for the delayed response. Thank you for bringing this to our attention. I’m working with some other MP developers to resolve the issue and will let you know.

Sincerely,
Sam