HTTPS Connection Error

I am running the following code with my API key:

with MPRester(api_key="my api key") as mpr:
    docs1 = mpr.thermo.search(fields=["nsites", "composition", "volume", "symmetry", "formula_pretty", "material_id", "last_updated",
                                      "uncorrected_energy_per_atom", "energy_per_atom", "formation_energy_per_atom", "is_stable", "deprecated", "deprecation_reasons"])

I was running it a month ago with no problems, but for the last two days I have been getting this error:

MPRestError: HTTPSConnectionPool(host='api.materialsproject.org', port=443): Max retries exceeded with url: /materials/thermo/?_limit=1000&_fields=nsites%2Ccomposition%2Cvolume%2Csymmetry%2Cformula_pretty%2Cmaterial_id%2Clast_updated%2Cuncorrected_energy_per_atom%2Cenergy_per_atom%2Cformation_energy_per_atom%2Cis_stable%2Cdeprecated%2Cdeprecation_reasons&_skip=182000 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='api.materialsproject.org', port=443): Read timed out. (read timeout=20)"))
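Read timeouts like this are often transient server-side hiccups. One workaround, not an official mp-api feature (the `retry_with_backoff` helper below is a hypothetical sketch), is to wrap the search call in a retry loop with exponential backoff:

```python
import time


def retry_with_backoff(func, max_attempts=5, base_delay=2.0, retry_on=(Exception,)):
    """Call func(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts; re-raise the last error
            time.sleep(base_delay * 2 ** attempt)  # wait 2 s, 4 s, 8 s, ...


# Hypothetical usage against the thermo endpoint:
# with MPRester(api_key="my api key") as mpr:
#     docs1 = retry_with_backoff(
#         lambda: mpr.thermo.search(fields=["material_id", "formation_energy_per_atom"]),
#         retry_on=(Exception,),  # ideally narrow this to the client's timeout errors
#     )
```

This does not fix the underlying server issue, but it lets an overnight pull survive a few sporadic timeouts instead of dying partway through.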

Thanks for letting us know. Our API started seeing significantly increased traffic due to inefficient usage by another user about 15 hours ago. The issue should be resolved now. Sorry for the inconvenience.

Hi Patrick,

Thanks for your reply. I am still getting the same error today.

Hi Cassie,

Sorry for the trouble. Unfortunately, we can't reproduce the issue: we're able to retrieve all 342588 documents in about 6 minutes without a problem (@munrojm). For now, my only advice is to make sure you're running the latest mp-api client and to keep trying. If the issue persists, please let us know.

There are also some tips and tricks for large downloads here. HTH.

thanks,
Patrick

Hi Patrick,

Unfortunately, the issue is persisting. A couple of times a year I normally pull the fields ["nsites", "composition", "volume", "symmetry", "formula_pretty", "material_id", "last_updated", "uncorrected_energy_per_atom", "energy_per_atom", "formation_energy_per_atom", "is_stable", "structure", "theoretical"] from mpr.summary.search. I can still pull those right now with summary, but back in July Jason said that the new r2SCAN data was only available through the thermo endpoint and that you were working on releasing it to the summary endpoint with the next data release (to be fair, that might have happened already and I missed it). So I am trying to pull most of these fields from the thermo endpoint. I can pull from the thermo endpoint, but only for a few material IDs. If I try to pull everything, I get to about 41% and then hit the read timeout error:

MPRestError: HTTPSConnectionPool(host='api.materialsproject.org', port=443): Max retries exceeded with url: /materials/thermo/?_limit=1000&_fields=nsites%2Ccomposition%2Cvolume%2Csymmetry%2Cformula_pretty%2Cmaterial_id%2Clast_updated%2Cuncorrected_energy_per_atom%2Cenergy_per_atom%2Cformation_energy_per_atom%2Cis_stable&_skip=140000 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='api.materialsproject.org', port=443): Read timed out. (read timeout=20)"))

I tried a bunch of different things to get it to work but still have had no luck.
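One way to make a pull like this more robust (a sketch, not an official recommendation; `chunked` is a hypothetical helper) is to fetch the lightweight list of material IDs once, split it into batches, and query the thermo endpoint per batch, so a single timeout only costs one batch instead of the whole download:

```python
def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical usage: each batch can be retried independently on timeout.
# with MPRester(api_key="my api key") as mpr:
#     all_ids = [d.material_id for d in mpr.thermo.search(fields=["material_id"])]
#     docs = []
#     for batch in chunked(all_ids, 5000):
#         docs.extend(mpr.thermo.search(material_ids=batch,
#                                       fields=["material_id", "formation_energy_per_atom"]))
```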

Best Regards,

Cassie

Thank you for reporting back! We took a closer look and noticed that an index in our database was missing after a recent data patch. We double-checked your query on our end (elapsed time ~3.5min) and are confident that the timeout issue is fixed now. Please let us know if that isn’t the case for you. Thanks!

Hi Patrick, I am back again after a few months' delay. After your last message everything was working fine until now. I am once again getting a timeout error when using the thermo endpoint. Here is the error:

HTTPSConnectionPool(host='api.materialsproject.org', port=443): Read timed out.

Here is my code again:

with MPRester(api_key="PLEASE-DONT-SHARE-YOUR-APIKEY!!!!!") as mpr:
    docs1 = mpr.thermo.search(fields=["nsites", "composition", "volume", "symmetry", "formula_pretty", "material_id", "last_updated",
    "uncorrected_energy_per_atom", "energy_per_atom", "formation_energy_per_atom", "is_stable", "energy_type","entry_types"])
    #list_of_available_fields = mpr.thermo.available_fields
print("docs1 is done")

I have three questions:

  1. Can you help me with the timeout error?
  2. I have to pull basically your whole dataset a couple of times a year; is there a better way to do it?
  3. I am pulling some data from the summary endpoint, which is working fine, and some data from the thermo endpoint because I want the r2SCAN data. Is the r2SCAN data going to be added to the summary endpoint anytime soon?

Thanks for your help. Hope you are having a good start to your New Year.

Hi Cassie,

thanks for reaching out again.

  1. The timeout error should only be temporary. It can happen during the midnight hours (pacific time) when traffic from Asia is at its peak. We’ve also been fighting new scrapers, botnets, and abusive traffic to our website over the last couple of weeks :frowning: We are working on making these endpoints more resilient by integrating the mp-api client with our AWS OpenData repositories (see #2).
  2. If you’re only interested in downloading the entire thermo dataset (or as backup for any MP data retrieval), you can do so directly through our OpenData repos. See our docs and bucket browser for more info. The code snippet below is likely the fastest way to download all the thermo data for a specific database version. In the future, the mp-api client will hopefully make this seamless for our users.
  3. Yes, I’d expect the summary endpoint to almost always work fine since it is our most performant and most heavily used endpoint. I’m tagging @munrojm to comment on the r2SCAN data in the summary endpoint.
aws s3 cp --no-sign-request --recursive \
    s3://materialsproject-build/collections/2023-11-01/thermo/ \
    mp-thermo/
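Once the files are on disk, they can be parsed with the standard library alone. The sketch below assumes the build files are gzipped JSON-lines (one document per line, `*.jsonl.gz`); check the bucket browser for the actual layout of the collection you download, since file naming and format are assumptions here:

```python
import gzip
import json
from pathlib import Path


def load_jsonl_gz(path):
    """Yield one parsed JSON document per non-empty line of a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_thermo_dir(directory):
    """Load every *.jsonl.gz file under the downloaded mp-thermo/ directory."""
    docs = []
    for path in sorted(Path(directory).glob("*.jsonl.gz")):
        docs.extend(load_jsonl_gz(path))
    return docs
```

Streaming line by line like this keeps memory bounded per file, which matters when the full thermo collection is hundreds of thousands of documents.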

HTH
Patrick