MPRester through proxy server

Hello,
I’m trying to use the MPRester in Pymatgen from a national lab computing resource (the login nodes), which proxies all HTTPS connections. However, attempts to retrieve Materials Project data produce the errors:

requests.exceptions.ProxyError: HTTPSConnectionPool(host='materialsproject.org', port=443): Max retries exceeded with url: /rest/v2/query (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden',)))

pymatgen.ext.matproj.MPRestError: HTTPSConnectionPool(host='materialsproject.org', port=443): Max retries exceeded with url: /rest/v2/query (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden',)))

I have the URL of the lab’s HTTPS proxy server; is there a way I can connect to Materials Project through this proxy?

Thanks

Hi,
I always do this. You should be able to connect through the proxy if the environment variables https_proxy and http_proxy are set properly. See https://www.golinuxcloud.com/set-up-proxy-http-proxy-environment-variable/

From Python you can also do this with:

import os
# Use the scheme your proxy itself speaks (often http://, even for https_proxy).
os.environ['http_proxy'] = "http://user:passwd@host:port"
os.environ['https_proxy'] = "https://user:passwd@host:port"
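
To confirm that Python actually sees those variables, a quick check with the standard library can help. This is only a sketch: `show_proxies` is a hypothetical helper name and `proxy.example:8080` is a placeholder address, not a real server. As far as I know, requests relies on the same standard-library lookup when no explicit `proxies` argument is passed.

```python
import os
import urllib.request


def show_proxies():
    """Return the proxy settings Python picks up from the environment."""
    # getproxies() reads variables such as http_proxy / https_proxy.
    return urllib.request.getproxies()


# Placeholder proxy address (an assumption); substitute your own.
os.environ["http_proxy"] = "http://proxy.example:8080"
os.environ["https_proxy"] = "http://proxy.example:8080"

proxies = show_proxies()
print(proxies.get("http"), proxies.get("https"))
```

If either value comes back as None, the variable is not visible to the Python process, for example because it was set in a different shell or profile.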

Hope it helps.
Regards
Sandip

Thanks for the advice. However, I have already done this by setting the proxy in my .bash_profile. I can use utilities such as pip through the proxy server; it’s only Materials Project that is not accepting the connection.

Which national lab computing resource? I haven’t seen this before. I imagine it might be difficult to investigate without interactive access, but I can ask a colleague to look into it.

The Badger cluster at Los Alamos National Laboratory. I’m currently working with LANL tech support to try to figure it out, but so far we haven’t made any progress.

I’d start with using the requests library and trying to ping https://materialsproject.org/rest/v2/api_check. If you can find a way to get that to work via a proxy in Python then we’d have a route forward. I’m not sure if I know anyone with access to Badger right now but I’ll ask around.

I tried:
import requests
requests.get('https://materialsproject.org/rest/v2/api_check', proxies=[LANL proxy info])
and got the same 403 error message. Trying to use the UNIX curl utility with https://www.materialsproject.org (or any other website, for that matter) also gives a 403 error. I can still do some things through the proxy, like use pip to install packages, but it might be a LANL-specific issue if it’s blocking such a wide variety of sites and functions. I’ll update this thread if I can figure things out with LANL tech support.
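
The "Tunnel connection failed: 403 Forbidden" text in these errors is the message Python's http.client raises when a proxy answers a CONNECT request with a non-200 status. A minimal sketch for isolating that step, so the proxy's answer can be shown to tech support without requests or pymatgen in the picture, might look like the following. The function name `try_connect_tunnel` and the proxy host and port in the usage comment are hypothetical placeholders.

```python
import http.client


def try_connect_tunnel(proxy_host, proxy_port, target_host,
                       target_port=443, timeout=10):
    """Ask the proxy to open a CONNECT tunnel and report its answer."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    # Register the tunnel target; connect() then sends
    # "CONNECT target_host:target_port" to the proxy.
    conn.set_tunnel(target_host, target_port)
    try:
        conn.connect()
    except OSError as exc:
        # Includes "Tunnel connection failed: <code> <reason>" refusals.
        return f"refused: {exc}"
    finally:
        conn.close()
    return "tunnel established"


# Usage (placeholder proxy address, an assumption):
# print(try_connect_tunnel("proxy.example", 8080, "materialsproject.org"))
```

A "refused" result with a 403 would suggest the proxy itself is rejecting the destination or the request, rather than Materials Project rejecting the connection.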

Yes, perhaps it is a LANL-specific issue. It could be that HTTPS is causing the problem(?). Unfortunately, I don’t think we support requests over HTTP alone.

It’s actually likely HTTPS. You can’t have an HTTPS connection out from the client; the proxy has to initiate that. Basically, your request would be to http://materialsproject.org/ and the proxy server then has to convert it to an HTTPS request when it initiates the connection. HTTPS basically guarantees that no one is in between the client and the server, but you’re putting a proxy server in between, so it has to act as the HTTPS client.

Hello,

I have a slightly different error message, pointing at SSL verification.
I am behind a corporate proxy and using Jupyter on a Windows machine.
I can ping materialsproject.org with requests (note the InsecureRequestWarning message):

import os
import requests

PMG_MAPI_KEY = '******'

os.environ['http_proxy'] = 'http://uname:pwd@address:port'
os.environ['https_proxy'] = 'http://uname:pwd@address:port'

# verify=False skips certificate verification (hence the warning below)
response = requests.get(
    "https://materialsproject.org/rest/v2/api_check",
    params={"API_KEY": PMG_MAPI_KEY},
    verify=False,
)
print(response.text)
C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\connectionpool.py:1043: InsecureRequestWarning: Unverified HTTPS request is being made to host '150.45.87.133'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
{"valid_response": true, "response": {"api_key_valid": true, "version": {"db": "2020_09_08", "pymatgen": "2022.0.8", "rest": "2.0"}}}

but I cannot get the MPRester connected:

from pymatgen.ext.matproj import MPRester
with MPRester(api_key=PMG_MAPI_KEY, notify_db_version=False) as mpr:
    print( mpr.get_materials_ids("TaC") )
---------------------------------------------------------------------------
SSLCertVerificationError                  Traceback (most recent call last)
File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\connectionpool.py:700, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    699 if is_new_proxy_conn and http_tunnel_required:
--> 700     self._prepare_proxy(conn)
    702 # Make the request on the httplib connection object.

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\connectionpool.py:994, in HTTPSConnectionPool._prepare_proxy(self, conn)
    992     conn.tls_in_tls_required = True
--> 994 conn.connect()

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\connection.py:414, in HTTPSConnection.connect(self)
    412     context.load_default_certs()
--> 414 self.sock = ssl_wrap_socket(
    415     sock=conn,
    416     keyfile=self.key_file,
    417     certfile=self.cert_file,
    418     key_password=self.key_password,
    419     ca_certs=self.ca_certs,
    420     ca_cert_dir=self.ca_cert_dir,
    421     ca_cert_data=self.ca_cert_data,
    422     server_hostname=server_hostname,
    423     ssl_context=context,
    424     tls_in_tls=tls_in_tls,
    425 )
    427 # If we're using all defaults and the connection
    428 # is TLSv1 or TLSv1.1 we throw a DeprecationWarning
    429 # for the host.

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\util\ssl_.py:449, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
    448 if send_sni:
--> 449     ssl_sock = _ssl_wrap_socket_impl(
    450         sock, context, tls_in_tls, server_hostname=server_hostname
    451     )
    452 else:

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\util\ssl_.py:493, in _ssl_wrap_socket_impl(sock, ssl_context, tls_in_tls, server_hostname)
    492 if server_hostname:
--> 493     return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
    494 else:

File C:\ProgramData\Anaconda3\envs\py38_64\lib\ssl.py:500, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
    494 def wrap_socket(self, sock, server_side=False,
    495                 do_handshake_on_connect=True,
    496                 suppress_ragged_eofs=True,
    497                 server_hostname=None, session=None):
    498     # SSLSocket class handles server_hostname encoding before it calls
    499     # ctx._wrap_socket()
--> 500     return self.sslsocket_class._create(
    501         sock=sock,
    502         server_side=server_side,
    503         do_handshake_on_connect=do_handshake_on_connect,
    504         suppress_ragged_eofs=suppress_ragged_eofs,
    505         server_hostname=server_hostname,
    506         context=self,
    507         session=session
    508     )

File C:\ProgramData\Anaconda3\envs\py38_64\lib\ssl.py:1040, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
   1039             raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1040         self.do_handshake()
   1041 except (OSError, ValueError):

File C:\ProgramData\Anaconda3\envs\py38_64\lib\ssl.py:1309, in SSLSocket.do_handshake(self, block)
   1308         self.settimeout(None)
-> 1309     self._sslobj.do_handshake()
   1310 finally:

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\requests\adapters.py:489, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    488 if not chunked:
--> 489     resp = conn.urlopen(
    490         method=request.method,
    491         url=url,
    492         body=request.body,
    493         headers=request.headers,
    494         redirect=False,
    495         assert_same_host=False,
    496         preload_content=False,
    497         decode_content=False,
    498         retries=self.max_retries,
    499         timeout=timeout,
    500     )
    502 # Send the request.
    503 else:

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\connectionpool.py:785, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    783     e = ProtocolError("Connection aborted.", e)
--> 785 retries = retries.increment(
    786     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    787 )
    788 retries.sleep()

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\urllib3\util\retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPSConnectionPool(host='materialsproject.org', port=443): Max retries exceeded with url: /rest/v2/materials/TaC/mids (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))

During handling of the above exception, another exception occurred:

SSLError                                  Traceback (most recent call last)
File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\pymatgen\ext\matproj.py:263, in MPRester._make_request(self, sub_url, payload, method, mp_decode)
    262 else:
--> 263     response = self.session.get(url, params=payload, verify=True)
    264 if response.status_code in [200, 400]:

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\requests\sessions.py:600, in Session.get(self, url, **kwargs)
    599 kwargs.setdefault("allow_redirects", True)
--> 600 return self.request("GET", url, **kwargs)

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\requests\sessions.py:587, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    586 send_kwargs.update(settings)
--> 587 resp = self.send(prep, **send_kwargs)
    589 return resp

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\requests\sessions.py:701, in Session.send(self, request, **kwargs)
    700 # Send the request
--> 701 r = adapter.send(request, **kwargs)
    703 # Total elapsed time of the request (approximately)

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\requests\adapters.py:563, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    561 if isinstance(e.reason, _SSLError):
    562     # This branch is for urllib3 v1.22 and later.
--> 563     raise SSLError(e, request=request)
    565 raise ConnectionError(e, request=request)

SSLError: HTTPSConnectionPool(host='materialsproject.org', port=443): Max retries exceeded with url: /rest/v2/materials/TaC/mids (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))

During handling of the above exception, another exception occurred:

MPRestError                               Traceback (most recent call last)
Input In [6], in <cell line: 2>()
      1 from pymatgen.ext.matproj import MPRester
      2 with MPRester(api_key=PMG_MAPI_KEY, notify_db_version=False) as mpr:
----> 3     print( mpr.get_materials_ids("TaC") )

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\pymatgen\ext\matproj.py:362, in MPRester.get_materials_ids(self, chemsys_formula)
    351 def get_materials_ids(self, chemsys_formula):
    352     """
    353     Get all materials ids for a formula or chemsys.
    354 
   (...)
    360         ([str]) List of all materials ids.
    361     """
--> 362     return self._make_request(f"/materials/{chemsys_formula}/mids", mp_decode=False)

File C:\ProgramData\Anaconda3\envs\py38_64\lib\site-packages\pymatgen\ext\matproj.py:279, in MPRester._make_request(self, sub_url, payload, method, mp_decode)
    277 except Exception as ex:
    278     msg = f"{ex}. Content: {response.content}" if hasattr(response, "content") else str(ex)
--> 279     raise MPRestError(msg)

MPRestError: HTTPSConnectionPool(host='materialsproject.org', port=443): Max retries exceeded with url: /rest/v2/materials/TaC/mids (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))

Could anyone advise?
Thanks
Marco

I have found a couple of solutions.

Either remove the authentication from the proxy URLs:

os.environ['http_proxy'] = 'http://address:port'
os.environ['https_proxy'] = 'http://address:port'

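or set CURL_CA_BUNDLE to an empty string, which disables certificate verification in older versions of requests:

import os
# Set it in the Python process itself; a `!` shell command does not change the kernel's environment.
os.environ['CURL_CA_BUNDLE'] = ''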
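
If the certificate error comes from the proxy re-signing HTTPS traffic with an organisation-specific CA (an assumption; the traceback alone does not confirm it), an alternative that keeps verification switched on is to point requests at a CA bundle containing that certificate through the REQUESTS_CA_BUNDLE environment variable. A minimal sketch, where `use_ca_bundle` is a hypothetical helper and the bundle path in the usage comment is a placeholder for a file you would obtain from your IT department:

```python
import os


def use_ca_bundle(path):
    """Make requests-based clients trust the CA bundle at `path`."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"CA bundle not found: {path}")
    # requests consults REQUESTS_CA_BUNDLE when verify is True or unset.
    os.environ["REQUESTS_CA_BUNDLE"] = path
    return path


# Usage (placeholder path, an assumption):
# use_ca_bundle(r"C:\certs\corporate-ca-bundle.pem")
```

This would need to run in the same Python process, before the MPRester is created. Whether MPRester's session picks it up should be checked against the installed pymatgen and requests versions.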