CIF and JSON file structures


I am new to DFT calculations, but I do have some experience with Python programming, so please excuse my naive questions. I have tried to find answers on the web, but no luck so far.

  1. Where can I find the exact meaning of each entry in the .cif and .json files? For example, what are the numbers under the ‘matrix’ key? Some things I can guess, like the lattice vectors a, b, c and the angles \alpha, \beta, etc.

  2. Eventually, I want to create my own .cif file for a material I am studying, say single-layer GeS, and then convert it to JSON. The reason is that the program I am using only reads .json files. Is there an easy way to do that?


Hi Ben,

The JSON files we use in the Materials Project follow a schema used in the pymatgen code. More specifically, the file keys correspond to the inputs needed to construct a pymatgen Structure object (sometimes with more or less information, depending on the verbosity you choose for the output). “matrix” is a 3×3 matrix whose rows are the lattice vectors of the unit cell, given as Cartesian components in ångströms.
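To connect the “matrix” rows to the a, b, c and angle values that appear in a CIF, here is a minimal sketch using only the Python standard library. The function name `lattice_parameters` and the example cell are illustrative assumptions (an idealized hexagonal cell, not real GeS data); pymatgen itself computes these quantities for you.

```python
import math

def lattice_parameters(matrix):
    """Return (a, b, c, alpha, beta, gamma) for a 3x3 matrix
    whose ROWS are the lattice vectors. Angles are in degrees."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def angle(u, v):
        cos = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
        return math.degrees(math.acos(cos))

    va, vb, vc = matrix
    # Crystallographic convention: alpha is the angle between b and c,
    # beta between a and c, gamma between a and b.
    return (norm(va), norm(vb), norm(vc),
            angle(vb, vc), angle(va, vc), angle(va, vb))

# Hypothetical example cell (illustrative numbers only)
example_matrix = [
    [3.0, 0.0, 0.0],
    [-1.5, 2.598076211353316, 0.0],
    [0.0, 0.0, 20.0],
]
a, b, c, alpha, beta, gamma = lattice_parameters(example_matrix)
print(a, b, c, alpha, beta, gamma)
```

The lengths of the rows give a, b, c, and the angles between pairs of rows give \alpha, \beta, \gamma.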

I don’t know as much about the CIF standard, but you can find a pretty explicit description here.

Pymatgen has a pretty robust library for converting to and from file formats. Here’s a snippet that would achieve what you’ve described in (2). Note that you can also output any of the structures you get from our REST interface to these formats similarly.

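# Newer pymatgen releases require importing Structure from pymatgen.core
from pymatgen.core import Structure

# Read the CIF into a Structure, then write it back out as JSON
ges_structure = Structure.from_file("GeS.cif")
ges_structure.to(filename="GeS.json")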

Thanks! Very helpful.

Hi Ben,

I have a problem reading “.cif” files using the method you mentioned. I use Structure.from_file(“filename.cif”) but get the following error.

Traceback (most recent call last):
  File "", line 1, in
    strc = Structure.from_file("mp-23939H2O2.cif")
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/core/", line 1572, in from_file
    merge_tol=merge_tol)
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/core/", line 1510, in from_str
    parser = CifParser.from_string(input_string)
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/io/", line 367, in from_string
    return CifParser(stream, occupancy_tolerance)
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/io/", line 306, in __init__
    self.cif = CifFile.from_string(
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/io/", line 278, in from_string
    c = CifBlock.from_string("data" + x)
  File "/Users/Hemanta/anaconda/lib/python3.6/site-packages/pymatgen/io/", line 237, in from_string
    assert len(items) % n == 0
AssertionError

However, I can convert those CIFs to POSCAR and read the structure from the POSCAR.
Could you please give me some feedback on this? Thanks in advance.

Hi Hemanta,

Can you give us a bit more detail on how you got the CIF? Usually errors in CIF parsing are the result of an improperly formatted CIF, but I suspect yours came from MP, so if you can let us know how you got it, we’ll try to debug the issue.

Hi Joseph,

Thank you so much for the clarification. Actually, the problem was caused by the CIF file I downloaded. At the end of the CIF there is one extra line containing the MPID; when I remove that line and use the script mentioned above, it works.
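For anyone hitting the same error, the manual fix can be scripted. This is only a sketch under an assumption: the thread does not show what the stray line looks like, so the pattern below guesses that it is a bare ID such as "mp-23939" on its own line. The names `STRAY_ID` and `strip_trailing_id` are hypothetical; adjust the pattern to match what is actually in your file.

```python
import re

# ASSUMPTION: the stray last line is a bare Materials Project ID, e.g. "mp-23939".
# The real file may differ; inspect it and adapt this pattern.
STRAY_ID = re.compile(r"^\s*mp-\d+\s*$")

def strip_trailing_id(cif_text):
    """Drop trailing blank lines and trailing lines that look like a bare MPID."""
    lines = cif_text.splitlines()
    while lines and (not lines[-1].strip() or STRAY_ID.match(lines[-1])):
        lines.pop()
    return "\n".join(lines) + "\n"

# Tiny made-up CIF fragment, for illustration only
sample = "data_H2O2\n_cell_length_a 1.0\nmp-23939\n"
cleaned = strip_trailing_id(sample)
print(cleaned)
```

The cleaned text can then be written back to a file and read with Structure.from_file as before.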