Quickstart Tutorial
Once you have installed parsnip, most workflows involve reading a CIF file. Let’s assume we have the file my_file.cif in the current directory, and these are its contents:
data_cif_file
_journal_year 1999
_journal_page_first 0
_journal_page_last 123
_chemical_name_mineral 'Copper FCC'
_chemical_formula_sum 'Cu'
_cell_length_a 3.6
_cell_length_b 3.6
_cell_length_c 3.6
_cell_angle_alpha 90.0
_cell_angle_beta 90.0
_cell_angle_gamma 90.0
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_type_symbol
_atom_site_Wyckoff_label
Cu1 0.0000000000 0.0000000000 0.0000000000 Cu a
_symmetry_space_group_name_H-M 'Fm-3m'
# Note that this table is only a subset of the full symmetry of the crystal, but
# it is sufficient to reconstruct the unit cell.
loop_
_symmetry_equiv_pos_site_id
_symmetry_equiv_pos_as_xyz
1 x,y,z
96 z,y+1/2,x+1/2
118 z+1/2,-y,x+1/2
192 z+1/2,y+1/2,x
Reading Keys
Now, let’s extract the key-value pairs from our CIF file. This subset of data usually contains the information needed to reconstruct the system’s unit cell, and describes the origin of the data.
>>> from parsnip import CifFile
>>> filename = "my_file.cif"
>>> cif = CifFile(filename)
>>> cif.pairs
{'_journal_year': '1999',
'_journal_page_first': '0',
'_journal_page_last': '123',
'_chemical_name_mineral': "'Copper FCC'",
'_chemical_formula_sum': "'Cu'",
'_cell_length_a': '3.6',
'_cell_length_b': '3.6',
'_cell_length_c': '3.6',
'_cell_angle_alpha': '90.0',
'_cell_angle_beta': '90.0',
'_cell_angle_gamma': '90.0',
'_symmetry_space_group_name_H-M': "'Fm-3m'"}
A dict-like getter syntax is provided for key-value pairs. Single keys are looked up
as in a Python dict, while lists of keys return lists of values. Unlike a dict, keys
not present in the pairs dict return None instead of raising a KeyError.
>>> cif["_journal_year"]
'1999'
>>> assert cif["_not_in_pairs"] is None
# Multiple keys can be accessed simultaneously!
>>> cif[["_cell_length_a", "_cell_length_b", "_cell_length_c"]]
['3.6', '3.6', '3.6']
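The lookup rules above can be mimicked in a few lines of plain Python. This is only a sketch: `PairLookup` is a hypothetical stand-in, not parsnip's implementation, and returning None for a missing key inside a list is an assumption of this sketch rather than documented behavior.

```python
class PairLookup:
    """Hypothetical stand-in for the lookup rules described above (not parsnip code)."""

    def __init__(self, pairs):
        self._pairs = dict(pairs)

    def __getitem__(self, key):
        # A list (or tuple) of keys returns a list of values, one per key.
        if isinstance(key, (list, tuple)):
            return [self._pairs.get(k) for k in key]
        # A single key returns its value, or None when the key is absent.
        return self._pairs.get(key)


lookup = PairLookup({"_journal_year": "1999", "_cell_length_a": "3.6"})
year = lookup["_journal_year"]
missing = lookup["_not_in_pairs"]
# Assumption of this sketch: absent keys inside a list also map to None.
mixed = lookup[["_cell_length_a", "_not_in_pairs"]]
```

Returning None rather than raising lets a script probe for optional keys without wrapping every access in a try block.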
Note that all data is stored and returned as strings by default. Because numeric
conversions may be lossy, values are not converted automatically. Setting the
cast_values property to True reprocesses the data, converting to float or int where
possible. Once data is reprocessed, a new CifFile object must be created to restore
the original string data.
>>> cif.cast_values = True # Reprocess our `pairs` dict
>>> cif["_journal_year"]
1999
>>> cif[["_cell_length_a", "_cell_length_b", "_cell_length_c"]]
[3.6, 3.6, 3.6]
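To illustrate what "converting to float or int where possible" can mean, and why the conversion is lossy, here is a minimal sketch in plain Python. The helper `cast_value` is hypothetical and is not how parsnip implements cast_values; it only shows the general try-int-then-float pattern.

```python
def cast_value(raw):
    """Convert a CIF value string to int or float when possible (illustrative only)."""
    for convert in (int, float):
        try:
            return convert(raw)
        except ValueError:
            continue
    return raw  # Leave non-numeric strings (e.g. quoted names) untouched.


year = cast_value("1999")
length = cast_value("3.6")
name = cast_value("'Copper FCC'")
# Casting is lossy: the ten written decimal places cannot be recovered from 0.0.
coordinate = cast_value("0.0000000000")
```

Once `"0.0000000000"` becomes `0.0`, the number of digits recorded in the file is gone, which is why keeping strings as the default is the conservative choice.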
Reading Tables
CIF files store tables in loop_-delimited blocks. These structures begin with a list of column labels (in a format similar to the keys above), followed by space-delimited data.
This segment of the file shown above contains the table data, with 6 columns and 1 row:
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_type_symbol
_atom_site_Wyckoff_label
Cu1 0.0000000000 0.0000000000 0.0000000000 Cu a
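The layout of such a block (labels first, then whitespace-delimited rows) can be sketched with a small stand-alone parser. This is an illustration, not parsnip's parser: `parse_loop` is a hypothetical helper, and it assumes each row sits on one line and no value contains quoted spaces.

```python
LOOP_TEXT = """\
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_type_symbol
_atom_site_Wyckoff_label
Cu1 0.0000000000 0.0000000000 0.0000000000 Cu a
"""


def parse_loop(text):
    """Split one loop_ block into column labels and rows of string values."""
    labels, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_") and not rows:
            labels.append(line)  # Header section: one column label per line.
        else:
            rows.append(line.split())  # Data section: whitespace-delimited values.
    return labels, rows


labels, rows = parse_loop(LOOP_TEXT)
# Build a column-oriented view, keyed by label, similar in spirit to a table lookup.
columns = {label: [row[i] for row in rows] for i, label in enumerate(labels)}
```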
Now, let’s read the table. parsnip stores data as NumPy structured arrays, which
allow dict-like access to data columns. The loops property returns a
list of such arrays, although the get_from_loops method is often more
convenient.
>>> len(cif.loops)
2
>>> cif.loops[0]
array([[('Cu1', '0.0000000000', '0.0000000000', '0.0000000000', 'Cu', 'a')]],
dtype=[('_atom_site_label', '<U12'),
('_atom_site_fract_x', '<U12'),
('_atom_site_fract_y', '<U12'),
('_atom_site_fract_z', '<U12'),
('_atom_site_type_symbol', '<U12'),
('_atom_site_Wyckoff_label', '<U12')])
>>> cif.loops[0]["_atom_site_label"]
array([['Cu1']], dtype='<U12')
# (Unstructured) slices of tables can be easily accessed!
>>> xyz = cif.get_from_loops(["_atom_site_fract_x", "_atom_site_fract_y", "_atom_site_fract_z"])
>>> xyz
array([['0.0000000000', '0.0000000000', '0.0000000000']], dtype='<U12')
>>> xyz.astype(float)
array([[0., 0., 0.]])
Building Unit Cells
CIF files are commonly used to reconstruct atomic positions for a particular crystal. While the example file shown throughout this tutorial corresponds to FCC copper, it only contains a single atomic position, in contrast to the 4 expected for FCC’s conventional cubic unit cell. parsnip can reconstruct tilable unit cells from the symmetry operations and symmetry-irreducible (Wyckoff) positions contained in the file.
Cu1 0.0000000000 0.0000000000 0.0000000000 Cu a
# Note that this table is only a subset of the full symmetry of the crystal, but
# it is sufficient to reconstruct the unit cell.
loop_
_symmetry_equiv_pos_site_id
_symmetry_equiv_pos_as_xyz
1 x,y,z
96 z,y+1/2,x+1/2
118 z+1/2,-y,x+1/2
192 z+1/2,y+1/2,x
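To see how four symmetry operations turn one Wyckoff position into four sites, the operations can be applied by hand. The sketch below is not parsnip's algorithm: `apply_op` is a hypothetical helper that only understands terms of the form `x`, `-y`, or `1/2`, evaluates them exactly with `fractions.Fraction`, and wraps the result back into the unit cell.

```python
import re
from fractions import Fraction

SYMMETRY_OPS = ["x,y,z", "z,y+1/2,x+1/2", "z+1/2,-y,x+1/2", "z+1/2,y+1/2,x"]
# One term: an optional sign, then a fraction, an integer, or a coordinate name.
TERM = re.compile(r"([+-]?)(\d+/\d+|\d+|[xyz])")


def apply_op(op, point):
    """Evaluate one 'x,y,z'-style operation at a fractional point, wrapped into [0, 1)."""
    coords = dict(zip("xyz", point))
    image = []
    for component in op.replace(" ", "").split(","):
        total = Fraction(0)
        for sign, token in TERM.findall(component):
            value = coords[token] if token in coords else Fraction(token)
            total += -value if sign == "-" else value
        image.append(total % 1)  # Wrap back into the unit cell.
    return tuple(image)


wyckoff_site = (Fraction(0), Fraction(0), Fraction(0))
# dict.fromkeys drops duplicate images while preserving their order.
positions = list(dict.fromkeys(apply_op(op, wyckoff_site) for op in SYMMETRY_OPS))
```

Exact fractions avoid the floating-point comparisons that would otherwise be needed to detect duplicate images of the same site.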
Only one line is required to build a tilable unit cell! The positions returned here
are in fractional coordinates, and can be imported into tools like freud to rapidly
build out supercells. For absolute coordinates, matrix-multiply the fractional
coordinates by the transpose of the cell’s lattice_vectors.
>>> pos = cif.build_unit_cell()
>>> print(pos)
[[0. 0. 0. ]
[0. 0.5 0.5]
[0.5 0. 0.5]
[0.5 0.5 0. ]]
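The conversion to absolute coordinates can be written out without any libraries. This sketch assumes the lattice vectors are stored as the columns of a 3x3 matrix (for the cubic example cell the matrix is diagonal, so the convention does not change the result); `to_absolute` is a hypothetical helper, and the matrix is typed in by hand rather than read from a CifFile.

```python
# Lattice vectors of the cubic example cell, assumed stored as matrix columns.
a = 3.6
lattice_vectors = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
fractional = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]


def to_absolute(frac_positions, vectors):
    """Compute frac_positions @ transpose(vectors) using plain Python lists."""
    return [
        [sum(f[j] * vectors[i][j] for j in range(3)) for i in range(3)]
        for f in frac_positions
    ]


absolute = to_absolute(fractional, lattice_vectors)
```

With NumPy arrays the same product is a single `@` between the position array and the transposed lattice matrix.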
Once freud is installed, crystal structures can be easily replicated!
>>> import freud
>>> import numpy as np
>>> box = freud.Box(*cif.box)
>>> uc = freud.data.UnitCell(box, basis_positions=pos)
>>> box, pos = uc.generate_system(num_replicas=2)
>>> assert len(pos) == 4 * 2**3
>>> np.testing.assert_allclose(box.L / 2, 3.6, atol=1e-15)