Reconstructing Noisy Unit Cells
Diffraction experiments and other experimental techniques for quantifying structure typically offer limited precision in the measurements that can be made. As a result, the Wyckoff position data recorded in some CIF files – particularly older ones – may make reproduction of the original structure challenging. In this example, we explore how parsnip’s build_unit_cell method can be tuned to accurately reproduce structures with complicated geometries, using alpha-Selenium as an example.
# A header describing this portion of the file
data_cif_Se-hP3
_chemical_name_mineral 'alpha-Selenium'
_chemical_formula_sum 'Se'
# Key-value pairs describing the unit cell (Å and °)
# Note the cell angles 90-90-120 indicate a hexagonal structure.
_cell_length_a 4.36620
_cell_length_b 4.36620
_cell_length_c 4.95360
_cell_angle_alpha 90.00000
_cell_angle_beta 90.00000
_cell_angle_gamma 120.00000
loop_
_space_group_symop_id
_space_group_symop_operation_xyz
1 x,y,z
2 -y,x-y,z+1/3
3 -x+y,-x,z+2/3
4 x-y,-y,-z+2/3
5 y,x,-z
6 -x,-x+y,-z+1/3
loop_
_atom_site_type_symbol
_atom_site_symmetry_multiplicity
_atom_site_Wyckoff_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Se 3 a 0.22540 0.00000 0.33333
Note that the basis positions for alpha-Selenium are provided to five decimal places of accuracy, while the symmetry operations are provided in a rational form.
>>> from parsnip import CifFile
>>> cif = CifFile("hP3.cif")
>>> # Let's make sure we reconstruct the unit cell's three atoms
>>> correct_uc = cif.build_unit_cell()
>>> correct_uc
array([[0.2254 , 0. , 0.33333 ],
[0. , 0.2254 , 0.66666333],
[0.7746 , 0.7746 , 0.99999667]])
>>> site_multiplicity = int(cif["_atom_site_symmetry_multiplicity"].squeeze())
>>> assert len(correct_uc) == site_multiplicity
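To see where those three rows come from, the six symmetry operations can be applied to the single basis position by hand. The sketch below is an illustration only and makes no claim about parsnip's internals: `SYMOPS` and `expand_site` are hypothetical names, the operations are transcribed manually from the CIF loop above, and duplicates are detected by rounding to a fixed number of decimal places and wrapping into the unit cell.

```python
# Illustrative sketch only; this is not parsnip's internal implementation.

# The six symmetry operations from the CIF above, transcribed by hand.
SYMOPS = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (-y, x - y, z + 1 / 3),
    lambda x, y, z: (-x + y, -x, z + 2 / 3),
    lambda x, y, z: (x - y, -y, -z + 2 / 3),
    lambda x, y, z: (y, x, -z),
    lambda x, y, z: (-x, -x + y, -z + 1 / 3),
]


def expand_site(site, n_decimal_places):
    """Apply every operation to one site, dropping images that coincide after rounding."""
    scale = 10**n_decimal_places
    unique = {}
    for op in SYMOPS:
        image = op(*site)
        # Integer key: round to the requested precision, then wrap into [0, 1).
        key = tuple(round(c * scale) % scale for c in image)
        # Keep the first unrounded image seen for each key.
        unique.setdefault(key, tuple(c % 1.0 for c in image))
    return list(unique.values())


atoms = expand_site((0.22540, 0.00000, 0.33333), n_decimal_places=4)
print(len(atoms))  # 3
```

Six operations acting on one site give six images, but they pair up into three distinct atoms once the coordinates are compared at four decimal places, which is consistent with the site multiplicity of 3.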
parsnip’s default settings are able to correctly reproduce the unit cell – but the mismatch between numerical data and the symmetry operation strings can cause issues. If we truncate the Wyckoff position data, even by one decimal place, the reconstructed crystal contains duplicate atoms:
--- hP3.cif
+++ hP3-four-decimal-places.cif
@@ -30,4 +30,4 @@
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
-Se 3 a 0.22540 0.00000 0.33333
+Se 3 a 0.2254 0.0000 0.3333
Rebuilding the crystal from the truncated file raises no exception, but the result contains five atoms instead of three:
>>> lower_precision_cif = CifFile("hP3-four-decimal-places.cif")
>>> uc = lower_precision_cif.build_unit_cell()
>>> uc
array([[0.2254 , 0. , 0.3333 ], # A
[0. , 0.2254 , 0.66663333], # B
[0.7746 , 0.7746 , 0.99996667], # C
[0.2254 , 0. , 0.33336667], # A
[0. , 0.2254 , 0.6667 ]]) # B
>>> uc.shape == correct_uc.shape # Our unit cell has duplicate atoms!
False
By default, parsnip uses four decimal places of accuracy to reconstruct crystals. This yields the best overall accuracy (tested with several thousand CIFs), but it is not the best choice for every file. A good rule of thumb is to use one fewer decimal place than the CIF file contains. This ensures positions are rounded coarsely enough to detect duplicate atoms while still separating Wyckoff positions that lie very close to one another in complex structures. For the truncated file above, which stores four decimal places, setting n_decimal_places=3 recovers the correct structure:
>>> cif = CifFile("hP3-four-decimal-places.cif")
>>> four_decimal_places = cif.build_unit_cell(n_decimal_places=3)
>>> four_decimal_places
array([[0.2254 , 0. , 0.3333 ],
[0. , 0.2254 , 0.66663333],
[0.7746 , 0.7746 , 0.99996667]])
>>> assert four_decimal_places.shape == correct_uc.shape
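The rule of thumb can be checked directly on the z coordinate of atom A. The snippet below is a minimal sketch: `suggested_decimal_places` is a hypothetical helper (not part of parsnip) that counts the digits after the decimal point in the CIF's coordinate strings and subtracts one, assuming the most precise string in the list reflects the file's precision.

```python
def suggested_decimal_places(fract_strings):
    """Hypothetical helper: one fewer than the most decimal places found."""

    def n_decimals(text):
        # Drop a trailing standard uncertainty such as "(3)" before counting.
        text = text.split("(")[0]
        return len(text.split(".")[1]) if "." in text else 0

    return max(0, max(n_decimals(s) for s in fract_strings) - 1)


n = suggested_decimal_places(["0.2254", "0.0000", "0.3333"])
print(n)  # 3

# Two images of atom A from the truncated file.
z_direct = 0.3333          # operation 1: z
z_image = -0.3333 + 2 / 3  # operation 4: -z+2/3

print(round(z_direct, 4) == round(z_image, 4))  # False
print(round(z_direct, n) == round(z_image, n))  # True
```

At four decimal places the two images round to 0.3333 and 0.3334 and are treated as different atoms; at three decimal places both round to 0.333 and are recognized as the same site.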
Important
Rounding of Wyckoff positions is an intermediate step in the unit cell reconstruction, and does not negatively impact the accuracy of the returned data. The unit cell is always returned in the full precision of the input CIF:
>>> import numpy as np
>>> cif = CifFile("hP3-four-decimal-places.cif")
>>> one_decimal_place = cif.build_unit_cell(n_decimal_places=1)
>>> np.testing.assert_array_equal(one_decimal_place, four_decimal_places)
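One way to picture why coarse rounding leaves the output untouched is to treat the rounded value purely as a lookup key while storing the unrounded coordinate. The following sketch assumes that approach for illustration; `dedupe` is a hypothetical function, and the z values are the six images computed from the truncated file.

```python
def dedupe(values, n_decimal_places):
    """Keep the first unrounded value for each rounded, wrapped key."""
    scale = 10**n_decimal_places
    seen = {}
    for v in values:
        key = round(v * scale) % scale  # used for comparison only
        seen.setdefault(key, v % 1.0)   # stored without rounding
    return list(seen.values())


zs = [0.3333, 0.66663333, 0.99996667, 0.33336667, 0.6667, 0.00003333]

print(len(dedupe(zs, 4)))                # 5
print(dedupe(zs, 3) == dedupe(zs, 1))    # True
print(dedupe(zs, 1))                     # [0.3333, 0.66663333, 0.99996667]
```

Rounding to one or three decimal places changes only which images are merged, not the digits of the values that survive.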
Symbolic Parsing
In some cases, particularly in structures with many atoms, careful tuning of numerical
precision is not enough to accurately reproduce a crystal. parsnip includes a
specialized parser that uses rational arithmetic to correctly compare fractions that
only differ by a few units in the last place. To enable this, install the sympy library
and set parse_mode="sympy" when building the unit cell.
>>> cif = CifFile("hP3.cif")
>>> symbolic = cif.build_unit_cell(n_decimal_places=4, parse_mode="sympy")
>>> symbolic
array([[0.2254 , 0. , 0.33333 ],
[0. , 0.2254 , 0.66666333],
[0.7746 , 0.7746 , 0.99999667]])
>>> assert symbolic.shape == correct_uc.shape
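The benefit of rational arithmetic can be illustrated without sympy. The sketch below swaps in Python's standard-library `fractions` module purely to show the general idea, and should not be read as a description of how parsnip implements parse_mode="sympy".

```python
from fractions import Fraction

# Binary floats: two routes to the same fractional coordinate differ in the last bit.
float_a = 1 - 2 / 3
float_b = 1 / 3
print(float_a == float_b)  # False

# Exact rationals: the same two routes compare equal.
exact_a = Fraction(1) - Fraction(2, 3)
exact_b = Fraction(1, 3)
print(exact_a == exact_b)  # True

# Decimal strings from a CIF convert to rationals with no binary round-off.
x = Fraction("0.22540")
print(x)  # 1127/5000
```

Exact comparison removes floating-point round-off from the evaluation of the symmetry operations, but it cannot recover digits that were never recorded in the CIF, so a sensible n_decimal_places is still needed.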