I tried automatically extracting crystallographic data from a crystal-structure CIF file and putting it in a table

Posted at 2025-04-29

Automatically extracting crystallographic data from a crystal-structure CIF file

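In science there is a standard format for crystal-structure data, and a file in that format is called a CIF (Crystallographic Information File). A CIF contains all sorts of information, and I only want to pull out the important crystallographic data. Automating that is what this article is about.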

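Python has several modules for handling CIF data; the well-known ones are probably RDKit, pymatgen, and Open Babel. My impression, though, is that each has its quirks and none of them quite scratches the itch. Since all I want here is to pull out data, I skip the modules and read the CIF text directly. CIF has a fixed format, for example "the length of the $a$ axis is written after "_cell_length_a"", so the script just looks for those tags and reads what follows. Old CIF files pulled from databases sometimes have a slightly broken format, so some items may be read incorrectly or go missing, but it should mostly work. The output is a CSV file.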
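
As a rough illustration of the "tag followed by value" idea, here is a minimal sketch of reading one such line. The helper names `parse_tag_line` and `strip_su` are hypothetical and are not part of the script below; the sketch also assumes the value sits on the same line as the tag, which is not always the case in real files.

```python
import re

def parse_tag_line(line):
    """Split a single-line CIF item into (tag, value); return None otherwise."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2 or not parts[0].startswith("_"):
        return None
    tag, value = parts
    value = value.strip()
    # Drop matching surrounding quotes, e.g. 'P 21/c' -> P 21/c
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    return tag, value

def strip_su(value):
    """Remove a trailing standard uncertainty: '10.1234(5)' -> '10.1234'."""
    return re.sub(r"\(\d+\)$", "", value)

example = "_cell_length_a    10.1234(5)\n"
tag, value = parse_tag_line(example)
print(tag, value, strip_su(value))
```

Unlike this sketch, the script in this article keeps the uncertainty in parentheses and removes every space from the value.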

Script

Change "filename.cif" on line 4 to the path of the CIF file you actually want to read, and the script will run.

import csv
import numpy as np

file = "filename.cif"

crystaloglaph_deta = [["name"],
                      ["Chemical formula"],
                      ["Formula mass"],
                      ["Crystal system"],
                      ["a"],
                      ["b"],
                      ["c"],
                      ["alpha"],
                      ["beta"],
                      ["gamma"],
                      ["Unit cell volume"],
                      ["Temperature"],
                      ["Space group"],
                      ["Z"],
                      ["Rint"],
                      ["Final R1 values (I>2sigma(I))"],
                      ["Final wR(F2) values (I>2sigma(I))"],
                      ["Final R1 values (all data)"],
                      ["Final wR(F2) values (all data)"],
                      ["Goodness of fit on F2"],
                      ["No. of reflections measured"],
                      ["No. of independent reflections"],
                      ]

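# Base name of the input file without the ".cif" extension (also used as the CSV name).
# Backslashes are normalized so that Windows-style paths work too.
path = file.replace('\\', '/').split('/')
name_head = path[-1][:-4]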
        
with open(file, "r") as f:
    lines = f.readlines()
    
# num[k] counts how many values were found for row k
num = np.zeros(len(crystaloglaph_deta))

# Row 0 holds the file name without its extension
crystaloglaph_deta[0].append(name_head)
num[0] = num[0] + 1

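# Scan every line and pick up the value that follows each known tag
for j in range(len(lines)):
    if "_chemical_formula_sum" in lines[j]:
        # The formula may follow the tag on the same line or sit on the next line
        line = lines[j].replace('_chemical_formula_sum', '')
        if line.strip() == '' and j + 1 < len(lines):
            line = lines[j+1]
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        line = line.replace("'", '')
        crystaloglaph_deta[1].append(line)
        num[1] = num[1] + 1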

    if "_chemical_formula_weight" in lines[j]:
        line = lines[j].replace('_chemical_formula_weight', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[2].append(line)
        num[2] = num[2] + 1

    if "_space_group_crystal_system" in lines[j]:
        line = lines[j].replace('_space_group_crystal_system', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[3].append(line)
        num[3] = num[3] + 1

    if "_cell_length_a" in lines[j]:
        line = lines[j].replace('_cell_length_a', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[4].append(line)
        num[4] = num[4] + 1

    if "_cell_length_b" in lines[j]:
        line = lines[j].replace('_cell_length_b', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[5].append(line)
        num[5] = num[5] + 1

    if "_cell_length_c" in lines[j]:
        line = lines[j].replace('_cell_length_c', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[6].append(line)
        num[6] = num[6] + 1

    if "_cell_angle_alpha" in lines[j]:
        line = lines[j].replace('_cell_angle_alpha', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[7].append(line)
        num[7] = num[7] + 1

    if "_cell_angle_beta" in lines[j]:
        line = lines[j].replace('_cell_angle_beta', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[8].append(line)            
        num[8] = num[8] + 1

    if "_cell_angle_gamma" in lines[j]:
        line = lines[j].replace('_cell_angle_gamma', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[9].append(line)
        num[9] = num[9] + 1

    if "_cell_volume" in lines[j]:
        line = lines[j].replace('_cell_volume', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[10].append(line)
        num[10] = num[10] + 1

    if "_diffrn_ambient_temperature" in lines[j]:
        line = lines[j].replace('_diffrn_ambient_temperature', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[11].append(line)
        num[11] = num[11] + 1

    if "_space_group_name_H-M_alt" in lines[j]:
        line = lines[j].replace('_space_group_name_H-M_alt', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        line = line.replace("'", '')
        crystaloglaph_deta[12].append(line)
        num[12] = num[12] + 1

    if "_cell_formula_units_Z " in lines[j]:
        line = lines[j].replace('_cell_formula_units_Z ', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[13].append(line)
        num[13] = num[13] + 1

    if "_diffrn_reflns_av_R_equivalents" in lines[j]:
        line = lines[j].replace('_diffrn_reflns_av_R_equivalents', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[14].append(line)
        num[14] = num[14] + 1

    if "_refine_ls_R_factor_gt" in lines[j]:
        line = lines[j].replace('_refine_ls_R_factor_gt', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[15].append(line)
        num[15] = num[15] + 1

    if "_refine_ls_wR_factor_gt" in lines[j]:
        line = lines[j].replace('_refine_ls_wR_factor_gt', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[16].append(line)
        num[16] = num[16] + 1

    if "_refine_ls_R_factor_all" in lines[j]:
        line = lines[j].replace('_refine_ls_R_factor_all', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[17].append(line)
        num[17] = num[17] + 1

    if "_refine_ls_wR_factor_ref" in lines[j]:
        line = lines[j].replace('_refine_ls_wR_factor_ref', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[18].append(line)
        num[18] = num[18] + 1

    if "_refine_ls_goodness_of_fit_ref" in lines[j]:
        line = lines[j].replace('_refine_ls_goodness_of_fit_ref', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[19].append(line)
        num[19] = num[19] + 1

    if "_diffrn_reflns_number" in lines[j]:
        line = lines[j].replace('_diffrn_reflns_number', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[20].append(line)
        num[20] = num[20] + 1

    if "_refine_ls_number_reflns" in lines[j]:
        line = lines[j].replace('_refine_ls_number_reflns', '')
        line = line.replace(' ', '')
        line = line.replace('\n', '')
        crystaloglaph_deta[21].append(line)
        num[21] = num[21] + 1
        
# Fill rows for which no value was found with "-"
for k in range(len(crystaloglaph_deta)):
    if num[k] == 0:
        crystaloglaph_deta[k].append("-")

# Write one CSV row per item: the label first, then the value(s) found
with open(name_head + '.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(crystaloglaph_deta)
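
The repeated `if` blocks above could also be driven by a small lookup table, which makes it easy to fall back to the older tag names that legacy CIF files tend to use. The following is only a sketch under assumptions: `FIELDS` and `extract_fields` are hypothetical names, only four of the items are listed, and values written on the line after the tag are not handled.

```python
# Hypothetical table: output label -> CIF tags to try, in order of preference.
# The second entries are the older names from the CIF core dictionary.
FIELDS = {
    "Crystal system": ["_space_group_crystal_system", "_symmetry_cell_setting"],
    "Space group": ["_space_group_name_H-M_alt", "_symmetry_space_group_name_H-M"],
    "a": ["_cell_length_a"],
    "Z": ["_cell_formula_units_Z"],
}

def extract_fields(lines, fields=FIELDS):
    """Return {label: value}, using "-" when none of the tags is found."""
    found = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            # Keep the first occurrence of each tag; drop quotes and spaces
            value = parts[1].strip().strip("'\"").replace(" ", "")
            found.setdefault(parts[0], value)
    result = {}
    for label, tags in fields.items():
        result[label] = next((found[t] for t in tags if t in found), "-")
    return result

sample = [
    "_symmetry_cell_setting            monoclinic\n",
    "_symmetry_space_group_name_H-M    'P 21/c'\n",
    "_cell_length_a                    10.1234(5)\n",
]
print(extract_fields(sample))
```

Because the tag is compared as a whole word rather than as a substring, a line is only picked up when its tag matches exactly.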

It's short and simple, but that's all.
