
Converting fixed-width data with Pandas read_fwf (seismic intensity data)

Posted at 2021-11-06

Reference

Processing Japan's earthquake data to make it easier to understand
https://qiita.com/T_programming/items/2dae8f40941ff3581036

Converting fixed-width data with read_fwf

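read_fwf counts a full-width character as one character, even though it occupies two columns in the fixed-width layout, so the fields after it shift.
The 震央地名 (epicenter name) field contains full-width characters, so it and everything after it are misaligned.
As a workaround, read everything from 震央地名 to the end of the line as a single column, then split it afterwards.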
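
The mismatch and the workaround can be illustrated on a synthetic tail field. This is a minimal sketch, not a real JMA record: the place name, the station count, and the flag value are made up, and the 22-byte name width is an assumption inferred from the 28-wide last column minus the 6 trailing characters that the program below slices off.

```python
# Synthetic example (not a real JMA record).
name = "石川県能登地方"  # 7 full-width characters

# Assumed layout of the tail: 22-byte name, 5-character count, 1-character flag.
encoded = name.encode("cp932")                # each full-width character is 2 bytes
field = encoded + b" " * (22 - len(encoded))  # pad the name to 22 bytes
tail_bytes = field + b"   12" + b"K"

tail = tail_bytes.decode("cp932")

print(len(tail_bytes))  # length in bytes
print(len(tail))        # length in characters, shorter than the byte length

# Because the character count varies with the name, split from the right,
# where the count and the flag always have a fixed number of characters.
station_count = int(tail[-6:-1])
flag = tail[-1]
place = tail[:-6].strip()

print(place, station_count, flag)
```

Counting from the right works because the last six characters are always half-width, whatever the length of the name in front of them.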

Download

!wget https://www.data.jma.go.jp/svd/eqev/data/bulletin/data/shindo/i2019.zip

Program

import pandas as pd

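# Column widths in characters (they sum to 96).
# The last 28-wide column captures everything from 震央地名 to the end of the line.
widths = [1, 4, 2, 2, 2, 2, 4, 4, 3, 4, 4, 4, 4, 4, 5, 3, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 3, 28]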

names = [
    "ヘッダー",
    "西暦",
    "月",
    "日",
    "時",
    "分",
    "秒",
    "標準誤差(秒)",
    "緯度(度)",
    "緯度(分)",
    "緯度標準誤差(分)",
    "経度(度)",
    "経度(分)",
    "経度標準誤差(分)",
    "深さ(km)",
    "標準誤差(km)",
    "マグニチュード１",
    "マグニチュード１種別",
    "マグニチュード２",
    "マグニチュード２種別",
    "使用走時表",
    "震源評価",
    "震源補助情報",
    "最大震度",
    "被害規模",
    "津波規模",
    "大地域区分番号",
    "小地域区分番号",
    "震央地名",
]

df = pd.read_fwf(
    "i2019.zip",
    encoding="cp932",
    header=None,
    widths=widths,
    names=names,
)

# Extraction: keep only the rows whose header is A, B, or D

df1 = df[df["ヘッダー"].isin(["A", "B", "D"])].copy().reset_index(drop=True)

# Split 観測点数 (number of stations) and 震源決定フラグ (hypocenter determination flag) off the end of 震央地名

df1["観測点数"] = pd.to_numeric(df1["震央地名"].str[-6:-1].str.strip()).astype("Int64")
df1["震源決定フラグ"] = df1["震央地名"].str[-1]
df1["震央地名"] = df1["震央地名"].str[:-6].str.strip()

# Unit adjustment: the raw values are scaled integers (hundredths, or tenths for magnitude)

df1["秒"] = df1["秒"].astype(float) / 100
df1["標準誤差(秒)"] = df1["標準誤差(秒)"].astype(float) / 100

df1["緯度(分)"] = df1["緯度(分)"].astype(float) / 100
df1["緯度標準誤差(分)"] = df1["緯度標準誤差(分)"].astype(float) / 100

df1["経度(分)"] = df1["経度(分)"].astype(float) / 100
df1["経度標準誤差(分)"] = df1["経度標準誤差(分)"].astype(float) / 100

df1["標準誤差(km)"] = df1["標準誤差(km)"].astype(float) / 100

df1["マグニチュード１"] = df1["マグニチュード１"].astype(float) / 10
df1["マグニチュード２"] = df1["マグニチュード２"].astype(float) / 10

# Date conversion: build a datetime column from the year, month, day, hour, minute, and second columns

df_date = (
    df1[["西暦", "月", "日", "時", "分", "秒"]]
    .copy()
    .set_axis(["year", "month", "day", "hour", "minute", "seconds"], axis=1)
)

df1["datetime"] = pd.to_datetime(df_date)

df1.to_csv("2019.csv", encoding="utf_8_sig")
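
As an alternative to splitting the tail column afterwards, a record could also be sliced by byte offsets before decoding, which sidesteps the character-count mismatch. The following is a minimal sketch of that idea, not what the program above does: `split_fixed_bytes` and the three-field sample layout are hypothetical and are not part of pandas or the JMA format.

```python
from itertools import accumulate


def split_fixed_bytes(line, widths, encoding="cp932"):
    """Slice one record by byte offsets, then decode and strip each field."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return [line[s:e].decode(encoding).strip() for s, e in zip(starts, ends)]


# Hypothetical layout for illustration: 1-byte header, 22-byte name, 5-byte count.
sample_widths = [1, 22, 5]
sample_name = "石川県能登地方".encode("cp932")
sample_line = b"A" + sample_name.ljust(22) + b"   12"

fields = split_fixed_bytes(sample_line, sample_widths)
print(fields)
```

This only works when every field boundary falls between characters. A width that cuts a two-byte character in half would raise a decoding error.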
