
Extracting only the lines that contain a specific string, then converting them to a DataFrame and plotting

Posted at 2022-02-26

This is a memo on how to extract only the needed parts of log data obtained from an experiment and plot them.

import pandas as pd
import japanize_matplotlib  # prevents garbled Japanese text in plots

path = 'log.txt'

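# Get the contents of the text file as a list.
# Open the file and use readlines() to get a list whose elements are the lines.
with open(path) as f:
    lines = f.readlines()
# The list returned by readlines() keeps the trailing newline \n on each line.
# To remove it, apply strip() to each element (each line) in a list comprehension.
# Note that strip() also removes any leading and trailing spaces.
lines_strip = [line.strip() for line in lines]
print(lines_strip)
# The lines are displayed as a list.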


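# From the list of lines, use a list comprehension to extract only the lines that satisfy a condition.
# Here, extract the lines that contain 'Temp'.
Temp = [line for line in lines_strip if 'Temp' in line]
print(Temp)
# Only the lines containing 'Temp' are displayed as a list.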

# Convert the list to a DataFrame.
df = pd.DataFrame(Temp)
# Write the DataFrame out to a text file.
df.to_csv("test_pd.txt", header=False, index=False)
# The data to analyze appears at fixed positions (fixed-width format), so use read_fwf.
df = pd.read_fwf(
    'test_pd.txt',
    colspecs=[(2, 30), (48, 51), (56, 59), (64, 67), (72, 75), (80, 83)],  # column boundaries (character positions)
    header=None,
    names=['日時', '温度(00)', '温度(01)', '温度(02)', '温度(03)', '温度(04)'],  # 日時 = date/time, 温度 = temperature
    parse_dates=['日時'])

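df[['温度(00)', '温度(01)', '温度(02)', '温度(03)', '温度(04)']] /= 10  # divide by 10, as required by the data used here
df.set_index("日時", inplace=True)  # use the date/time column as the index
df
# The DataFrame containing only the needed data is complete.

# Plot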
df.plot()
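
As a possible variation, the intermediate file test_pd.txt can be skipped by passing the filtered lines to read_fwf through an in-memory io.StringIO buffer. The sketch below is a minimal, self-contained illustration under stated assumptions: the sample log lines, their column layout, the colspecs values, and the names sample_log, temp_lines, and df_sketch are all made up for the example and do not reflect the actual format of log.txt.

```python
import io

import pandas as pd

# Hypothetical log contents; the real layout of log.txt is not shown in this article.
sample_log = """\
2022-02-26 10:00:00 Temp 253 251 249
2022-02-26 10:00:00 Info fan on
2022-02-26 10:01:00 Temp 254 252 250
"""

# Keep only the lines that contain 'Temp' (same idea as the list comprehension above).
temp_lines = [line for line in sample_log.splitlines() if 'Temp' in line]

# Hand the filtered lines to read_fwf through an in-memory buffer instead of a file.
buffer = io.StringIO("\n".join(temp_lines))
temp_cols = ['温度(00)', '温度(01)', '温度(02)']
df_sketch = pd.read_fwf(
    buffer,
    colspecs=[(0, 19), (25, 28), (29, 32), (33, 36)],  # assumed positions for the sample lines
    header=None,
    names=['日時'] + temp_cols,
)

# Convert the timestamp column, rescale the readings, and index by time.
df_sketch['日時'] = pd.to_datetime(df_sketch['日時'])
df_sketch[temp_cols] = df_sketch[temp_cols] / 10
df_sketch = df_sketch.set_index('日時')
print(df_sketch)
```

For a real log, the colspecs would need to be adjusted to the actual character positions, as in the read_fwf call above. When running as a script rather than in a notebook, the plot from df_sketch.plot() generally needs matplotlib.pyplot.show() or savefig() to become visible.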

Reference used for this article:
https://sohatach.hatenablog.jp/entry/2016/09/29/000547
