Converting XML to a pandas.DataFrame

Data analysis often involves calling APIs.

When an API returned its results as XML, I had some trouble handling it and lost time, so here is a note on how I did it.

input

As an example, suppose we have XML-like data such as the following.

# Example
response = """
<INFO>
    <RESULT>
        <STATUS>0</STATUS>
        <ERROR_MSG>SUCCESS</ERROR_MSG>
        <DATE>2019/09/29T06:21:15+9:00</DATE>
    </RESULT>
    <PARAMETER>
        <lang>ja</lang>
        <code>01</code>
    </PARAMETER>
</INFO>
"""

code

The following code converts it to a pd.DataFrame.

# parse xml
from lxml import etree
root = etree.fromstring(response)

# convert xml to dict
import xmljson
data_dict = xmljson.yahoo.data(root)

# convert dict to pd.DataFrame
# (pd.io.json.json_normalize is deprecated since pandas 1.0; use pd.json_normalize)
import pandas as pd
pd.json_normalize(data_dict)
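
To make the two conversion steps above less of a black box, here is a minimal sketch that imitates them using only the standard library. `element_to_dict` and `flatten` are hypothetical helpers written for illustration, not part of xmljson or pandas, and they assume each child tag appears at most once under its parent and that attributes can be ignored. The sample is a shortened copy of the input above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<INFO>
    <RESULT>
        <STATUS>0</STATUS>
        <ERROR_MSG>SUCCESS</ERROR_MSG>
    </RESULT>
    <PARAMETER>
        <lang>ja</lang>
        <code>01</code>
    </PARAMETER>
</INFO>
"""


def element_to_dict(elem):
    """Hypothetical helper: a leaf becomes its text, anything else a dict keyed by child tag."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    return {child.tag: element_to_dict(child) for child in children}


def flatten(nested, prefix=""):
    """Hypothetical helper: join nested keys with '.' so the result is one flat row."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


root = ET.fromstring(SAMPLE.strip())
nested = {root.tag: element_to_dict(root)}  # roughly the XML-to-dict step
flat_row = flatten(nested)                  # roughly the dict-to-columns step

for column, value in flat_row.items():
    print(column, "=", value)
```

Each key of `flat_row` corresponds to one column of a single-row table, which is the general shape to expect when the XML has no repeated elements. All values stay as strings in this sketch.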

output

The result looks like this.

(Image: screenshot of the resulting DataFrame)

The arguments to json_normalize() need to be adjusted depending on how deeply the original XML is nested.
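
As a rough illustration of what that adjustment can look like, the sketch below uses two `pd.json_normalize` parameters, `record_path` and `sep`. The input dict is hypothetical: it assumes the XML-to-dict step produced a list for a repeated `<ITEM>` element inside `<ITEMS>`, and neither tag appears in the sample above.

```python
import pandas as pd

# Hypothetical dict, shaped like what an XML-to-dict step might yield when
# <ITEM> repeats inside <ITEMS>; these tags are not in the article's sample.
data = {
    "INFO": {
        "RESULT": {"STATUS": "0"},
        "ITEMS": {
            "ITEM": [
                {"name": "a", "price": "100"},
                {"name": "b", "price": "200"},
            ]
        },
    }
}

# Default call: a single row, with the whole list kept in one cell.
one_row = pd.json_normalize(data)

# record_path walks down to the list and emits one row per list element.
items = pd.json_normalize(data, record_path=["INFO", "ITEMS", "ITEM"])

# sep changes the separator used when nested keys are joined into column names.
underscored = pd.json_normalize(data, sep="_")

print(one_row.columns.tolist())
print(items)
print(underscored.columns.tolist())
```

When the XML contains repeated elements, `record_path` is usually the first argument worth trying; for purely nested, non-repeating XML like the sample in this article, the default call may already be enough.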
