
Comparing Python's json.loads and xmltodict.parse

Posted at 2018-09-15. Last updated at 2018-09-15.

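Introduction

At work we decided to play around with Python as a breather, and I found myself writing a console application for the first time in a while. One thing left me slightly puzzled, so I am noting it here.

The environment is built with Docker, so this may also serve as a reference for anyone who wants to spin up an environment to practice writing console apps in Python.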

Prerequisites

Docker (Mac at work, Windows at home; no differences in behavior seen so far)
Python 3

The repository is here:
https://github.com/ymstshinichiro/python-docker

The file layout is as follows. (Apologies if it is hard to read.)

- docker-compose.yml
│
├─docker
│   - Dockerfile
├─app
│   - app.py
├─json
│   - data.json
└─xml
    - data.xml

The Docker setup looks like this. Simple.

Dockerfile

FROM python:3
RUN pip install xmltodict

docker-compose.yml

version: '2'
services:
  app:
    container_name: python_app
    build: 
      context: ./docker/
      dockerfile: Dockerfile
    volumes:
      - ./app:/var/www/app
      - ./json:/var/www/json
      - ./xml:/var/www/xml

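docker-compose.yml sits in the project root, so cd into that directory in a console and build with the commands below. Once the build finishes, use run to get a shell inside the container.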

$ docker-compose build
$ docker-compose run app bash

I wanted to parse JSON and XML

At first I was parsing employee data written in JSON, like the following, and doing this and that with it.

data.json

{
"employees": [
        {
        "name": "aaa",
        "age": "10",
        "gender": "male"
        },
        {
        "name": "bbb",
        "age": "20",
        "gender": "male"
        },
        {
        "name": "ccc",
        "age": "30",
        "gender": "female"
        }
    ]
}

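That worked without any problem. But when we decided to try the same data in XML, rewriting the data straight into XML did not parse. Once I wrapped everything in a single outermost element called root, it parsed fine.
That data is below.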

data.xml

<?xml version="1.0" encoding="UTF-8" ?>
<root>
    <employees>
        <name>aaa</name>
        <age>10</age>
        <gender>male</gender>
    </employees>
    <employees>
        <name>bbb</name>
        <age>20</age>
        <gender>male</gender>
    </employees>
    <employees>
        <name>ccc</name>
        <age>30</age>
        <gender>female</gender>
    </employees>
</root>
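The reason the wrapper is needed is that a well-formed XML document must have exactly one root element, whereas in the JSON file the outer `{ ... }` object already plays that role. The sketch below illustrates this with the standard library's `xml.etree.ElementTree` rather than xmltodict (a swapped-in parser; as far as I know both sit on top of expat, so they should reject the same input). `FRAGMENT` and `is_well_formed` are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Two sibling <employees> elements with no enclosing element.
# This mirrors the JSON data rewritten "as is" and is not well-formed XML.
FRAGMENT = (
    "<employees><name>aaa</name></employees>"
    "<employees><name>bbb</name></employees>"
)


def is_well_formed(text):
    """Return True if text parses as a single XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


without_root = is_well_formed(FRAGMENT)
with_root = is_well_formed("<root>" + FRAGMENT + "</root>")
print(without_root, with_root)
```

The name `root` itself carries no meaning; any single enclosing element would do.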

The code below simply prints the parsed data as a test.

app.py


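import json
import xmltodict

def main():
    # The relative paths assume the script is run from /var/www/app in the container.
    json_data = json.loads(file_read('../json/data.json'))
    xml_data = xmltodict.parse(file_read('../xml/data.xml'))

    print_with_header('-- json_data ---- ', json_data)
    print_with_header('-- xml_data ---- ', xml_data)

    # The XML result has one extra level of nesting: the <root> element.
    print_foreach_with_header('-- json_data["employees"] -- ',  json_data['employees'], 'name')
    print_foreach_with_header('-- xml_data["root"]["employees"] -- ',  xml_data['root']['employees'], 'name')

def file_read(path):
    # Read the whole file as text; UTF-8 matches the encoding declared in data.xml.
    with open(path, 'r', encoding='utf-8') as file:
        text = file.read()
    return text

def print_with_header(headertext, printobject):
    # Print a header line, the object itself, then a blank line.
    print(headertext)
    print(printobject)
    print()

def print_foreach_with_header(headertext, printobjects, printlabel):
    # Print a header line, then one field (printlabel) from each record.
    print(headertext)

    for obj in printobjects:
        print(obj[printlabel])

if __name__ == '__main__':
    main()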

Here is the output.

-- json_data ----
{'employees': [{'name': 'aaa', 'age': '10', 'gender': 'male'}, {'name': 'bbb', 'age': '20',
'gender': 'male'}, {'name': 'ccc', 'age': '30', 'gender': 'female'}]}

-- xml_data ----
OrderedDict([('root', OrderedDict([('employees', [OrderedDict([('name', 'aaa'), ('age', '10'), ('gender', 'male')]), OrderedDict([('name', 'bbb'), ('age', '20'), ('gender', 'male')]),
OrderedDict([('name', 'ccc'), ('age', '30'), ('gender', 'female')])])]))])

-- json_data["employees"] --
aaa
bbb
ccc
-- xml_data["root"]["employees"] --
aaa
bbb
ccc
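As a side note on the file handling in app.py: `json.loads` takes a string, which is why the script reads the file first, but `json.load` accepts an open file object directly. The relative paths also only resolve when the script is started from the app directory. The following is a minimal sketch of both points, assuming a throwaway temporary directory in place of the real project; `load_json` and `data_path` are hypothetical helpers, not part of the repository.

```python
import json
import tempfile
from pathlib import Path


def load_json(path):
    """Parse a JSON file by handing the open file object to json.load."""
    with open(path, "r", encoding="utf-8") as file:
        return json.load(file)


def data_path(base_dir, *parts):
    """Build a path under base_dir so it does not depend on the current directory."""
    return Path(base_dir).joinpath(*parts)


# A temporary directory stands in for the project root here.
with tempfile.TemporaryDirectory() as tmp:
    json_dir = Path(tmp) / "json"
    json_dir.mkdir()
    (json_dir / "data.json").write_text(
        '{"employees": [{"name": "aaa"}]}', encoding="utf-8"
    )
    loaded = load_json(data_path(tmp, "json", "data.json"))

print(loaded["employees"][0]["name"])
```

In app.py itself, something like `Path(__file__).resolve().parent.parent` could serve as `base_dir`, which would make the script runnable from any working directory.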

Having to add root just so the XML can be parsed feels rather clunky to me. Is there a better way to handle this?
If anyone knows, I would appreciate your advice.
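One possible direction for the question above, offered as a sketch rather than a definitive answer: the root element cannot be dropped from the file, but the code can hide it so that callers see the same shape as the JSON data. The example below uses the standard library's `xml.etree.ElementTree` instead of xmltodict (a swapped-in technique), and `employees_from_xml` is a hypothetical helper name. It never refers to the root element by name, so the wrapper could be called anything.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<root>
    <employees><name>aaa</name><age>10</age><gender>male</gender></employees>
    <employees><name>bbb</name><age>20</age><gender>male</gender></employees>
</root>"""


def employees_from_xml(text, tag="employees"):
    """Return {tag: [record, ...]} regardless of what the root element is called."""
    root = ET.fromstring(text)
    records = [
        {child.tag: child.text for child in element}  # e.g. {"name": "aaa", ...}
        for element in root.findall(tag)               # direct children of the root
    ]
    return {tag: records}


data = employees_from_xml(XML_TEXT)
for employee in data["employees"]:
    print(employee["name"])
```

With xmltodict itself, indexing `['root']` once right after parsing and passing the inner mapping around should have a similar effect. One thing to watch for there, as far as I know, is that a file with only a single `<employees>` element yields a single mapping rather than a list, whereas the sketch above always returns a list.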
