How I Convert AWS JSON Data to CSV

Posted at 2015-03-07

On projects that use AWS, I fairly often want to put things like a list of AMIs into a document.

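In those cases I fetch the data as JSON with the official AWS CLI, and then want to turn it into CSV or something similar before writing it up. The tabular data you usually see in documents is a really poor fit for JSON, which is a pain.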

When that happens, here is what I do.
Say, for example, that the JSON for the AMI list comes back looking like this.

images.json

{
    "Images": [
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_1", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }, 
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_2", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }
    ]
}
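
To see why this shape resists a table, it can help to flatten one record into dotted column names. The following is only a sketch: `flatten` is a hypothetical helper of my own naming, not part of sample.py, and the record is a trimmed copy of one entry from the sample above.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single {dotted.key: scalar} dict."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, prefix + key + "."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, prefix + str(index) + "."))
    else:
        # Scalars (str, int, bool, None) end the recursion; drop the trailing dot.
        flat[prefix[:-1]] = value
    return flat


# Trimmed copy of one entry from images.json (hypothetical subset of keys).
image = {
    "Name": "hogehoge_1",
    "ImageId": "ami-99999991",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 100, "Encrypted": False}}
    ],
    "Description": None,
}

flat_image = flatten(image)
for column, cell in sorted(flat_image.items()):
    print(column, "=", cell)
```

A single nested field turns into several columns such as `BlockDeviceMappings.0.Ebs.VolumeSize`, which is why a table ends up wide and awkward. The script below takes the simpler route of squeezing each list into one quoted cell.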

I convert this JSON to CSV with Python. I threw this together on the spot, so I'm sure there is plenty to pick at. Sorry. Really, sorry.

sample.py

# coding:UTF-8
import json
import codecs

# Variable definitions. Nothing fancy.
sourcefilename = "images.json"
outfilename = "outfile.csv"
targetDataName = "Images"
# Function definitions
def list2str(srclist, startStr, endStr):
    # Return anything that is not a list unchanged.
    if not isinstance(srclist, list):
        return srclist
    # Convert the list to a string: items are joined with commas
    # (no trailing comma) and wrapped in startStr / endStr.
    parts = []
    for item in srclist:
        if isinstance(item, list):
            # Nested lists are flattened recursively, without extra wrapping.
            parts.append(list2str(item, "", ""))
        else:
            parts.append(str(item))
    return startStr + ",".join(parts) + endStr

# Read the input file
with open(sourcefilename, "r") as sourceFile:
    sourceData = json.load(sourceFile)
# Pull the target data out of the JSON
targetData = sourceData.get(targetDataName)

# Collect the header: every key that appears in any row
headerSet = set()
for row in targetData:
    headerSet.update(row.keys())
headerstr = ""
for headerName in headerSet:
    headerstr += headerName + ","
# Header collection done

# Build the data rows
datalist = list()
for row in targetData:
    rowstr = ""
    for colName in headerSet:
        value = row.get(colName)
        if isinstance(value, list):
            # Lists go into a single double-quoted cell.
            rowstr += list2str(value, "\"", "\"") + ","
        elif value is None:
            rowstr += "None,"
        else:
            # Strings, booleans, numbers and any other type are formatted as-is,
            # so every column always gets a cell and rows stay aligned.
            rowstr += "%s," % (value,)
    datalist.append(rowstr)

# Open the output file (Shift_JIS) and write the header, then each row
with codecs.open(outfilename, "w", "shift_jis") as outfile:
    outfile.write(headerstr + "\n")
    for rowstr in datalist:
        outfile.write(rowstr + "\n")

This gives you the CSV data.
Once it is in CSV, you can open it in Excel and edit it however you like.
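
For comparison, here is a rough sketch of the same conversion using the standard `csv` module, which takes care of quoting commas and double quotes on its own. It assumes Python 3. The function name `images_to_csv` and the choice to store nested values as JSON text via `json.dumps` are my own illustration, not what sample.py does.

```python
import csv
import io
import json


def images_to_csv(records):
    """Return CSV text with one row per record and one column per key."""
    # Collect every key that appears in any record; sort for a stable column order.
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        row = {}
        for key, value in record.items():
            # Nested lists/dicts are kept as JSON text inside a single cell.
            if isinstance(value, (list, dict)):
                row[key] = json.dumps(value)
            else:
                row[key] = value
        writer.writerow(row)
    return buffer.getvalue()


# Two trimmed, hypothetical records in the shape of the "Images" list.
sample = [
    {
        "Name": "hogehoge_1",
        "Public": False,
        "BlockDeviceMappings": [{"DeviceName": "/dev/xvda"}],
    },
    {"Name": "hogehoge_2", "Public": False, "Description": None},
]
csv_text = images_to_csv(sample)
print(csv_text)
```

In this sketch `None` and missing keys both end up as empty cells rather than the literal text `None`. To get a file for Excel, the returned text could be written out with the same Shift_JIS encoding that sample.py uses.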

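Until now I didn't know Python, so I had been writing tools like this in Java, but Python is really convenient.
The AWS CLI requires Python, so there is no need to install Python separately either.
I think I might start studying Python seriously.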

I hope this is useful.
