More than 5 years have passed since last update.

PythonでExcelのグラフ、画像挿入などをする。Openpyxlの使用例

Last updated at 2018-10-23Posted at 2018-10-17

PythonでExcelファイルをいじる

Pythonでデータ分析などをしたときに最終的に(偉い人や非エンジニア向け)にExcelで出す必要があることも多いかと思います。
そのときに、openpyxlを使うと便利でした。

openpyxlほか必要なパッケージのインストール

Anaconda(Miniconda)を使って、必要なパッケージをインストールしました。
インストールしたパッケージ

python ver.3.6.5
openpyxl ver.2.5.8
pandas ver.0.23.4
numpy ver.1.15.1
opencv ver.3.4.2
xlrd ver.1.1.0
pillow ver.5.3.0

例えば、以下のようにAnaconda PromptからAnacondaの仮想環境を作ります。

conda create -n excel python=3.6.5
conda activate excel
conda install openpyxl pandas numpy opencv xlrd pillow

csvファイルから必要な項目の抽出

csvファイル(カンマ区切りのテキストファイル)が元データだったとき、データを読み出したり、最終出力にいたるまでの解析をするにはpandasを使うのが便利です。
下のようにcsvファイルを読み出し、DataFrameという形式で保存します。
engine='python'はファイル名のパスに日本語を含む場合に必要になります。(デフォルトは'c'となっています。)

import pandas as pd

inputFilePath = 'C:\test.csv'  # 入力ファイルのパスを指定してください
inputDf = pd.read_csv(inputFilePath, engine='python', encoding='utf-8')

test.csvが以下のようなファイルであったとき、Stateがalertのデータのみ取り出すにはこのようにします。

extractDf = inputDf[inputDf['State'] == 'alert']

また、項目がID, Frame, State, Height, Distのみにするには、以下のようにします。


extractList = ['ID', 'Frame', 'State', 'Height', 'Dist']
for name in extractList:
    extractDf2[name] = inputDf[name]

PandasのDataFrameをExcelで出力するには以下のようにします。


outputFilePath = 'C:\extract.xlsx' # 出力したいファイルパスを指定してください
outputSheet1Name = 'extract1' # 出力したシート名を記入してください
outputSheet2Name = 'extract2'　# 出力したシート名を記入してください

writer = pd.ExcelWriter(outputFilePath)
extractDf.to_excel(writer, sheet_name=outputSheet1Name, index=True)
extractDf2.to_excel(writer, sheet_name=outputSheet2Name, index=True)
writer.save()

(応用) pandasの条件を満たすデータの抽出方法

Stateがalert, alert2, alert3のいずれかに達する場合に、抽出対象として、その状態に至ったIDをもつデータを抽出する場合。

detDf = inputDf.query('State == "alert" or State == "alert2" or State == "alert3"')
detID = detDf['ID'].unique()
outputDf = inputDf[inputDf['ID'].isin(detID)]

列の幅を調整する

データ属性の項目が1行目に書かれて、Stateの項目について列幅を15に調整する場合。
ここでポイントは、列番号からExcelの列の位置を表すアルファベットを取得するため、openpyxl.utils.get_column_letter()を使っていることです。

import openpyxl
from openpyxl.utils import get_column_letter
outputFilePath = 'C:\extract.xlsx'
wb = openpyxl.load_workbook(outputFilePath)
sheetNames = wb.sheetnames
for sh in sheetNames:
    for c in range(1, 256):
        ws = wb[sh]
        if ws.cell(row=1, column=c).value=='State':
            ws.column_dimensions[get_column_letter(c)].width = 15

動画ファイルからフレームを指定して貼り付ける(画像の貼り付け)

OpenCVを使って動画を読み込み、任意のフレームの画像をexcelに貼り付けます。

import cv2
from opencv_japanese import imread, imwrite
import openpyxl
from openpyxl.utils import get_column_letter
from openpyxl.drawing.image import Image as excImage

aviFilePath = 'C:\movie.avi'
# openCVを使った動画の読み込み
cap = cv2.VideoCapture(aviFilePath)

# 作業用に画像を出力するフォルダを作成する
imgPath = 'C:\tmpImg'
try:
    os.makedirs(imgPath, mode=0o777)
except FileExistsError:
    pass

# 該当するフレームの位置に動画ポジションを設定する
frameNo = 50　　# 例として50フレーム目を指定している
cap.set(cv2.CAP_PROP_POS_FRAMES, frameNo)
ret, frame = cap.read()
# 作業用に画像を出力する
dirPath, fileExtName = os.path.split(aviFilePath)
fileName, extName = os.path.splitext(fileExtName)
imgName = fileName + '_frame' + str(frameNo) + '.png'
imgOutputPath = os.path.join(imgPath, imgName)

# 貼り付け画像はサイズを縮小する
imgHeight, imgWidth = frame.shape[:2]
pasteSize = (int(imgWidth/2), int(imgHeight/2))
pasteImg = cv2.resize(frame, pasteSize)

# 抽出したフレームを画像ファイルとして作業用フォルダに保存する  
imwrite(imgOutputPath, pasteImg)

# Excelシートに画像を貼り付ける
outputFilePath = 'C:\extract.xlsx' # 出力したいExcelファイル名を指定してください
wb = openpyxl.load_workbook(outputFilePath)
sheetNames = wb.sheetnames
ws = wb[sheetNames[0]]  # ここでは例として1つ目のシートを取得しています
picInsertRow = 2
picInserrtCol = 10
# openpyxl.drawing.imageのImageを使って画像を取得する(内部ではPillowを使っているよう)
img = excImage(imgOutputPath)
inColLetter = get_column_letter(picInsertCol)
# 画像を貼り付ける位置をExcel形式で表す(列のアルファベット+行の番号)
inCellLetter = inColLetter + str(picInsertRow)
ws.add_image(img, inCellLetter)

wb.save(outputFilePath)

cv2のimreadとimwriteは日本語を含むファイルパスを読み込めないので、日本語に対応する関数を作りました。

opencv_japanese.py

import numpy as np
import cv2
import os

def imread(fileName, flags=cv2.IMREAD_COLOR, dtype=np.uint8):
    try:
        n = np.fromfile(fileName, dtype)
        img = cv2.imdecode(n, flags)
        return img
    except Exception as e:
        print(e)
        return None


def imwrite(fileName, img, params=None):
    try:
        ext = os.path.splitext(fileName)[1]
        result, n = cv2.imencode(ext, img, params)

        if result:
            with open(fileName, mode='w+b') as f:
                n.tofile(f)
            return True
        else:
            return False

    except Exception as e:
        print(e)
        return False

グラフを作成する

時系列折れ線グラフ(散布図)の作成(y軸を左右に持つ)

import openpyxl
from openpyxl.utils import get_column_letter
from openpyxl.styles import Color, PatternFill, Font, Border

# Excelファイルを開きます
outputFilePath = 'C:\extract.xlsx' # 出力したいExcelファイル名を指定してください
wb = openpyxl.load_workbook(outputFilePath)
sheetNames = wb.sheetnames
ws = wb[sheetNames[0]]  # ここでは例として1つ目のシートを取得しています

# 出力するグラフのサイズ
chtWidth = 24
chtHeight = 12

# 1つ目の散布図を用意します
cht1 = openpyxl.chart.ScatterChart()
cht1.y_axis.axId = 200
cht1.y_axis.title = 'Height'
cht1.height = chtHeight
cht1.width = chtWidth

# 2つ目の散布図を用意します
cht2 = openpyxl.chart.ScatterChart()
cht2.title = '検知物の時系列変化'
cht2.x_axis.title = 'Frame'
cht2.y_axis.title = "位置、移動距離"
cht2.y_axis.majorGridLines = None
cht2.height = chtHeight
cht2.width = chtWidth

# グラフ化するデータを参照する
startRow = 2
endRow = 28

# Excelのデータ参照した変数を用意する
frameValues = openpyxl.chart.Reference(ws, min_col=2, min_row=startRow, max_row=endRow)
heightValues = openpyxl.chart.Reference(ws, min_col=5, min_row=startRow, max_row=endRow)
wPosXValues = openpyxl.chart.Reference(ws, min_col=7, min_row=startRow, max_row=endRow)
wPosYValues = openpyxl.chart.Reference(ws, min_col=8, min_row=startRow, max_row=endRow)
moveDistValues = openpyxl.chart.Reference(ws, min_col=6, min_row=startRow, max_row=endRow)

# 系列を用意し、データ参照を入力する 最初の行を凡例とする
s1 = openpyxl.chart.Series(heightValues, frameValues, title_from_data=False, title=analysisItemList[3])
s2 = openpyxl.chart.Series(wPosXValues, frameValues, title_from_data=False, title=analysisItemList[5])
s3 = openpyxl.chart.Series(wPosYValues, frameValues, title_from_data=False, title=analysisItemList[6])
s4 = openpyxl.chart.Series(moveDistValues, frameValues, title_from_data=False, title=analysisItemList[7])

# グラフの書式設定をする
s1.graphicalProperties.line.solidFill = "4f81bd" # グラフの線の色
s1.marker.symbol = "diamond" # 各データ点のマーカーの形状
s1.marker.graphicalProperties.solidFill = "4f81bd" # 各データ点のマーカーの塗りつぶし色
s1.marker.graphicalProperties.line.solidFill = "4f81bd"　# 各データ点のマーカーの枠の色

s2.graphicalProperties.line.solidFill = "8064a2"
s2.marker.symbol = "triangle"
s2.marker.graphicalProperties.solidFill = "8064a2"
s2.marker.graphicalProperties.line.solidFill = "8064a2"

s3.graphicalProperties.line.solidFill = "9bbb59"
s3.marker.symbol = "triangle"
s3.marker.graphicalProperties.solidFill = "9bbb59"
s3.marker.graphicalProperties.line.solidFill = "9bbb59"

s4.graphicalProperties.line.solidFill = "c0504d"
s4.marker.symbol = "x"
s4.marker.graphicalProperties.solidFill = "c0504d"
s4.marker.graphicalProperties.line.solidFill = "c0504d"


# Chartに系列を追加する
# 1つ目のグラフに系列(s1)を、2つ目のグラフに3つの系列(s2, s3, s4)を追加している。
cht1.series.append(s1)
cht2.series.append(s2)
cht2.series.append(s3)
cht2.series.append(s4)

# y軸を2軸もつグラフに設定する(グラフを足し合わせる)
# 2つ目のグラフのy軸を右側に設定する
cht2.y_axis.crosses = "max"
cht2 += cht1

# Excelシートにグラフを貼り付ける
graphInsertCol = 12 # グラフを挿入する列番号
inColLetter = get_column_letter(graphInsertCol)
inRow = 2 # グラフを挿入する行番号
inCellLetter = inColLetter + str(inRow) # グラフを挿入するセルの位置をExcel形式で作る
ws.add_chart(cht2, inCellLetter)

出力イメージ

Excelの色の指定は16進数の数値です。RGB値からの変換は下ページなどを参照してください。
https://www.peko-step.com/tool/tfcolor.html

条件付書式の設定

from openpyxl.formatting.rule import CellIsRule

# 個数により、セル、文字を色付けする。
# Low: 6個以上のとき、赤く塗る
NumTh = [6] # リスト形式で与える必要があります

# データ範囲を指定
LowNumCol = 9
excelStartRow = 2
excelEndRow = 28

# 条件を見たすセルの塗りつぶしと文字を色づける色を指定する
redFill = PatternFill(bgColor='ffc7ce')
redFont = Font(color='9c0006')

LowNumColLetter = get_column_letter(LowNumCol)
# Excel形式でデータ範囲をあらわす(例：I2:I28)
targetRange = LowNumColLetter + str(excelStartRow) + ':' + LowNumColLetter + str(excelEndRow)
ws.conditional_formatting.add(targetRange, CellIsRule(operator='greaterThanOrEqual', formula=NumTh, stopIfTrue=True,
                                                          fill=redFill, font=redFont))

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up