More than 1 year has passed since last update.

openpyxlのオートフィルタ設定のメモ

Last updated at 2023-02-05Posted at 2023-01-25

背景

先日、複数のエクセル出力したテーブルに対して、「オートフィルタを設定して、空白のみ除外したい」といった状況になりまして、Pythonで自動化してみました。VBAでやるのが簡単なのかもしれませんが、個人的に最近Pythonをよく使うようになったので、オートフィルタ設定を簡単にできる関数の作成にトライしました。今回まとめたことを自分用にメモしておきます。

オートフィルタの設定方法

まずはopenpyxlを使ってオートフィルタの設定をするために必要となるAPIの使い方についてまとめます。APIの詳細は公式を参照して下さい。(英文ですので、適宜翻訳して下さい)

オートフィルタを行うために、今回は大きく次の2つの設定を利用しました。

オートフィルタの範囲指定
フィルタの絞り込み

公式によると他にもソートを行う設定があるようですが、本記事では扱いません。

それでは一つづつ、使い方を示します。

オートフィルタの範囲指定

オートフィルタの範囲指定は、ws.auto_filter.refにセルの範囲を指定することで行います。

(例)

import openpyxl

if __name__ == '__main__':
    filename = 'エクセルファイル名'
    sheetname = 'シート名'
    wb = openpyxl.load_workbook(filename)
    ws = wb[sheetname]
    ws.auto_filter.ref = 'A1:F10'  # ← オートフィルタ範囲の設定

以降で示す今回作った関数では、上記のように範囲を直接指定することに加え、自動で全範囲を設定する対策も入れました。

フィルタの絞り込み

オートフィルタの絞り込みを設定するには、filterColumnの設定を行います。FilterColumnオブジェクトを用いてフィルター対象の列指定および、絞り込みを行う文字列(Filtersオブジェクトを使用)を以下のように指定します。

import openpyxl
from openpyxl.worksheet.filters import (FilterColumn, Filters)

if __name__ == '__main__':
    filename = 'エクセルファイル名'
    sheetname = 'シート名'
    words = ['フィルタ表示文字1', 'フィルタ表示文字2', ...]
    wb = openpyxl.load_workbook(filename)
    ws = wb[sheetname]
    ws.auto_filter.ref = 'A1:F10'

    col = FilterColumn(colId=2)
    col.filters = Filters(filter=words)
    ws.auto_filter.filterColumn.append(col)  # ← C列の文字列がwordsに含まれる行のみを表示する

ただし、上記で絞り込み設定を行うのみでは、非表示となった行もエクセル上では表示状態となり、一度ファイルをエクセルで開き手動でオートフィルタを適用しなおす必要があります。今回作った関数ではこの手間をなくすため、非表示行に別途非表示設定を適用する対策を入れました。

できたもの

今回作ったauto_filter関数について説明します。

内容

引数で受け取ったワークシートに対して、オートフィルタの設定を適用する関数です。オプションで特定列の絞り込み表示、もしくは非表示を行えるように作りました。オプションを指定しない場合は、シート全範囲にオートフィルタを適用するのみの挙動となります。

引数

まずは引数についてですが、次の表のとおりです。

引数名	内容	入力例
ws	openpyxlのワークシートオブジェクト。オートフィルタ設定の対象シート。	-
str_col	フィルタの絞り込みを設定する列名。省略時は'A'。	'B'
ref	フィルタを設定する範囲。省略時は全範囲。	'B1:D10'
shows	表示対象の文字列リスト。指定したもののみフィルタで表示。省略時は全て表示。	['Kiwi', 'Banana']
hiddens	非表示対象の文字列リスト。指定したものはフィルタで非表示。省略時は全て表示。	['Kiwi', 'Banana']

プログラム

続いて、プログラムの全体を示してます。

auto_filter.py

import openpyxl
from openpyxl.utils import get_column_letter, column_index_from_string
from openpyxl.utils.cell import coordinate_from_string
from openpyxl.worksheet.filters import (FilterColumn, Filters)


def auto_filter(ws, str_col='A', ref=None, shows=None, hiddens=None):
    # オートフィルタの適用
    start = 'A1'
    end = get_column_letter(ws.max_column) + str(ws.max_row)
    if ref:
        start = ref.split(':')[0]
        end = ref.split(':')[1]
    ws.auto_filter.ref = start + ':' + end

    # 座標を数値に直す
    coord_s = coordinate_from_string(start)
    coord_e = coordinate_from_string(end)
    col = column_index_from_string(str_col)
    col_s, row_s = column_index_from_string(coord_s[0]), coord_s[1]
    if col < col_s:
        raise ValueError(f'col={str_col} < col_s={coord_s[0]}')
    col_e, row_e = column_index_from_string(coord_e[0]), coord_e[1]
    if col_e < col:
        raise ValueError(f'col_e={coord_e[0]} < col={str_col}')

    # 絞り込みなし
    if shows is None and hiddens is None:
        return

    # 表示フィルタの設定
    index = col - 1
    all_words = (set([row[index].value for row in ws.rows]))
    words = []
    if shows:
        words = shows
        if None in words:
            words.append('')
        elif '' in words:
            words.append(None)
        hiddens = list(all_words - set(words))
    else:
        if None in hiddens:
            hiddens.append('')
        elif '' in hiddens:
            hiddens.append(None)
        words = list(all_words - set(hiddens))
    col = FilterColumn(colId=index-col_s+1)
    col.filters = Filters(filter=words)
    ws.auto_filter.filterColumn.append(col)

    # 非表示の設定
    if hiddens:
        for row in ws.rows:
            target = row[index]
            if target.row > row_s:
                if target.value not in words:
                    ws.row_dimensions[target.row].hidden = True

サンプル実行

サンプルとして、適当な値を入れたワークシートを作成し、そこにオートフィルタ設定を適用したうえで空白セルを非表示に絞り込み設定する例を示します。先ほどのauto_filter.pyを同じフォルダに置いておいてください。

sample.py

import openpyxl
from auto_filter import auto_filter

if __name__ == '__main__':
    wb = openpyxl.Workbook()
    ws = wb.active
    data = [
        ['Fruit', 'Quantity'],
        ['Kiwi', 3],
        ['Grape', 15],
        ['Apple', ''],
        ['Peach', None],
        ['Pomegranate', 3],
        ['Pear', 3],
        ['Tangerine', 3],
        ['Blueberry', 3],
        ['Mango', 3],
        ['Watermelon', ''],
        ['Blackberry', None],
        ['Orange', 3],
        ['Raspberry', 3],
        ['Banana', 3]
    ]
    for r in data:
        ws.append(r)
    auto_filter(ws, 'B', hiddens=[None])
    wb.save('filtered.xlsx')

(実行方法)

python sample.py

(実行結果)
上記を実行すると、カレントフォルダにfilterd.xlsxが作成されます。非表示する対象にNoneを設定しており、空白セルは除外しています。(なお、Noneと''は同様に空白セルとみなします)

作成されたエクセルは以下の状態となっています。

元もとの内容は次のとおりでした。

エクセル上のフィルタの設定は以下となっています。

動作確認環境

今回の実行には以下の環境を使いました。

Python3.9以上
openpyxl3.0.10

(openpyxlのインストール)

> pip install openpyxl

解説

作った関数の詳細を説明します。

オートフィルタの適用

start = 'A1'
end = get_column_letter(ws.max_column) + str(ws.max_row)
if ref:
    start = ref.split(':')[0]
    end = ref.split(':')[1]
ws.auto_filter.ref = start + ':' + end

ref引数を指定しない場合は、A1セルからシートの一番右下セルの範囲にフィルタを適用します。

セルの右下の範囲は以下で取得します。get_column_letterは引数の数値をAなど列の文字列に変換します。

end = get_column_letter(ws.max_column) + str(ws.max_row)

refの書式は左上セル:右下セルを想定しており、:でスプリットし、startに左上セル、endに右下セルを格納しています。

if ref:
    start = ref.split(':')[0]
    end = ref.split(':')[1]

指定範囲にオートフィルタを設定する部分は、以下のようにws.auto_filter.refに範囲を代入しています。指定がない場合は、シートの最大範囲としています。

ws.auto_filter.ref = start + ':' + end

座標を数値に直す

coord_s = coordinate_from_string(start)
coord_e = coordinate_from_string(end)
col = column_index_from_string(str_col)
col_s, row_s = column_index_from_string(coord_s[0]), coord_s[1]
if col < col_s:
    raise ValueError(f'col={str_col} < col_s={coord_s[0]}')
col_e, row_e = column_index_from_string(coord_e[0]), coord_e[1]
if col_e < col:
    raise ValueError(f'col_e={coord_e[0]} < col={str_col}')

coordinate_from_stringでA1などセル指定の文字列から、(列の文字列, 行の数字)のタプルを取得しています。
また、column_index_from_stringはAなど列の文字列の引数を元に、列の数値(1はじまり)を取得しています。

refで指定した列の範囲の外にcolの列が指定されていた場合はValuerError例外としています。

絞り込みなし

showsやhiddensの絞り込みを設定しない場合は、フィルタ設定のみで終了します。

if shows is None and hiddens is None:
    return

表示フィルタの設定

showsが指定された場合はそれらをフィルタで表示設定し、shows以外を非表示に設定します。また、hiddensが設定されていた場合は、全ての値からhiddensの値を除外してフィルタで表示設定します。

なお、空白セルはNoneまたは''(空文字)とし同等に扱います。

index = col - 1
all_words = (set([row[index].value for row in ws.rows]))
words = []
if shows:
    words = shows
    if None in words:
        words.append('')
    elif '' in words:
        words.append(None)
    hiddens = list(all_words - set(words))
else:
    if None in hiddens:
        hiddens.append('')
    elif '' in hiddens:
        hiddens.append(None)
    words = list(all_words - set(hiddens))
col = FilterColumn(colId=index-col_s+1)
col.filters = Filters(filter=words)
ws.auto_filter.filterColumn.append(col)

非表示の設定

オートフィルタ設定のみではエクセルを開いた時に反映されないため、プログラム処理にて行のhidden属性をTrueに設定し個別に非表示するようにしています。

if hiddens:
    for row in ws.rows:
        target = row[index]
        if target.row > row_s:
            if target.value not in words:
                ws.row_dimensions[target.row].hidden = True

以上です。よかったら参考にして下さい。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up