More than 3 years have passed since last update.

【Python】ExcelのVLookUpはPythonに任せよう(txt + txt編)

Last updated at 2022-03-13Posted at 2022-03-10

対象ファイルデータ

今回は、サンプルとして簡単に以下を用意しました。

テキストファイル1

テキストファイル2

この２つのデータを結合して以下のようなexcelファイルを作成しようと思います。

はじめに

今回はChromeの拡張サービスである「Colaboratory」を使用してPythonのコードを記述していきます。
Google Driveから「新規 → その他 → アプリを追加」からインストールしてください。(Chromeで検索してもインストールできます)

全体コード

import openpyxl
import datetime
from google.colab import drive

start = datetime.datetime.now()
print('処理開始')
print(start)

drive.mount('/content/drive')

OUTPUT_FILE_NAME = "/content/drive/My Drive/sample/file/sample_txt.xlsx"

departmentDict = {}
positionDict = {}
detailHeaderList = []
with open('/content/drive/My Drive/sample/file/detail.txt', mode='rt', encoding='utf-8') as f:    
  for index, line in enumerate(f.read().splitlines()):
    splitData = line.split("\t")

    if index == 0:
      for headerIndex, name in enumerate(splitData):
        if headerIndex != 0:
          detailHeaderList.append(name)
    else:
      departmentDict[splitData[0]] = splitData[1]
      positionDict[splitData[0]] = splitData[2]

# ブック作成
workbook = openpyxl.Workbook()
# 先頭シート取得
worksheet = workbook.worksheets[0]

# データ出力
culmnIndex = 0
with open('/content/drive/My Drive/sample/file/employee.txt', mode='rt', encoding='utf-8') as f:    
  for index, line in enumerate(f.read().splitlines()):
    splitData = line.split("\t")
    culmnIndex = len(splitData)

    for splitIndex, data in enumerate(splitData):
      worksheet.cell(index + 1, splitIndex + 1, data)

      if index != 0 and splitIndex == 0:
        # 部署
        if data in departmentDict.keys():
          worksheet.cell(index + 1, culmnIndex + 1, departmentDict[data])
        else:
          worksheet.cell(index + 1, culmnIndex + 1, 'データなし')
        # 役職
        if data in positionDict.keys():
          worksheet.cell(index + 1, culmnIndex + 2, positionDict[data])
        else:
          worksheet.cell(index + 1, culmnIndex + 2, 'データなし')

# 結合ファイルのヘッダーを出力
for index, name in enumerate(detailHeaderList):
  worksheet.cell(1, culmnIndex + index + 1, name)

# 保存
workbook.save(OUTPUT_FILE_NAME)

end = datetime.datetime.now()
print('処理完了')
print(end)
print(end - start)

以下で各コード、手順について解説していきます。

インポート

import openpyxl
from google.colab import drive

まずは必要なライブラリをインストールします。
Google Drive, Colaboratoryを使用する場合、ドライブのマウントが必要なのでgoogle.clabからdriveもインポートしておきます。

ドライブのマウント

drive.mount('/content/drive')

Google Drive, Colaboratoryを使用する場合、ドライブのマウントを行います。

作成するファイル名の定義

OUTPUT_FILE_NAME = "/content/drive/My Drive/sample/file/sample_txt.xlsx"

作成するファイル名を配置したいディレクトリーから定義します。

テキストファイル2の読み込み

departmentDict = {}
positionDict = {}
detailHeaderList = []
with open('/content/drive/My Drive/sample/file/detail.txt', mode='rt', encoding='utf-8') as f:    
  for index, line in enumerate(f.read().splitlines()):
    splitData = line.split("\t")

    if index == 0:
      for headerIndex, name in enumerate(splitData):
        if headerIndex != 0:
          detailHeaderList.append(name)
    else:
      departmentDict[splitData[0]] = splitData[1]
      positionDict[splitData[0]] = splitData[2]

↓

with open('/content/drive/My Drive/sample/file/detail.txt', mode='rt', encoding='utf-8') as f:

こちらでテキストファイル2を読み込み

for index, line in enumerate(f.read().splitlines()):

行ごとに分割します。

splitData = line.split("\t")

行ごとに分割したデータを\tタブで更に分割し、

if index == 0:
  for headerIndex, name in enumerate(splitData):
    if headerIndex != 0:
      detailHeaderList.append(name)
else:
  departmentDict[splitData[0]] = splitData[1]
  positionDict[splitData[0]] = splitData[2]

１行目はヘッダーなのでid以外をdetailHeaderList(テキストファイル2のヘッダーリスト)に追加し、
２行目以降は部署、役職のそれぞれの辞書にkey=idでデータを定義します。

ワークブックの作成

workbook = openpyxl.Workbook()
worksheet = workbook.worksheets[0]

Excelのワークブックを作成し、先頭のシートを取得します。
今回は使用しませんが、workbook.create_sheet()で２シート目以降を作成することができます。

データ出力

culmnIndex = 0
with open('/content/drive/My Drive/sample/file/employee.txt', mode='rt', encoding='utf-8') as f:    
  for index, line in enumerate(f.read().splitlines()):
    splitData = line.split("\t")
    culmnIndex = len(splitData)

    for splitIndex, data in enumerate(splitData):
      worksheet.cell(index + 1, splitIndex + 1, data)

      if index != 0 and splitIndex == 0:
        # 部署
        if data in departmentDict.keys():
          worksheet.cell(index + 1, culmnIndex + 1, departmentDict[data])
        else:
          worksheet.cell(index + 1, culmnIndex + 1, 'データなし')
        # 役職
        if data in positionDict.keys():
          worksheet.cell(index + 1, culmnIndex + 2, positionDict[data])
        else:
          worksheet.cell(index + 1, culmnIndex + 2, 'データなし')

↓

with open('/content/drive/My Drive/sample/file/employee.txt', mode='rt', encoding='utf-8') as f:    
  for index, line in enumerate(f.read().splitlines()):
    splitData = line.split("\t")

同じく、テキストファイル1を読み込み、行ごとに分割、タブで分割します。

culmnIndex = len(splitData)

後ほど使うのでテキストファイル1の列数を格納しておきます。

for splitIndex, data in enumerate(splitData):
  worksheet.cell(index + 1, splitIndex + 1, data)

テキストファイル1のデータを出力していきます。
worksheet.cell(行, 列, 値)で該当のセルにデータを出力することができます。

if index != 0 and splitIndex == 0:
    # 部署
    if data in departmentDict.keys():
      worksheet.cell(index + 1, culmnIndex + 1, departmentDict[data])
    else:
      worksheet.cell(index + 1, culmnIndex + 1, 'データなし')
    # 役職
    if data in positionDict.keys():
      worksheet.cell(index + 1, culmnIndex + 2, positionDict[data])
    else:
      worksheet.cell(index + 1, culmnIndex + 2, 'データなし')

index != 0 and splitIndex == 0テキストファイル１の１行目（ヘッダー）じゃないかつ1列目(社員ID)のとき、
社員IDがテキストファイルから作成した辞書に定義されていればその値を出力します。
今回はあえてデータがない社員も作ったので、データがない場合はデータなしと出力しておくようにしておきます。

テキストファイル2のヘッダーの出力

for index, name in enumerate(detailHeaderList):
  worksheet.cell(1, culmnIndex + index + 1, name)

先に定義したテキストファイル2のヘッダーを出力します。

ワークブックの保存

workbook.save(OUTPUT_FILE_NAME)

最後にワークブックを保存すると、最初に定義したディレクトリにExcelファイルが作成されます。

最後に

今回は社内共有用を兼ねて簡単なサンプルでの解説となりましたが、応用していくことで業務の効率化、ヒューマンエラーの防止につながるので積極的に自動化していきましょう。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up