How to Extract Text from Images with Python

Posted at 2025-06-13. Last updated at 2025-06-13.

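OCR (optical character recognition) is the tool to reach for when you need to pull text out of scanned paper documents or screenshots. In Python, Spire.OCR for Python lets you extract text from images with only a few lines of code.

This article covers the basics of reading text from an image in Python, retrieving the position of each text block, and batch-processing multiple images in a folder.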

About the Library

This article uses Spire.OCR for Python. OCR processing also requires a separate download of the OCR model files.

Installation

pip install spire.ocr

Downloading the Model Files (Required)

Spire.OCR uses pre-trained models. Download the model archive that matches your operating system and extract it to any folder:

Windows (64-bit): win-x64.zip
Linux: linux.zip
macOS: mac.zip
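
If the same script has to run on more than one operating system, the model folder can be picked at run time. The following is a minimal sketch, not part of Spire.OCR: the helper names `resolve_model_path` and `require_model_dir` are hypothetical, and the folder names assume each archive listed above was extracted into a folder named after the archive.

```python
import platform
from pathlib import Path

# Assumption: each archive (win-x64.zip, linux.zip, mac.zip) was extracted
# into a folder with the same base name under a common directory.
MODEL_FOLDERS = {
    "Windows": "win-x64",
    "Linux": "linux",
    "Darwin": "mac",  # platform.system() reports macOS as "Darwin"
}


def resolve_model_path(base_dir, system=None):
    """Return the model folder under base_dir for the given (or current) OS."""
    system = system or platform.system()
    if system not in MODEL_FOLDERS:
        raise ValueError(f"No OCR model listed for platform: {system}")
    return Path(base_dir) / MODEL_FOLDERS[system]


def require_model_dir(path):
    """Fail early with a clear message if the model folder is missing."""
    path = Path(path)
    if not path.is_dir():
        raise FileNotFoundError(f"OCR model folder not found: {path}")
    return path
```

The resulting path, converted with `str(...)`, can then be assigned to `ModelPath` when the scanner is configured. Checking that the folder exists first tends to give a clearer error than a failure inside the OCR engine.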

Step-by-Step Guide: Extracting Text from an Image

Step 1: Initialize the OCR Scanner and Configure the Model

from spire.ocr import *

scanner = OcrScanner()

options = ConfigureOptions()
options.ModelPath = r'D:\OCR\win-x64'  # Path to the extracted model folder
options.Language = 'English'           # Recognition language (Japanese support is planned)
scanner.ConfigureDependencies(options)

Step 2: Extract Text from an Image and Save It to a File

scanner.Scan(r'Sample.png')  # Path to the image to process
text = scanner.Text.ToString()

# Append the extracted text to a file
with open('output.txt', 'a', encoding='utf-8') as file:
    file.write(text + '\n')

Step 3: Get the Content and Position of Each Text Block

text = scanner.Text

block_text = ""
for block in text.Blocks:
    rect = block.Box
    info = f'{block.Text} -> x: {rect.X}, y: {rect.Y}, w: {rect.Width}, h: {rect.Height}'
    block_text += info + '\n'

with open('output.txt', 'a', encoding='utf-8') as file:
    file.write(block_text + '\n')

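Key point:
Besides the text itself, each block exposes its position in the image (x, y, width, height), which is useful for preserving the layout of the image or for turning the result into structured data.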
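
As one illustration of using the position data, blocks can be grouped into lines by their y coordinate and ordered left to right by x. This is only a sketch under stated assumptions: the `Block` dataclass is a hypothetical stand-in for the OCR result (it is not a Spire.OCR type), and the 10-pixel tolerance is an arbitrary default that would need tuning for real images.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for one OCR text block and its bounding box."""
    text: str
    x: int
    y: int
    width: int
    height: int


def group_into_lines(blocks, tolerance=10):
    """Group blocks into lines and return one string per line.

    Blocks are visited top to bottom. A block joins the current line when
    its top edge is within `tolerance` pixels of that line's first block;
    otherwise it starts a new line. Each line is then ordered left to right.
    """
    lines = []
    for block in sorted(blocks, key=lambda b: b.y):
        if lines and abs(block.y - lines[-1][0].y) <= tolerance:
            lines[-1].append(block)
        else:
            lines.append([block])
    return [
        " ".join(b.text for b in sorted(line, key=lambda b: b.x))
        for line in lines
    ]
```

To feed it real results, a `Block` could be built inside the loop from step 3 with `Block(block.Text, rect.X, rect.Y, rect.Width, rect.Height)`.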

To remove the evaluation warning from the extracted text, keep only the part before it, for example with text.split("Evaluation")[0].
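
The split can be wrapped in a small helper so it is applied the same way everywhere. This is a minimal sketch: the function name `strip_evaluation_notice` is hypothetical, and the marker word is taken from the tip above rather than from the library's documentation. Note that the cut happens at the first occurrence of the marker, so a document that legitimately contains the word "Evaluation" would be truncated as well.

```python
def strip_evaluation_notice(text, marker="Evaluation"):
    """Return the text before the first occurrence of `marker`.

    Trailing whitespace left in front of the notice is removed. If the
    marker does not occur, the text is returned with only that trimming.
    """
    return text.split(marker)[0].rstrip()
```

It would be applied to the string returned by `scanner.Text.ToString()` before the text is written to a file.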

Step 4: Batch-Extract Text from Multiple Images in a Folder

import os
from spire.ocr import *

def extract_text_from_folder(folder_path, model_path):
    # Configure the scanner once and reuse it for every image
    scanner = OcrScanner()
    config = ConfigureOptions()
    config.ModelPath = model_path
    config.Language = 'English'
    scanner.ConfigureDependencies(config)

    for filename in os.listdir(folder_path):
        # Process only image files (the extension check is case-insensitive)
        if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
            image_path = os.path.join(folder_path, filename)
            scanner.Scan(image_path)
            text = scanner.Text.ToString()

            # Write <image name>_output.txt to the current working directory
            output_file = os.path.splitext(filename)[0] + '_output.txt'
            with open(output_file, 'w', encoding='utf-8') as f:
                f.write(text)

# Usage example
extract_text_from_folder(r'D:\images', r'D:\OCR\win-x64')
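
For larger batches it can help to process files in a predictable order, write results to a dedicated folder, and keep going when a single image fails. The sketch below shows one way to structure that, assuming the OCR call is passed in as a plain function. `find_images`, `batch_extract`, and the `read_text` parameter are hypothetical names introduced for illustration, not Spire.OCR APIs.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_images(folder):
    """Return image files in `folder`, sorted, matching suffixes case-insensitively."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def batch_extract(folder, output_dir, read_text):
    """Run read_text(image_path) on each image and save the result.

    Each result is written to <image stem>_output.txt inside output_dir.
    Returns a dict mapping the names of failed images to error messages,
    so one bad file does not stop the whole batch.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    failures = {}
    for image in find_images(folder):
        try:
            text = read_text(str(image))
        except Exception as exc:  # record the failure and continue
            failures[image.name] = str(exc)
            continue
        out_file = output_dir / f"{image.stem}_output.txt"
        out_file.write_text(text, encoding="utf-8")
    return failures
```

With a scanner configured as in step 4, `read_text` could be a small function that calls `scanner.Scan(path)` and returns `scanner.Text.ToString()`. Passing the OCR step in as a function also makes the file handling testable without the model files.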

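Summary

Spire.OCR for Python makes it straightforward to use OCR from Python. This article walked through the following use cases:

Extracting text from a single image
Getting text blocks and their positions
Batch-reading the images in a folder

The same approach can be applied to a wide range of automation tasks, such as large-scale document digitization and automated analysis of screenshots.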