Trying out tesserocr together with OpenCV

Posted at 2018-10-30

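Introduction

I recently started using tesserocr and OpenCV and wanted to try something with them, so I gave it a go.
Both are used at a beginner level here, so even a first-timer can come away feeling they accomplished something (lol).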

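What I did

I just wanted to try something, so I did the following.
I took a screenshot of the Getting Started section of the Python website and extracted the text from it.

Draw an OpenCV box around each text line: blue when the tesserocr confidence score is 90 or higher, red when it is below 80 (lines scoring 80 to 89 get no box)
Extract the text in the image with tesserocr and draw the extracted text onto the image with OpenCV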
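
The colour rule above can be pulled out into a small helper, which makes the thresholds easier to tweak. This is only a sketch: `box_color`, `BLUE_BGR` and `RED_BGR` are names made up for illustration, not part of tesserocr or OpenCV. The one thing to keep in mind is that OpenCV takes colours in BGR order, so `(255, 0, 0)` is blue.

```python
# Colours in BGR order, which is what OpenCV drawing functions expect.
BLUE_BGR = (255, 0, 0)
RED_BGR = (0, 0, 255)


def box_color(conf, high=90, low=80):
    """Map a confidence score (0-100) to a box colour, or None for no box.

    Hypothetical helper: `high` and `low` default to the thresholds
    used in this post.
    """
    if conf >= high:
        return BLUE_BGR
    if conf < low:
        return RED_BGR
    return None  # scores from `low` up to `high` - 1 are left unboxed


if __name__ == "__main__":
    for score in (95, 85, 42):
        print(score, box_color(score))
```

With a helper like this, the drawing loop only needs to call `cv2.rectangle` when the returned colour is not `None`.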

The result looked like this. (To my eye, the low-scoring lines were recognized about as well as the high-scoring ones.)

Source code

This is patchwork code: the source from the Advanced API Examples on the tesserocr site, with OpenCV worked into it.

import locale

# Switch to the C locale before importing tesserocr;
# some Tesseract versions abort at start-up otherwise.
locale.setlocale(locale.LC_ALL, 'C')

from tesserocr import PyTessBaseAPI, RIL, PSM
from PIL import Image
import cv2

lang = 'eng'
img_file_name = 'python.png'
image = Image.open(img_file_name)     # PIL image handed to tesserocr
draw_img = cv2.imread(img_file_name)  # OpenCV copy (BGR) to draw on
font = cv2.FONT_HERSHEY_SIMPLEX

with PyTessBaseAPI(lang=lang, psm=PSM.SINGLE_BLOCK) as api:
    api.SetImage(image)
    # One entry per detected text line: (image, box, block id, paragraph id)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print('Found {} textline image components.'.format(len(boxes)))

    for i, (_, box, _, _) in enumerate(boxes):
        x, y, w, h = box['x'], box['y'], box['w'], box['h']
        # Restrict recognition to this text line
        api.SetRectangle(x, y, w, h)
        ocr_result = api.GetUTF8Text().strip()
        conf = api.MeanTextConf()

        # Colours are BGR: blue for conf >= 90, red for conf < 80, no box in between
        if conf >= 90:
            cv2.rectangle(draw_img, (x, y), (x + w, y + h), (255, 0, 0), 1)
        elif conf < 80:
            cv2.rectangle(draw_img, (x, y), (x + w, y + h), (0, 0, 255), 1)

        # Write "<confidence>: <text>" in magenta, starting at the box's top-left corner
        cv2.putText(draw_img, '{}: {}'.format(conf, ocr_result), (x, y),
                    font, 0.6, (255, 0, 255), 1)

        print(("Box[{0}]: x={x}, y={y}, w={w}, h={h}, "
               "confidence: {1}, text: {2}").format(i, conf, ocr_result, **box))

    cv2.imwrite("output.png", draw_img)
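
One thing to watch in the code above: `cv2.putText` treats the point it is given as the bottom-left corner of the text, so passing `(x, y)` puts the label just above the box, and a text line near the top edge of the image can have its label cut off. A possible workaround is sketched below. `label_origin` is a hypothetical helper, and `text_height=18` is a guessed pixel height for font scale 0.6 (in practice `cv2.getTextSize` could measure it).

```python
def label_origin(x, y, h, text_height=18, margin=2):
    """Pick a bottom-left origin for a label attached to a box.

    Hypothetical helper: (x, y) is the box's top-left corner and h its
    height. The label goes just above the box when there is room,
    otherwise just below it.
    """
    if y - margin - text_height >= 0:
        return (x, y - margin)
    # Not enough room above: shift down so the text sits under the box.
    return (x, y + h + margin + text_height)


if __name__ == "__main__":
    print(label_origin(10, 50, 20))  # enough room above the box
    print(label_origin(10, 5, 20))   # box touches the top edge
```

The returned tuple can then be passed to `cv2.putText` in place of `(x, y)`.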
