Easy OCR memo

Posted at 2022-05-24

Introduction

This is a memo on Easy OCR.

Installing Easy OCR

pip install easyocr
  • When using CUDA, torch must be installed before installing easyocr.
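
Before creating a Reader, it can help to confirm that torch is installed and can actually see a CUDA device. The following is a minimal sketch of such a check. The helper name `cuda_available` is hypothetical (it is not part of easyocr or torch), and the function simply returns False when torch is missing.

```python
def cuda_available() -> bool:
    """Return True only if torch is importable and reports a usable CUDA device."""
    try:
        import torch  # imported lazily so the check also works without torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


use_gpu = cuda_available()
print("CUDA available:", use_gpu)
# The flag can then be passed to the Reader, e.g. easyocr.Reader(['en'], gpu=use_gpu)
```

If this prints False even though a GPU is present, the installed torch build is probably CPU-only.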

** This can sometimes conflict with opencv and crash.
As of May 2022, opencv-python==4.5.4.60 works.

 pip uninstall opencv-python-headless
 pip install opencv-python==4.5.4.60
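
To see which of these packages are currently installed, and in which version, something like the sketch below may be useful. It relies only on the standard library (`importlib.metadata`). The helper name `installed_version` is an assumption made for this example.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages mentioned in this memo
for name in ("easyocr", "torch", "opencv-python", "opencv-python-headless"):
    print(name, "->", installed_version(name))
```

If both opencv-python and opencv-python-headless show a version, that matches the situation the uninstall command above is meant to resolve.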

Using Easy OCR


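import cv2
import easyocr

# Create a Reader instance (English model)
reader = easyocr.Reader(['en'])

# Load the image as a numpy array
img_path = './image/001.png'
img_array = cv2.imread(img_path)

# Run OCR and get the results back.
# readtext accepts either a file path or a numpy array.
result = reader.readtext(img_array)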
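
One pitfall with the snippet above: `cv2.imread` does not raise an error when the file cannot be read. It silently returns None, and the failure only shows up later. A small guard like the one below makes the problem visible early. This is only a sketch, and the helper name `require_image` is hypothetical.

```python
def require_image(img, path):
    """Raise a clear error if an image failed to load (cv2.imread returns None on failure)."""
    if img is None:
        raise FileNotFoundError(f"could not read image: {path}")
    return img


# Intended usage (assumes cv2 is installed and the file exists):
#   img_array = require_image(cv2.imread(img_path), img_path)
```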

Image OCR


import cv2
import easyocr

import numpy as np

reader = easyocr.Reader(['en'])
font = cv2.FONT_HERSHEY_SIMPLEX


def draw_boxes(img, result):
    # Draw each detected bounding box and its text onto the image
    for detection in result:
        # The four corners of the bounding box
        p0 = tuple((detection[0][0]))
        p1 = tuple((detection[0][1]))
        p2 = tuple((detection[0][2]))
        p3 = tuple((detection[0][3]))
        points = np.array([p0, p1, p2, p3],dtype=np.int32)

        text = detection[1]
        confidence = round(detection[2],2)
        # Draw the box outline, then the text at the first corner
        img = cv2.polylines(img, [points], True, (255, 0, 0), thickness = 2)
        img = cv2.putText(img, text , (int(p0[0]), int(p0[1])) , font, 1, (255, 0, 0), 2, cv2.LINE_AA)

    return img


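# Load the image as a numpy array
img_path = './image/001.png'
img_array = cv2.imread(img_path)

# Run OCR, then draw the results on the image
result = reader.readtext(img_array)
img2 = draw_boxes(img_array, result)
cv2.imshow('box', img2)

# Wait for a key press, then close the window
cv2.waitKey(0)
cv2.destroyAllWindows()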

output format

result = 
[([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148),
 ([[86, 80], [134, 80], [134, 128], [86, 128]], '西', 0.40452659130096436),
 ([[517, 81], [565, 81], [565, 123], [517, 123]], '', 0.9989598989486694),
 ([[78, 126], [136, 126], [136, 156], [78, 156]], '315', 0.8125889301300049),
 ([[514, 126], [574, 126], [574, 156], [514, 156]], '309', 0.4971577227115631),
 ([[226, 170], [414, 170], [414, 220], [226, 220]], 'Yuyuan Rd.', 0.8261902332305908),
 ([[79, 173], [125, 173], [125, 213], [79, 213]], 'W', 0.9848111271858215),
 ([[529, 173], [569, 173], [569, 213], [529, 213]], 'E', 0.8405593633651733)]

Here, each element of the list is a tuple of the form '(a bounding box, the text detected, confidence level)'.

(a bounding box, the text detected, confidence level)

([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148)
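
The bounding box is given as four corner points. For cropping or layout work, an (x, y, width, height) rectangle is often more convenient. The sketch below converts one box into that form using plain Python. The helper name `quad_to_rect` is an assumption for illustration, and the sample box is the first entry of the output above.

```python
def quad_to_rect(box):
    """Convert four corner points into an axis-aligned (x, y, width, height) rectangle."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    x, y = min(xs), min(ys)
    return int(x), int(y), int(max(xs) - x), int(max(ys) - y)


box = [[189, 75], [469, 75], [469, 165], [189, 165]]
rect = quad_to_rect(box)
print(rect)  # (189, 75, 280, 90)
```

Taking the minimum and maximum of the corners also works for slightly rotated boxes, in which case the result is the enclosing rectangle.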


p0 = tuple((detection[0][0])) -> (189, 75)
p1 = tuple((detection[0][1])) -> (469,75)
p2 = tuple((detection[0][2])) -> (469,165)
p3 = tuple((detection[0][3])) -> (189,165)
points = np.array([p0, p1, p2, p3],dtype=np.int32)

text = detection[1] -> '愚园路'
confidence = round(detection[2],2) -> 0.37
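
Since every entry carries a confidence value, low-confidence detections can be dropped before further processing. The following is a minimal sketch using four entries from the output above. The function name `filter_by_confidence` and the threshold of 0.5 are assumptions for this example, not recommendations from easyocr.

```python
# Four entries copied from the output shown above
sample_result = [
    ([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148),
    ([[78, 126], [136, 126], [136, 156], [78, 156]], '315', 0.8125889301300049),
    ([[514, 126], [574, 126], [574, 156], [514, 156]], '309', 0.4971577227115631),
    ([[226, 170], [414, 170], [414, 220], [226, 220]], 'Yuyuan Rd.', 0.8261902332305908),
]


def filter_by_confidence(result, min_confidence=0.5):
    """Keep (text, rounded confidence) pairs whose confidence is at least min_confidence."""
    return [
        (text, round(confidence, 2))
        for _box, text, confidence in result
        if confidence >= min_confidence
    ]


kept = filter_by_confidence(sample_result)
print(kept)  # [('315', 0.81), ('Yuyuan Rd.', 0.83)]
```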
