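I wanted to experiment with OCR. Google's OCR is accurate, but it is pricey! So these are my notes from trying a free alternative.
Setting up the environment with Docker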
docker run -it python /bin/bash
Installation
Install Tesseract itself
apt update
apt install tesseract-ocr tesseract-ocr-jpn
apt install libtesseract-dev
pip install pytesseract
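To check that the Japanese data from tesseract-ocr-jpn is actually visible and to try a first OCR call, something like the following should work. This is only a minimal sketch and not part of the original notes: the function names tesseract_languages and ocr_with_pytesseract and the file name sample.png are made up for illustration.

```python
import shutil
import subprocess


def tesseract_languages():
    """Return the language codes reported by `tesseract --list-langs`.

    Returns an empty list when the tesseract binary is not on PATH.
    """
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some versions print the list to stderr
        text=True,
    )
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    # The first line is a header such as "List of available languages (3):".
    return [line for line in lines if not line.startswith("List of")]


def ocr_with_pytesseract(image_path, lang="jpn"):
    """OCR one image file with pytesseract and return the recognized text."""
    # Imported lazily so the helper above works even before the pip install.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path), lang=lang)
```

Once the packages above are installed, tesseract_languages() should include jpn, and ocr_with_pytesseract("sample.png") returns the recognized text for a hypothetical image file.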
Check the Python version
root@ecfe5ef03ea4:/# python --version
Python 3.7.0
Install pyocr
root@ecfe5ef03ea4:/# pip install pyocr
Collecting pyocr
Downloading https://files.pythonhosted.org/packages/37/54/2d169a102a3727f3ebe535da9263babb88a5862516ae9a798a7e458399a6/pyocr-0.5.3.tar.gz
Collecting Pillow (from pyocr)
Downloading https://files.pythonhosted.org/packages/62/8c/230204b8e968f6db00c765624f51cfd1ecb6aea57b25ba00b240ee3fb0bd/Pillow-5.3.0-cp37-cp37m-manylinux1_x86_64.whl (2.0MB)
100% |████████████████████████████████| 2.0MB 18.0MB/s
Collecting six (from pyocr)
Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Building wheels for collected packages: pyocr
Running setup.py bdist_wheel for pyocr ... done
Stored in directory: /root/.cache/pip/wheels/ff/94/8e/dccadc6bce17c41a9dbb0c7ccd44acdb9dcc0edd9efa42eaf6
Successfully built pyocr
Installing collected packages: Pillow, six, pyocr
Successfully installed Pillow-5.3.0 pyocr-0.5.3 six-1.11.0
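With pyocr installed, reading text from an image could look roughly like this. It is a sketch under assumptions, not what I actually ran: the function name ocr_with_pyocr and the file name sample.png are hypothetical, and it simply takes the first OCR tool that pyocr finds (which should be Tesseract in this container).

```python
def ocr_with_pyocr(image_path, lang="jpn"):
    """OCR one image with the first OCR tool that pyocr can find."""
    # Imported inside the function so that defining it does not require pyocr.
    from PIL import Image
    import pyocr
    import pyocr.builders

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("pyocr found no OCR tool; is tesseract-ocr installed?")
    tool = tools[0]

    # TextBuilder asks for plain text; other builders return word or line boxes.
    return tool.image_to_string(
        Image.open(image_path),
        lang=lang,
        builder=pyocr.builders.TextBuilder(),
    )
```

For example, print(ocr_with_pyocr("sample.png")) would print whatever text Tesseract recognizes in that image.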
Added these because I wanted to do edge processing
pip install opencv-python
pip install matplotlib
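The edge processing itself is not recorded in these notes, so the following is only a guess at what it might look like: Canny edge detection with OpenCV, saved as a side-by-side preview with matplotlib. The function name save_edge_preview, the file names, and the thresholds 100 and 200 are all assumptions chosen for illustration.

```python
def save_edge_preview(src_path, dst_path, low=100, high=200):
    """Run Canny edge detection on an image and save a side-by-side preview."""
    # Imported inside the function so it can be defined before the installs.
    import cv2
    import matplotlib
    matplotlib.use("Agg")  # no display inside the container, so render to a file
    import matplotlib.pyplot as plt

    gray = cv2.imread(src_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(src_path)
    edges = cv2.Canny(gray, low, high)  # low/high are guessed thresholds

    # Left: grayscale input. Right: detected edges.
    fig, axes = plt.subplots(1, 2, figsize=(10, 5))
    for ax, img, title in zip(axes, (gray, edges), ("input", "edges")):
        ax.imshow(img, cmap="gray")
        ax.set_title(title)
        ax.axis("off")
    fig.savefig(dst_path)
    plt.close(fig)
    return edges
```

Something like save_edge_preview("sample.png", "edges.png") would then write the comparison image, which can be copied out of the container to look at.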
These really are just rough notes, so please don't scrutinize them too closely.