
How to Use the Microsoft Computer Vision API OCR (Japanese)

Posted at 2018-04-07

This post extends the procedure from the following page to handle Japanese text.
How to Use the Microsoft Computer Vision API OCR

Reference page
Cognitive Services: Converting Photos of Printed Material into Text Data with the OCR Feature

Let's process the following image.

japanese.py

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
import requests

ocr_url = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/ocr'
# Replace the placeholder with your own subscription key.
headers  = {'Ocp-Apim-Subscription-Key': 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'}
# 'ja' requests Japanese recognition; the key 'detectOrientation' must not
# contain a trailing space, or the parameter name will not match.
params   = {'language': 'ja', 'detectOrientation': 'true'}
data     = {'url': 'https://example.com/test_ja.png'}
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()

ocr_data = response.json()

output = ""

# The response is nested as regions -> lines -> words.
for region in ocr_data['regions']:
    for line in region['lines']:
        for word in line['words']:
            # Japanese is written without spaces, so join words directly.
            output += word['text']
        output += '\n'
    output += '\n'

print('language:' + ocr_data['language'] + '\n')
print(output)

Execution result

$ ./japanese.py 
language:ja

大きな森のすぐ近くに、木こりが、おかみさんと子供
たちと一緒に住んでいました。男の子はヘンゼルで女
の子はグレ-テルという名前でした。木こりにはほと
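
The nested loops in japanese.py can also be packaged as a small function, which makes the text-joining step easy to try without calling the API. The following is only a minimal sketch: the names `extract_text`, `word_separator`, and `sample` are hypothetical, and `sample` is a hand-written fragment that mimics the regions / lines / words layout used by the script, not real API output.

```python
def extract_text(ocr_data, word_separator=""):
    """Join OCR words into plain text.

    Each OCR line becomes one output line, and regions are separated by a
    blank line. The default separator is empty because Japanese text is
    written without spaces between words.
    """
    paragraphs = []
    for region in ocr_data.get("regions", []):
        lines = [
            word_separator.join(word["text"] for word in line.get("words", []))
            for line in region.get("lines", [])
        ]
        paragraphs.append("\n".join(lines))
    return "\n\n".join(paragraphs)


# Hypothetical, hand-written response fragment (an assumption for illustration).
sample = {
    "language": "ja",
    "regions": [
        {
            "lines": [
                {"words": [{"text": "大"}, {"text": "き"}, {"text": "な"}, {"text": "森"}]},
                {"words": [{"text": "の"}, {"text": "子"}]},
            ]
        }
    ],
}

print(extract_text(sample))
```

For languages that separate words with spaces, passing `word_separator=" "` would keep the words apart instead of running them together.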
