[AI-OCR] Trying out the automatic form-sorting feature of SmartRead (formerly Tegaki)

Last updated at 2021-12-16 / Posted at 2021-12-16

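Overview

I tested the automatic form-sorting feature from Cogent Labs. Given several registered form types, it identifies which type each incoming form belongs to. Results will vary with the samples used, but this time they were better than I expected.
Note that the version I tested was an early beta provided by Cogent Labs.
For the exact API usage and SmartRead's features, see the official site (https://smartread.jp/ ).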

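What I tested

I prepared six form formats used in actual business (filled with test data), about 20 sheets of each, and ran:
・Sorting of all prepared forms with all 6 types registered
・Sorting of all prepared forms with only 4 types registered (2 types unregistered)
I cannot show the actual forms, but the six types do not differ drastically from one another.

For example, when a wide variety of files received by fax are stored in one place,
you can expect unrelated sheets such as advertisements to be mixed in.
Whether sheets of unregistered types can be excluded is therefore a key point.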

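Results

Case	Result
6 types	All forms were sorted into the correct group.
4 types	Forms of the 4 registered types (those with a correct answer) were sorted into the correct group. Forms of the 2 unregistered types (no correct answer) were sorted into one of the 4 registered groups. However, the confidence was 0.999 for the correctly sorted forms of the 4 registered types and 0.001 for the forms of the 2 unregistered types.
The two unregistered types did end up mixed into existing groups, but their confidence was so low that the results suggest they could be excluded with high accuracy by applying a confidence threshold.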
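
The threshold idea above could be sketched roughly as follows. This is a minimal sketch, assuming each match yields a template ID and a confidence between 0 and 1; the function name accept_match and the cut-off value 0.5 are hypothetical and not part of the SmartRead API.

```python
from typing import Optional

# Hypothetical cut-off. In the test above, registered forms scored about 0.999
# and unregistered ones about 0.001, so any value in between separates them.
CONFIDENCE_THRESHOLD = 0.5


def accept_match(template_id: str, confidence: float,
                 threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return the template ID if the match is confident enough, else None."""
    if confidence >= threshold:
        return template_id
    # Low confidence: treat the sheet as an unregistered form
    return None


print(accept_match('template-a', 0.999))  # template-a
print(accept_match('template-b', 0.001))  # None
```

With real data the gap between the two groups may be narrower, so the threshold would need to be tuned on your own samples.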

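Using TegakiEditor, I created templates (json) for the six types.

This may improve in the release version, but at least one field had to be registered in each template.
I was told that the field information itself is not used.
https://editor.tegaki.ai/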

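Calling each API

Replace *** and ***xxxxx as appropriate. An API_KEY (for authentication) is required.
Each ID is written in the form ***xxxxx.

Register the template information and image to obtain a templateId (this must be run once per template type; here, 4 times and 6 times)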

#!/usr/bin/python3
# Imports
import requests

TEGAKI_POST_TEMPLATE_ENDPOINT = 'https://api.tegaki.ai/hwr/v2/template'
MY_API_KEY = '***'


def post_template(template_json_file, template_image_file):
    # Open both files in a with-block so the handles are closed after the upload
    with open(template_image_file, 'rb') as image, open(template_json_file, 'rb') as payload:
        # Send multipart POST request to Tegaki service
        response = requests.post(TEGAKI_POST_TEMPLATE_ENDPOINT,
                                 headers={'Authorization': 'apikey {}'.format(MY_API_KEY)},
                                 files={'image': ('image.jpg', image, 'image/jpeg'),
                                        'payload': ('payload.json', payload, 'application/json')})

    # Print the result (the templateId is taken from this response)
    print(response.json())


post_template('./tmp6.json', './tmp6.jpg')

Specify multiple templateIds to obtain a templateGroupId

#!/usr/bin/python3
# Imports
import json
import requests

TEGAKI_POST_TEMPLATE_GROUP_ENDPOINT = 'https://api.tegaki.ai/hwr/v2/template-group'
MY_API_KEY = '***'


def post_template_group(template_group_json_file):
    # Read json file (the with-block closes the file afterwards)
    with open(template_group_json_file, 'r') as f:
        template_group_json_data = json.load(f)

    # Send POST request to Tegaki service
    response = requests.post(TEGAKI_POST_TEMPLATE_GROUP_ENDPOINT,
                             headers={'Authorization': 'apikey {}'.format(MY_API_KEY)},
                             json=template_group_json_data)

    # Print the result (the templateGroupId is taken from this response)
    print(response.json())


post_template_group('./tmp-grp.json')

tmp-grp.json

{
  "name": "tmp-grp",
  "templateIds": [
    "***17e88-4477-441f-8cdb-8f6cf77b6d46",
    "***e3950-ba5d-4edc-94cd-42f11b562bd2",
    "***9aefd-848e-4407-9227-b70c8ee03adb",
    "***b36b0-e58a-46dd-9d00-127debd8fb86",
    "***68d64-c926-4e6f-8151-5acc6eeaa2ae",
    "***519b2-e4d1-4273-bca8-0a739c40ab58"
  ],
  "labels": {
    "Lorem": "Ipsum"
  }
}
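
Since the template registration step is repeated once per form type, the group definition can also be generated from the collected IDs instead of being written by hand. A minimal sketch, assuming the JSON shape shown in tmp-grp.json above; the helper names build_template_group and write_template_group are hypothetical, and the IDs below are placeholders.

```python
import json
from typing import Dict, List, Optional


def build_template_group(name: str, template_ids: List[str],
                         labels: Optional[Dict[str, str]] = None) -> dict:
    """Build a dict with the same keys as tmp-grp.json."""
    return {
        'name': name,
        'templateIds': list(template_ids),
        'labels': dict(labels or {}),
    }


def write_template_group(path: str, group: dict) -> None:
    """Write the group definition to a JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(group, f, indent=2)


# Placeholder IDs; in practice these come from the template registration step
group = build_template_group('tmp-grp', ['***id-1', '***id-2'], {'Lorem': 'Ipsum'})
print(json.dumps(group, indent=2))
```

Calling write_template_group('./tmp-grp.json', group) would then produce a file that post_template_group can read.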

That completes the preparation. From here, the flow is to POST an image and then fetch the result.

Send a per-form read request with the templateGroupId specified, and obtain a requestId

#!/usr/bin/python3
# Imports
from contextlib import ExitStack

import requests

TEGAKI_POST_MATCH_TEMPLATE_GROUP_ENDPOINT = 'https://api.tegaki.ai/hwr/v2/template-group/{}/match'
MY_API_KEY = '***'


# Post request for a single form to Tegaki service
def post_match_template_group(template_group_id, form_image_file, optional_form_json_file=None):
    # ExitStack closes every file opened below once the request has been sent
    with ExitStack() as stack:
        # Must include the image file
        files = {
            'image': ('image.jpg', stack.enter_context(open(form_image_file, 'rb')), 'image/jpeg'),
        }

        # Payload is optional
        if optional_form_json_file:
            payload = stack.enter_context(open(optional_form_json_file, 'rb'))
            files['payload'] = ('payload.json', payload, 'application/json')

        # Send multipart POST request to Tegaki service
        response = requests.post(TEGAKI_POST_MATCH_TEMPLATE_GROUP_ENDPOINT.format(template_group_id),
                                 headers={'Authorization': 'apikey {}'.format(MY_API_KEY)},
                                 files=files)

    # Print the result (the requestId is taken from this response)
    print(response.json())


# post_match_template_group('***42bd7-f621-419a-90fb-f001102c38bf', './test/1-1.jpg')
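
With roughly 20 sheets per form type, posting images one at a time gets tedious. The sketch below loops over a directory and records one request ID per image. It assumes a hypothetical callable post_one that sends a single image and returns its requestId (the function above only prints the response, so it would need to return the ID instead); post_all_forms and the "*.jpg" pattern are likewise illustrative choices.

```python
from pathlib import Path
from typing import Callable, Dict


def post_all_forms(image_dir: str,
                   post_one: Callable[[Path], str],
                   pattern: str = '*.jpg') -> Dict[str, str]:
    """Post every matching image in image_dir and map file name -> request ID."""
    request_ids = {}
    # sorted() keeps the processing order stable between runs
    for image_path in sorted(Path(image_dir).glob(pattern)):
        request_ids[image_path.name] = post_one(image_path)
    return request_ids
```

Keeping the file name next to each request ID makes it easy to compare the returned templateId with the expected form type afterwards.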

Get the read status or the read result (including the templateId) for a requestId

#!/usr/bin/python3
# Imports
import requests

MY_API_KEY = '***'
TEGAKI_GET_REQUEST_ENDPOINT = 'https://api.tegaki.ai/hwr/v2/request/'


# Get results from an already sent request using its ID
def get_result(request_id):
    # Generate the endpoint
    url = '{}{}'.format(TEGAKI_GET_REQUEST_ENDPOINT, request_id)

    # Send GET request
    response = requests.get(url, headers={'Authorization': 'apikey {}'.format(MY_API_KEY)})

    # Print the result and save the raw response body to a file
    print(response.json())
    with open('resp_text.json', 'w') as file:
        file.write(response.text)


get_result('***7ac5d-68e5-4ee7-b6c7-42185883dcff')
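
Because this endpoint returns either a status or the finished result, a caller usually has to ask more than once. A minimal polling sketch follows. The response schema is not shown in this article, so the check for "finished" is passed in as a function; wait_for_result, the "status" key, and its values are all hypothetical.

```python
import time
from typing import Callable


def wait_for_result(fetch: Callable[[], dict],
                    is_finished: Callable[[dict], bool],
                    interval_sec: float = 2.0,
                    max_attempts: int = 30) -> dict:
    """Call fetch() until is_finished(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_finished(result):
            return result
        # Sleep between attempts, but not after the last one
        if attempt < max_attempts - 1:
            time.sleep(interval_sec)
    raise TimeoutError('no finished result after {} attempts'.format(max_attempts))


# Demo with canned responses instead of real HTTP calls.
# The "status" key and its values are made up for this sketch.
fake_responses = iter([{'status': 'processing'},
                       {'status': 'processing'},
                       {'status': 'done'}])
final = wait_for_result(lambda: next(fake_responses),
                        lambda r: r['status'] == 'done',
                        interval_sec=0)
print(final)  # {'status': 'done'}
```

To use this against the real service, fetch would wrap a GET like the one in get_result and return response.json() rather than only printing it.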

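That's all.

2021 is drawing to a close. At this point, various vendors offer solutions for reading forms that are similar in purpose but not fixed in layout (such as invoices from different companies).
In practice, setup tends to be laborious, and my impression is that no solution has yet made me think "this is it!" (This is subjective and only my personal view.)
Still, unlike building a 〇〇emon, I think this is a field that present-day science can realize.
So, what will things look like in the near future?

If you find any errors or problems in this article, I would appreciate it if you let me know. The content is not an official statement from my company; it is written in a personal capacity.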
