
Building an Android OCR (text recognition) app with the Cloud Vision API by the shortest route

Android

Posted at 2018-10-21 (last updated at 2018-10-21)

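Introduction

This is the age of machine learning.
So you want to build an app that uses machine learning, but you have no dataset and do not know the techniques...

Even in that situation, the machine learning APIs that Google provides on Google Cloud Platform make it surprisingly easy to build an app that uses machine learning.

This time we build a text recognition app using the OCR (text recognition) feature of the Google Cloud Vision API.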

Google Cloud Vision API

Official
Trying out the Google Cloud Vision API

Up to 1,000 units can be used for free.
See the official pricing table for details.
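To get a feel for what the free allowance means in practice, here is a minimal sketch of the unit arithmetic. It assumes that one unit corresponds to one feature applied to one image, and it takes the 1,000 free units quoted above at face value; the class and method names are hypothetical, and the official pricing table remains the authority on actual billing.

```java
// Sketch only: estimates how many Vision API units exceed the free allowance.
// Assumption: one unit = one feature (e.g. TEXT_DETECTION) applied to one image.
final class VisionUnitEstimate {
    // Free allowance quoted in this article.
    static final long FREE_UNITS = 1000;

    // Total units consumed when every image is sent with the same number of features.
    static long totalUnits(long images, int featuresPerImage) {
        return images * featuresPerImage;
    }

    // Units beyond the free allowance; never negative.
    static long billableUnits(long images, int featuresPerImage) {
        return Math.max(0, totalUnits(images, featuresPerImage) - FREE_UNITS);
    }

    public static void main(String[] args) {
        System.out.println(billableUnits(800, 1));  // within the free allowance
        System.out.println(billableUnits(1500, 1)); // 500 units over
    }
}
```

Under these assumptions, requesting two features per image consumes the allowance twice as fast, which is one reason to request only TEXT_DETECTION for an OCR app.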

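Issuing an API key

First, issue an API key from the Google Developers Console.
You need the following:

Google account
Credit card (you will not actually be charged)

I issued my API key by following the site below.
Summary of how to use the Cloud Vision API (Cloud Vision APIの使い方まとめ)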

Importing the official sample

There is an official sample Android app that uses the Cloud Vision API, so we build on that code.

First, run git clone https://github.com/GoogleCloudPlatform/cloud-vision.git to fetch the sample code.
Then assign the API key you issued to the constant CLOUD_VISION_API_KEY in MainActivity.
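The sample's client library attaches the key to each request for you, but it can help to see where the key ends up. At the REST level, the Vision API accepts an API key as the `key` query parameter of the `images:annotate` endpoint. The following is a minimal sketch of that URL construction in plain Java; `VisionEndpointSketch` and `annotateUrl` are hypothetical names used only for illustration and are not part of the sample.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: shows how an API key is passed to the Vision REST endpoint.
final class VisionEndpointSketch {
    static final String ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate";

    // Builds the request URL, rejecting a missing or blank key early.
    static String annotateUrl(String apiKey) {
        if (apiKey == null || apiKey.trim().isEmpty()) {
            throw new IllegalArgumentException("API key is missing");
        }
        // URL-encode the key so unexpected characters cannot break the query string.
        return ANNOTATE_URL + "?key=" + URLEncoder.encode(apiKey.trim(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "YOUR_API_KEY" is a placeholder, not a real key.
        System.out.println(annotateUrl("YOUR_API_KEY"));
    }
}
```

Because the key travels with every request, it is worth keeping it out of public repositories when you push your modified sample.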

Changing the Feature Type

The official sample performs LABEL_DETECTION (object classification),
so rewrite it to perform TEXT_DETECTION (text recognition), consulting the reference as you go.

before

MainActivity.java

            annotateImageRequest.setFeatures(new ArrayList<Feature>() {{
                Feature labelDetection = new Feature();
                labelDetection.setType("LABEL_DETECTION");
                labelDetection.setMaxResults(MAX_LABEL_RESULTS);
                add(labelDetection);
            }});

after (the value of MAX_LABEL_RESULTS was also changed from 10 to 1)

MainActivity.java

            annotateImageRequest.setFeatures(new ArrayList<Feature>() {{
                Feature textDetection = new Feature();
                textDetection.setType("TEXT_DETECTION");
                textDetection.setMaxResults(MAX_LABEL_RESULTS);
                add(textDetection);
            }});
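For reference, the change above corresponds to a small difference in the JSON body that reaches the API. The sketch below builds a request body for `images:annotate` by hand, with the image sent as Base64 in `image.content` and a single `TEXT_DETECTION` feature. It is a plain-Java illustration, not the sample's code: the class and method names are hypothetical, and on older Android versions you would likely use `android.util.Base64` instead of `java.util.Base64`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: hand-built JSON body for a TEXT_DETECTION request.
final class TextDetectionRequestSketch {
    // Encodes the image bytes and wraps them in a single-image annotate request.
    static String buildRequestJson(byte[] imageBytes, int maxResults) {
        // Standard Base64 output contains no characters that need JSON escaping.
        String content = Base64.getEncoder().encodeToString(imageBytes);
        return "{\"requests\":[{"
                + "\"image\":{\"content\":\"" + content + "\"},"
                + "\"features\":[{\"type\":\"TEXT_DETECTION\",\"maxResults\":" + maxResults + "}]"
                + "}]}";
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real request would carry JPEG or PNG data.
        byte[] fakeImage = "not a real image".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildRequestJson(fakeImage, 1));
    }
}
```

Switching between label detection and OCR only changes the `type` string, which is why the edit to the sample is so small.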

Changing the output format

Rewrite the code so that it outputs the text recognition result.

before

MainActivity.java

    private static String convertResponseToString(BatchAnnotateImagesResponse response) {
        StringBuilder message = new StringBuilder("I found these things:\n\n");

        List<EntityAnnotation> labels = response.getResponses().get(0).getLabelAnnotations();
        if (labels != null) {
            for (EntityAnnotation label : labels) {
                message.append(String.format(Locale.US, "%.3f: %s", label.getScore(), label.getDescription()));
                message.append("\n");
            }
        } else {
            message.append("nothing");
        }

        return message.toString();
    }

after

MainActivity.java

    private static String convertResponseToString(BatchAnnotateImagesResponse response) {
        StringBuilder message = new StringBuilder("I found these things:\n\n");

        // TEXT_DETECTION result: the full recognized text of the first response.
        TextAnnotation fullText = response.getResponses().get(0).getFullTextAnnotation();
        if (fullText != null) {
            message.append(fullText.getText());
        } else {
            // No text was detected in the image.
            message.append("nothing");
        }

        return message.toString();
    }
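The method above calls `get(0)` directly, which assumes the response list is never empty. If you want to be more defensive, the handling could look roughly like the sketch below. To keep it runnable without the client library, it uses hypothetical stand-in classes (`FullText`, `ImageResponse`) that only mimic the shape of the real response types; they are not the library's API.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: null-safe extraction of OCR text, using stand-in types.
final class OcrResultSketch {
    // Hypothetical stand-in for the full text annotation of one image.
    static final class FullText {
        final String text;
        FullText(String text) { this.text = text; }
    }

    // Hypothetical stand-in for the per-image response; the field is null when no text was found.
    static final class ImageResponse {
        final FullText fullTextAnnotation;
        ImageResponse(FullText fullTextAnnotation) { this.fullTextAnnotation = fullTextAnnotation; }
    }

    // Returns the recognized text of the first response, or "nothing" if there is none.
    static String describe(List<ImageResponse> responses) {
        if (responses == null || responses.isEmpty()) {
            return "nothing";
        }
        FullText full = responses.get(0).fullTextAnnotation;
        if (full == null || full.text == null || full.text.trim().isEmpty()) {
            return "nothing";
        }
        return full.text.trim();
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList(new ImageResponse(new FullText("Hello\n")))));
        System.out.println(describe(Arrays.asList(new ImageResponse(null))));
    }
}
```

The same checks carry over to the real types: guard the list before indexing it, then guard the annotation before reading its text.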

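Verification

We test the app with this photo.

Result

Everything is recognized perfectly, including the white knockout text "増補改訂版" (expanded and revised edition)!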

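Summary

This article walked through the shortest route to trying OCR with the Google Cloud Vision API.

By using the API, you can easily experience cutting-edge technology even without much machine learning knowledge.
Please give it a try.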
