
Trying out the Firebase ML Kit sample

Posted at 2018-05-17

I'd like to try out Firebase ML Kit, which was announced at Google I/O 2018.

Firebase MLKit
https://firebase.google.com/products/ml-kit/
Documentation
https://firebase.google.com/docs/ml-kit/

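At the moment, Firebase ML Kit offers the following features.

・Text recognition: reading text from images
・Face detection: detecting faces
・Barcode scanning: reading barcodes
・Image labeling: recognizing what is in an image and labeling it
・Landmark recognition: recognizing landmarks
・Custom model inference: recognition using your own TensorFlow-based custom model

What makes these features attractive is that they are
・free
・usable offline, except for Landmark recognition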

First, I'll get the official sample project running.
https://github.com/firebase/quickstart-ios

The environment I actually ran it in is as follows.
・macOS 10.12.6
・Xcode 9.2
・CocoaPods 1.5.2
・Test device: iPhone 6 (iOS 9.2.1)

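To use the project under "mlkit" in the repository above, I did the following.

・Installed the required libraries with CocoaPods
・Added the GoogleService-Info.plist from a Firebase project I had prepared in advance
・Changed the bundle identifier in Info.plist (to run on a real device)
・Applied a development Provisioning Profile (to run on a real device)
・Since I wanted to check whether the sample also works on iOS 8.0 and later,
I made the following change to the camera setup code of the Realtime Processing feature.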

FrameProcessingViewController.swift

func prepareCamera() {
    captureSession.sessionPreset = AVCaptureSession.Preset.medium
+    if #available(iOS 10.0, *) {
        captureDevice = AVCaptureDevice.DiscoverySession(
            deviceTypes: [.builtInWideAngleCamera],
            mediaType: AVMediaType.video, position:
            AVCaptureDevice.Position.back
            ).devices.first
+    } else {
+        // Fallback for iOS versions below 10: get the default video capture device
+        captureDevice = AVCaptureDevice.default(for: .video)
+    }
    beginSession()
  }

AVCaptureDevice.DiscoverySession is an API that is only available on iOS 10 and later,
so on versions below iOS 10 the camera is obtained with AVCaptureDevice.default(for:) instead.

AVCaptureDevice.DiscoverySession
https://developer.apple.com/documentation/avfoundation/avcapturedevice.discoverysession
default(for:)
https://developer.apple.com/documentation/avfoundation/avcapturedevice/1386589-default

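With these changes in place, running the app on a real device behaves as shown in the GIF below.

My impressions after trying it:
・Alphanumeric characters are recognized without problems.
・Kanji, hiragana, and katakana are not yet recognized well.
(Maybe using the Google Cloud Vision-based feature would handle them better?)
・So far ML Kit covers image recognition only; I'm curious whether speech recognition will be added.

That's all.
With ML Kit, an OCR feature limited to alphanumeric characters looks achievable.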
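
As a rough illustration of the "alphanumeric-only OCR" idea, here is a minimal sketch of a post-processing step that keeps only ASCII letters and digits from recognized strings. The function name alphanumericOnly and the sample input are hypothetical; this is not part of ML Kit or the sample project, and it assumes the recognized text has already been collected into an array of strings.

```swift
/// Hypothetical helper (not an ML Kit API): strips everything except
/// ASCII letters and digits from each recognized string, and drops
/// strings that become empty as a result.
func alphanumericOnly(_ recognizedLines: [String]) -> [String] {
    return recognizedLines.compactMap { (line: String) -> String? in
        // Keep only characters that are ASCII and either a letter or a digit.
        let kept = line.filter { $0.isASCII && ($0.isLetter || $0.isNumber) }
        return kept.isEmpty ? nil : kept
    }
}

// Sample input standing in for text returned by a recognizer (assumed data).
let sample = ["ABC-123", "東京タワー", "Xcode 9.2"]
print(alphanumericOnly(sample))  // prints ["ABC123", "Xcode92"]
```

Note that this sketch also removes spaces and punctuation, so it suits things like serial numbers or codes better than full sentences.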
