Building a translation app by combining NICT's powerful translation API with the iOS built-in speech transcription SDK

Posted at 2019-12-24

TexTra API

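This is an API provided by the National Institute of Information and Communications Technology (NICT).
It is tailored to Japanese users and boasts high translation quality.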

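Here is what it looks like when tried out in the web form.
Back-translating the output, it seems to produce far fewer nonsensical results than I used to see with Google Translate.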

Pocketalk is expensive, and I want to be able to use this on my phone! So I built a trial app that handles everything from speech recognition through translation.

Speech recognition

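Speech recognition is built into iOS and available out of the box. ( https://developer.apple.com/documentation/speech )
To use it, you must add Privacy - Microphone Usage Description to Info.plist.
Speech recognition itself also requires Privacy - Speech Recognition Usage Description and user authorization via SFSpeechRecognizer.requestAuthorization.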

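Getting as far as starting to listen looks like this.

let audioSession = AVAudioSession.sharedInstance()
do {
    // Configure the session for recording and activate it
    try audioSession.setCategory(.record, mode: .measurement, options: [])
    try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
} catch {
    // Log the failure instead of silently ignoring it
    print("Failed to configure the audio session: \(error)")
}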

self.speechRecognizer = SFSpeechRecognizer.init(locale: Locale.init(identifier: "ja_jp")) 

self.speechRecognizer.delegate = self
self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

// Receive partial results in real time without waiting for the utterance to end
self.recognitionRequest.shouldReportPartialResults = true

self.recognitionTask = self.speechRecognizer.recognitionTask(with: self.recognitionRequest) { result,error in
    guard let result = result else{
        return
    }
    // Show the transcribed text on screen
    self.rawText.text = result.bestTranscription.formattedString
    if result.isFinal{
        // The utterance has finished
        self.audioEngine.inputNode.removeTap(onBus: 0)
        self.audioEngine.stop()
        
        // MARK: These languages are for testing only; change the values according to the configured languages
        self.callTranslationAPI(text: result.bestTranscription.formattedString, fromLangCode: "ja", toLangCode: "en")
    }
    
}

let recordingFormat = audioEngine.inputNode.outputFormat(forBus: 0)
audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    self.recognitionRequest?.append(buffer)
}

audioEngine.prepare()
try? audioEngine.start()

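You have to tell the recognizer which language will be spoken. That is done by this line:

self.speechRecognizer = SFSpeechRecognizer.init(locale: Locale.init(identifier: "ja_jp"))

Someone has kindly compiled a list of the supported languages, though more may have been added since then.
"Speech recognition with the iOS Speech framework: 58 supported languages!"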
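To check the current list on a device rather than relying on a compiled list, the Speech framework can be asked directly through SFSpeechRecognizer.supportedLocales(). The following is a minimal sketch; `sortedIdentifiers(of:)` is a hypothetical helper introduced only for this example, and the Speech-specific part is guarded so that the snippet also compiles where the framework is unavailable.

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

/// Returns the locale identifiers in alphabetical order.
/// Hypothetical helper, used here only to make the output easier to read.
func sortedIdentifiers(of locales: Set<Locale>) -> [String] {
    return locales.map { $0.identifier }.sorted()
}

#if canImport(Speech)
// supportedLocales() reports what the running OS version supports,
// so the count may differ from any list published earlier.
let identifiers = sortedIdentifiers(of: SFSpeechRecognizer.supportedLocales())
print("\(identifiers.count) locales supported")
identifiers.forEach { print($0) }
#endif
```

Any identifier printed here should be usable as the argument to Locale.init(identifier:) when creating the recognizer.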

The recognized result is delivered inside recognitionTask, so that is where I pass it on to the NICT API.

Calling the TexTra API

The API URL is https://mt-auto-minhon-mlt.ucri.jgn-x.jp/api/mt/generalN_[source language]_[target language].
Authentication uses OAuth 1.0, and the key and secret can be found under user settings on your My Page.
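As a small illustration of the URL pattern, the endpoint can be assembled from the two language codes. This is only a sketch: `textraEndpoint(from:to:)` is a hypothetical helper name, and the trailing slash follows the URL used in the request code in this article.

```swift
import Foundation

/// Builds the TexTra endpoint URL for a language pair such as ("ja", "en").
/// Hypothetical helper; not part of any SDK.
func textraEndpoint(from source: String, to target: String) -> URL? {
    // Accept only ASCII letters, digits and hyphens so that a stray
    // character cannot change the path of the request.
    for code in [source, target] {
        let isValid = !code.isEmpty && code.allSatisfy {
            $0.isASCII && ($0.isLetter || $0.isNumber || $0 == "-")
        }
        guard isValid else { return nil }
    }
    return URL(string: "https://mt-auto-minhon-mlt.ucri.jgn-x.jp/api/mt/generalN_\(source)_\(target)/")
}

// Usage: Japanese to English
if let url = textraEndpoint(from: "ja", to: "en") {
    print(url.absoluteString)
}
```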

Generating the OAuth 1.0 signature by hand looked like a lot of work, so I used a library that takes care of all of that.
OAuthSwift/OAuthSwift
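For a sense of what the library is saving you from, here is a rough sketch of the first half of OAuth 1.0 signing as described in RFC 5849: building the signature base string and the signing key. It is not how OAuthSwift is implemented internally, the function names are made up for this example, and the final HMAC-SHA1 step is left out. A real request would also include the oauth_* parameters (consumer key, nonce, timestamp, signature method, version) in `parameters`.

```swift
import Foundation

// RFC 3986 unreserved characters; everything else gets percent-encoded.
let oauthUnreserved = CharacterSet(charactersIn:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")

/// Percent-encodes a string the way OAuth 1.0 expects.
func oauthEncode(_ string: String) -> String {
    return string.addingPercentEncoding(withAllowedCharacters: oauthUnreserved) ?? string
}

/// Builds "METHOD&encoded-url&encoded-sorted-parameters".
func signatureBaseString(method: String, url: String, parameters: [String: String]) -> String {
    let normalized = parameters
        .map { (oauthEncode($0.key), oauthEncode($0.value)) }
        // Sort by encoded name, then by encoded value
        .sorted { $0.0 == $1.0 ? $0.1 < $1.1 : $0.0 < $1.0 }
        .map { "\($0.0)=\($0.1)" }
        .joined(separator: "&")
    return [method.uppercased(), oauthEncode(url), oauthEncode(normalized)]
        .joined(separator: "&")
}

/// The HMAC key is "consumerSecret&tokenSecret"; the token secret is
/// empty when only a consumer key and secret are used.
func signingKey(consumerSecret: String, tokenSecret: String = "") -> String {
    return "\(oauthEncode(consumerSecret))&\(oauthEncode(tokenSecret))"
}
```

The signature itself would then be the Base64-encoded HMAC-SHA1 of the base string using that key, which is exactly the sort of detail that is easier to leave to a library.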

func callTranslationAPI(text:String,fromLangCode:String,toLangCode:String){
        var params = OAuthSwift.Parameters()
        params["key"] = config.value(forKey: "NICT_TRANSLATION_API_KEY")!
        params["name"] = config.value(forKey: "NICT_TRANSLATION_API_NAME")!
        params["type"] = "json"
        params["text"] = text

        let oauthClient = OAuth1Swift.init(consumerKey: config.value(forKey: "NICT_TRANSLATION_API_KEY")!, consumerSecret: config.value(forKey: "NICT_TRANSLATION_API_SECRET")!)
        oauthClient.client.post("https://mt-auto-minhon-mlt.ucri.jgn-x.jp/api/mt/generalN_\(fromLangCode)_\(toLangCode)/", parameters: params, completionHandler: {result in
            switch result{
            case .success(let response):
                if let responseJsonString = response.string{
                    let responseJsonDict = JSON.init(parseJSON: responseJsonString)
                    print(responseJsonDict)
                    
                    self.translatedText.text = responseJsonDict["resultset"]["result"]["text"].stringValue // the translation result
                }
            default:
                // MARK: Does the NICT API return 200 even on failure? If so, this branch may never be reached
                let alert = UIAlertController(title: "Error", message: "翻訳に失敗しました", preferredStyle: .alert)
                alert.addAction(UIAlertAction(title: "OK", style: .default, handler: nil))
                
                self.present(alert, animated: true, completion: nil)
            }
        })
}

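I usually leave all networking to Alamofire, but for this one... well...
That said, it turned out to be just as easy to use as Alamofire, which I appreciate.

The printed response looks like this.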

{
  "resultset" : {
    "code" : 0,
    "result" : {
      "information" : {
        "text-t" : "The weather tomorrow is cloudy.",
        "sentence" : [
          {
            "text-t" : "The weather tomorrow is cloudy.",
            "text-s" : "明日の天気は曇りです",
            "split" : [
              {
                "text-t" : "The weather tomorrow is cloudy.",
                "process" : {
                  "replace-after" : [

                  ],
                  "preprocess" : [

                  ],
                  "translate" : {
                    "oov" : null,
                    "associates" : [
                      [
                        [

                        ],
                        [

                        ],
                        [

                        ]
                      ]
                    ],
                    "associate" : [
                      [

                      ],
                      [

                      ],
                      [

                      ]
                    ],
                    "reverse" : [

                    ],
                    "text-t" : "The weather tomorrow is cloudy.",
                    "text-s" : "明日の天気は曇りです",
                    "specification" : [

                    ]
                  },
                  "replace-before" : [

                  ],
                  "regex" : [

                  ]
                },
                "text-s" : "明日の天気は曇りです"
              }
            ]
          }
        ],
        "text-s" : "明日の天気は曇りです"
      },
      "text" : "The weather tomorrow is cloudy."
    },
    "request" : {
      "url" : "https:\/\/mt-auto-minhon-mlt.ucri.jgn-x.jp\/api\/mt\/generalN_ja_en\/",
      "text" : "明日の天気は曇りです",
      "data" : "",
      "split" : 0
    },
    "message" : ""
  }
}

The API can automatically split the text into phrases and the like, but if all you want is the translated text, ["resultset"]["result"]["text"] looks like the most useful field.
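As an alternative to JSON subscripting, the same field can be pulled out with Foundation's Codable. This sketch swaps in Codable rather than reproducing the code above, and it rests on two assumptions that the sample response only hints at: that a `code` of 0 means success, and that a failed call reports a non-zero `code` and may omit `result`. The type and function names are hypothetical.

```swift
import Foundation

// Only the fields needed to read the translation are modelled.
struct TexTraResponse: Decodable {
    struct ResultSet: Decodable {
        struct Result: Decodable {
            let text: String
        }
        let code: Int
        let message: String?
        let result: Result?   // assumed to be absent on failure
    }
    let resultset: ResultSet
}

enum TranslationError: Error {
    case api(code: Int, message: String)
}

/// Returns resultset.result.text, or throws if the body signals an error.
func translatedText(from data: Data) throws -> String {
    let response = try JSONDecoder().decode(TexTraResponse.self, from: data)
    let resultset = response.resultset
    // Assumption: non-zero code means failure, even when the HTTP status is 200.
    guard resultset.code == 0, let result = resultset.result else {
        throw TranslationError.api(code: resultset.code, message: resultset.message ?? "")
    }
    return result.text
}
```

Checking `code` in the body is one way to deal with the concern that the API might answer with HTTP 200 even when the translation failed, since the failure branch of the request would then never run.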

I read aloud the text that I entered into the form at the very beginning and had the app take it all the way through translation.

Because the transcription has no punctuation, the result is a little odd, but it is pretty good overall.
It also helpfully ignores the occasional stumble over words.

Summary

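Even at a part-time job there are plenty of occasions when you have to serve foreign customers, so I am really grateful that something like this is free to use. (Not that I am allowed to bring my phone in to begin with.)
I am also wondering whether this could be brought to the Apple Watch.
The source code is posted here, so feel free to use it as a reference.
It is not finished yet, so it may change from time to time.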

The example text was borrowed from here → https://headlines.yahoo.co.jp/hl?a=20191224-00000062-impress-sci
