Speech recognition on a Mac with Julius v4.5

Julius

Posted at 2019-07-13

Purpose

These are my notes from running speech recognition on a Mac with Julius v4.5, which was released in early 2019.

julius-speech/julius: Release 4.5
https://zenodo.org/record/2530396#.XSA_US3AOu4

Preparation

Download and install Julius

$ wget https://zenodo.org/record/2530396/files/julius-speech/julius-v4.5.zip
$ unzip julius-v4.5.zip
$ cd julius-speech-julius-36de469/
$ ./configure
****************************************************************
Julius/Julian libsent library rev.4.5:

- Audio I/O
    primary mic device API   : coreaudio (MacOSX CoreAudio)
    available mic device API :
    supported audio format   : various formats by libsndfile ver.1
    NetAudio support         : no
- Language Modeling
    class N-gram support     : yes
- Libraries
    file decompression by    : zlib library
- Process management
    fork on adinnet input    : no
 
  Note: compilation time flags are now stored in "libsent-config".
        If you link this library, please add output of
        "libsent-config --cflags" to CFLAGS and
        "libsent-config --libs" to LIBS.
****************************************************************
$ make
$ sudo make install
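
After `sudo make install`, it can be worth confirming that the `julius` binary is reachable from the shell before moving on. The following is a minimal sketch in Python; the helper name `find_julius` is hypothetical, and it only checks that an executable with that name is on `PATH`, not which version it is.

```python
import shutil


def find_julius(name="julius"):
    """Return the full path of the named executable on PATH, or None."""
    return shutil.which(name)


path = find_julius()
if path is None:
    # The install prefix may not be on PATH, or the install step failed.
    print("julius was not found on PATH")
else:
    print("julius found at", path)
```

If an older Julius is also installed, the path printed here shows which binary the shell would pick up, which matters for the version-mismatch errors described later.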

Download the dictation kit

$ wget https://ja.osdn.net/projects/julius/downloads/71011/dictation-kit-4.5.zip
$ unzip dictation-kit-4.5.zip
$ cd dictation-kit-4.5

Test

Using the Julius sample configuration, confirm that input speech is recognized.

$ pwd
~/julius/dictation-kit-4.5
$ julius -C main.jconf -C am-gmm.jconf -nostrip
### read waveform input
STAT: AD-in thread created
<input rejected by short input>

Speak a variety of phrases into the microphone.

STAT: AD-in thread created
pass1_best:  こんにちは 。
pass1_best_wordseq: <s> こんにちは+感動詞 </s>
pass1_best_phonemeseq: silB | k o N n i ch i w a | silE
pass1_best_score: -2448.511963
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 17650 generated, 1587 pushed, 236 nodes popped in 97
sentence1:  こんにちは 。
wseq1: <s> こんにちは+感動詞 </s>
phseq1: silB | k o N n i ch i w a | silE
cmscore1: 0.741 0.687 1.000
score1: -2472.910400

pass1_best:  おはよう ござい ます 。
pass1_best_wordseq: <s> おはよう+感動詞 ござい+動詞 ます+助動詞 </s>
pass1_best_phonemeseq: silB | o h a y o: | g o z a i | m a s u | silE
pass1_best_score: -3234.950684
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 6666 generated, 1219 pushed, 189 nodes popped in 122
sentence1:  おはよう ござい ます 。
wseq1: <s> おはよう+感動詞 ござい+動詞 ます+助動詞 </s>
phseq1: silB | o h a y o: | g o z a i | m a s u | silE
cmscore1: 0.649 0.793 0.921 0.983 1.000
score1: -3258.356689

pass1_best:  こんばんは 。
pass1_best_wordseq: <s> こんばんは+感動詞 </s>
pass1_best_phonemeseq: silB | k o N b a N w a | silE
pass1_best_score: -2248.228760
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 8960 generated, 1281 pushed, 193 nodes popped in 90
sentence1:  こんばんは 。
wseq1: <s> こんばんは+感動詞 </s>
phseq1: silB | k o N b a N w a | silE
cmscore1: 0.716 0.464 1.000
score1: -2270.507080

pass1_best:  右 。  
pass1_best_wordseq: <s> 右+名詞 </s>
pass1_best_phonemeseq: silB | m i g i | silE
pass1_best_score: -2155.382812
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 12732 generated, 1725 pushed, 255 nodes popped in 84
sentence1:  右 。
wseq1: <s> 右+名詞 </s>
phseq1: silB | m i g i | silE
cmscore1: 0.694 0.149 1.000
score1: -2181.414307

pass1_best:  左 。  
pass1_best_wordseq: <s> 左+名詞 </s>
pass1_best_phonemeseq: silB | h i d a r i | silE
pass1_best_score: -2275.306885
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 8577 generated, 1332 pushed, 218 nodes popped in 90
sentence1:  左 。
wseq1: <s> 左+名詞 </s>
phseq1: silB | h i d a r i | silE
cmscore1: 0.779 0.612 1.000
score1: -2291.427002

sentence1:  上 。
wseq1: <s> 上+名詞 </s>
phseq1: silB | u e | silE
cmscore1: 0.750 0.082 1.000
score1: -2423.035400

pass1_best:  下 は 。
pass1_best_wordseq: <s> 下+名詞 は+助詞 </s>
pass1_best_phonemeseq: silB | sh i t a | w a | silE
pass1_best_score: -2421.643555
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 9518 generated, 2179 pushed, 233 nodes popped in 103
sentence1:  下 わあ 。
wseq1: <s> 下+名詞 わあ+助詞 </s>
phseq1: silB | sh i t a | w a: | silE
cmscore1: 0.576 0.222 0.012 1.000
score1: -2455.557129

pass1_best:  前 。  
pass1_best_wordseq: <s> 前+名詞 </s>
pass1_best_phonemeseq: silB | m a e | silE
pass1_best_score: -2158.709473
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 7800 generated, 1096 pushed, 198 nodes popped in 90
sentence1:  前 。
wseq1: <s> 前+名詞 </s>
phseq1: silB | m a e | silE
cmscore1: 0.624 0.387 1.000
score1: -2184.526855

pass1_best:  後ろ 。
pass1_best_wordseq: <s> 後ろ+名詞 </s>
pass1_best_phonemeseq: silB | u sh i r o | silE
pass1_best_score: -2560.170898
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 10130 generated, 1400 pushed, 220 nodes popped in 103
sentence1:  後ろ を 。
wseq1: <s> 後ろ+名詞 を+助詞 </s>
phseq1: silB | u sh i r o | o | silE
cmscore1: 0.573 0.417 0.055 1.000
score1: -2579.712646

pass1_best:  ありがとう 。
pass1_best_wordseq: <s> ありがとう+感動詞 </s>
pass1_best_phonemeseq: silB | a r i g a t o: | silE
pass1_best_score: -2720.370605
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 35224 generated, 2071 pushed, 314 nodes popped in 109
sentence1:  ありがとう 。
wseq1: <s> ありがとう+感動詞 </s>
phseq1: silB | a r i g a t o: | silE
cmscore1: 0.457 0.349 1.000
score1: -2741.026123

<<< please speak >>>
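
The output above repeats the same fields for each utterance: `sentence1`, `wseq1`, `cmscore1` (one confidence value per token in `wseq1`) and `score1`. If the output is captured as text, those fields can be picked out with a few lines of Python. This is a minimal sketch, not part of Julius; the function names `parse_julius_results` and `low_confidence_words` and the 0.5 threshold are assumptions made for illustration.

```python
def parse_julius_results(text):
    """Return one dict per second-pass result found in Julius output text."""
    results = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("sentence1:"):
            # A new second-pass result starts here.
            current = {"sentence": line.split(":", 1)[1].strip()}
        elif line.startswith("wseq1:") and current:
            current["words"] = line.split(":", 1)[1].split()
        elif line.startswith("cmscore1:") and current:
            current["scores"] = [float(v) for v in line.split(":", 1)[1].split()]
        elif line.startswith("score1:") and current:
            # score1 is the last field of a result block in the log above.
            current["score"] = float(line.split(":", 1)[1])
            results.append(current)
            current = {}
    return results


def low_confidence_words(result, threshold=0.5):
    """Pair each token with its confidence and keep those below threshold."""
    pairs = zip(result.get("words", []), result.get("scores", []))
    return [(word, score) for word, score in pairs if score < threshold]


# One result block copied from the log above.
sample = """\
sentence1:  右 。
wseq1: <s> 右+名詞 </s>
phseq1: silB | m i g i | silE
cmscore1: 0.694 0.149 1.000
score1: -2181.414307
"""

for result in parse_julius_results(sample):
    print(result["sentence"], low_confidence_words(result))
```

In the log, short words such as 右 and 上 have noticeably low confidence values even though the recognized text is correct, so any threshold would need tuning against real recordings.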

Errors

$ julius -C main.jconf -C am-dnn.jconf -dnnconf julius.dnnconf -nostrip
STAT: include config: main.jconf
ERROR: m_options: wrong argument: "-fvad"
Try `-help' for more information.

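--> Probably a version mismatch between Julius and the dictation kit: libfvad support, and with it the -fvad option, is only available from Julius v4.5, so an older Julius such as v4.2 rejects the option.
Workaround: comment out -fvad in main.jconf, or use matching versions of Julius and the dictation kit.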

$ julius -C main.jconf -C am-dnn.jconf -dnnconf julius.dnnconf -nostrip
STAT: include config: main.jconf
STAT: include config: am-dnn.jconf
ERROR: dnn_config_file_parse: unknown spec: num_threads 2
ERROR: m_options: failed to read julius.dnnconf
Try `-help' for more information.

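--> Probably a version mismatch between Julius and the dictation kit: num_threads is not recognized (this happens with Julius v4.2 and dictation-kit v4.5).
Workaround: comment out num_threads 2 in julius.dnnconf, or use matching versions of Julius and the dictation kit.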
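
Both config-file workarounds above amount to disabling one line in a text file. As a minimal sketch, the edit could be scripted in Python as below. The function name `comment_out_option` is hypothetical, the sample fragments are illustrative rather than the actual contents of the kit's files, and the sketch assumes that `#` starts a comment in both files and that the option and its argument sit on the same line.

```python
def comment_out_option(conf_text, option):
    """Return conf_text with every active line for `option` prefixed by '#'."""
    lines = []
    for line in conf_text.splitlines(keepends=True):
        tokens = line.split()
        # Only touch lines whose first token is exactly the option name;
        # lines that are already commented start with '#' and are left alone.
        if tokens and tokens[0] == option:
            line = "#" + line
        lines.append(line)
    return "".join(lines)


# Illustrative fragments, NOT the real contents of main.jconf / julius.dnnconf.
jconf = "-fvad 3\n-input mic\n"
dnnconf = "hidden_layers 7\nnum_threads 2\n"

print(comment_out_option(jconf, "-fvad"))
print(comment_out_option(dnnconf, "num_threads"))
```

To apply it to a real file, read the file's text, pass it through the function, and write the result back after keeping a backup copy. Matching the Julius and dictation-kit versions remains the cleaner fix, since commenting the lines out also turns off the features they configure.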

Error: adin_darwin: cannot set audio converter quality
ERROR: m_adin: failed to ready input device

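--> This error occurs when the built-in microphone is not enabled.
Workaround: open System Preferences -> Sound -> Input and confirm that the internal microphone is selected as the input device.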

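References

Julius
Zenodo: julius-speech/julius: Release 4.5
Terminal (macOS): Speech recognition with Julius
Trying out Julius, a completely free speech recognition engine, on a Mac
Julius / speech recognition: a personal summary for absolute beginners
Julius, free offline speech recognition: build your own voice-controlled robot
Making a Raspberry Pi understand the human voice
Speech recognition with Raspberry Pi + Julius (March 2019)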
