
I tried to build a Zundamon cognitive shuffle sleep app and it was harder than expected

Last updated at 2024-03-18. Posted at 2024-03-17.

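I often listen to videos in which Zundamon reads words aloud for cognitive shuffle sleep, so I tried building an app that reads words aloud with voicevox. I'm new to app development, though, and I stumbled at several points, so it was harder than I expected.
I'd like to record the points where I got stuck. They are in no particular order.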

What made it work

Add the voicevox_core AAR file to the app project's dependencies

Only the preview releases appear to ship a prebuilt aar.
Download and extract model-0.15.0-preview.16.zip from
https://github.com/VOICEVOX/voicevox_core/releases/tag/0.15.0-preview.16 (the latest preview build as of 2024/02),
then add the aar file as a dependency by following the steps at
https://developer.android.com/studio/projects/android-library#psd-add-aar-jar-dependency
About 80% of my stumbling came from trying to build core myself, so once the aar loads you have almost succeeded.
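
As a rough sketch of what the result of those steps can look like when written by hand in the Gradle Kotlin DSL: the file name and the libs/ location below are assumptions for illustration, not taken from the release archive, so substitute the actual aar name.

```kotlin
// app/build.gradle.kts
dependencies {
    // Hypothetical path and file name: point this at the aar you extracted
    implementation(files("libs/voicevox_core.aar"))
}
```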

Add onnxruntime as a dependency

implementation("com.microsoft.onnxruntime:onnxruntime-android:1.14.0")

Add the vvm and dictionary files to assets

Zundamon's whisper voice is 5.vvm, style: 22.

Copy the vvm and dictionary files out of assets and call tts

val text = "こんにちは"
model = VoiceModel(file.absolutePath)
jtalk = OpenJtalk(path.absolutePath)
synthesizer = Synthesizer.builder(jtalk).build()
synthesizer.loadVoiceModel(model)
val data = synthesizer.tts(text, 22).execute() // ByteArray containing the wav data

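Play it with AudioTrack

// For words of 3 to 5 characters, the audio data generated by voicevox was about 40 KB.
// A buffer 2 to 3 times the size of the data is said to keep playback from cutting out.
val bufSize = 40000 * 3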
audio = AudioTrack.Builder()
    .setAudioAttributes(
        AudioAttributes.Builder()
            .setUsage(AudioAttributes.USAGE_VOICE_COMMUNICATION)
            .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
            .build()
    )
    .setAudioFormat(
        AudioFormat.Builder()
            .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
            .setSampleRate(24000)
            .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
            .build()
    )
    .setBufferSizeInBytes(bufSize)
    .setTransferMode(AudioTrack.MODE_STATIC)
    .build()
// Without filling the buffer with zeros first, the audio sounded choppy
audio.write(ByteArray(bufSize), 0, bufSize)
// The data starts with header information, so skip the first 46 bytes
audio.write(data, 46, data.count() - 46)
audio.play()
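
The length of a wav header is not fixed by the format: a RIFF/WAVE file is a sequence of chunks, each with a 4-byte id and a 4-byte little-endian size, and the samples live in the chunk whose id is "data". As an alternative to a hard-coded offset, the following is a minimal sketch that walks the chunks to locate the payload. The function name findWavData is hypothetical, and I have not checked it against voicevox output specifically.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

/** Returns (offset, length) of the PCM payload in a RIFF/WAVE byte array, or null if not found. */
fun findWavData(wav: ByteArray): Pair<Int, Int>? {
    if (wav.size < 12) return null
    if (String(wav, 0, 4, Charsets.US_ASCII) != "RIFF") return null
    if (String(wav, 8, 4, Charsets.US_ASCII) != "WAVE") return null

    val buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN)
    var pos = 12 // first chunk starts right after "RIFF", size, "WAVE"
    while (pos + 8 <= wav.size) {
        val id = String(wav, pos, 4, Charsets.US_ASCII)
        val size = buf.getInt(pos + 4)
        val start = pos + 8
        if (size < 0) return null
        // Clamp the length in case the declared size exceeds the actual data
        if (id == "data") return start to minOf(size, wav.size - start)
        if (size > wav.size - start) return null
        // Chunks are padded to an even number of bytes
        pos = start + size + (size and 1)
    }
    return null
}
```

With this, the write call could become audio.write(data, offset, length) after destructuring the returned pair, falling back to an error when the function returns null.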

Slow down the reading speed and set the silence time to 0

val query = synthesizer.createAudioQuery(text, 22)
query.speedScale = query.speedScale * 0.65
query.postPhonemeLength = 0.0
query.prePhonemeLength = 0.0
return synthesizer.synthesis(query, STYLE_ID).execute()

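Creating the word list

I used the 日本語を読むための語彙データベース (Vocabulary Database for Reading Japanese) from 松下言語学習ラボ (Matsushita Language Learning Lab).
http://www17408ui.sakura.ne.jp/tatsum/database.html
I kept the entries whose part of speech is noun and that rank near the top, and saved them as a csv file.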
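
The filtering step can be sketched as below. The VocabEntry fields, the "名詞" part-of-speech label, and the function name topNouns are all assumptions for illustration; the real database has its own column layout, which I have not reproduced here.

```kotlin
// Hypothetical row shape: one word, its part-of-speech label, and its frequency rank
data class VocabEntry(val word: String, val partOfSpeech: String, val rank: Int)

/** Keeps only nouns, orders them by rank (1 = most frequent), and returns the first `limit` words. */
fun topNouns(entries: List<VocabEntry>, limit: Int): List<String> =
    entries.asSequence()
        .filter { it.partOfSpeech.startsWith("名詞") } // assumed label for nouns
        .sortedBy { it.rank }
        .take(limit)
        .map { it.word }
        .toList()
```

Writing the result with one word per line gives a single-column csv, which matches how the shuffle class reads only the first column of each row.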

Shuffle the words

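// csvReader() comes from the kotlin-csv library
class ShuffleWord(context: Context) {
    private val wordList = mutableListOf<String>()
    private var shuffled: List<String>
    private var current = 0

    init {
        // Read the first column of every csv row into the word list
        context.assets.open("word_list/list.csv").use { stream ->
            csvReader().open(stream) {
                readAllAsSequence().forEach { row: List<String> ->
                    wordList.add(row[0])
                }
            }
        }
        shuffled = wordList.shuffled()
    }

    fun next(): String {
        // Reshuffle once every word in the current order has been returned
        if (current >= shuffled.size) {
            current = 0
            shuffled = wordList.shuffled()
        }
        // Return the current word, then advance, so that shuffled[0] is not skipped
        return shuffled[current++]
    }
}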

↓
It worked!

Errors I ran into along the way

I couldn't find the dictionary file

It can be downloaded from here:
https://open-jtalk.sourceforge.net/

The vvm file wouldn't load

val path = Uri.parse("file:///android_asset/model/5.vvm")
model = VoiceModel(path.path)

For core to load them, the vvm and dictionary files must be passed as absolute paths, but I couldn't work out the absolute path of the asset directory.
The error I got:

FATAL EXCEPTION: DefaultDispatcher-worker-2
Process: com.example.shufflevoiceapp, PID: 30560
jp.hiroshiba.voicevoxcore.exceptions.OpenZipFileException: `/android_asset/model/5.vvm`の読み込みに失敗しました: ZIPファイルとして開くことができませんでした
    at jp.hiroshiba.voicevoxcore.VoiceModel.rsFromPath(NativeMethod)
    at jp.hiroshiba.voicevoxcore.VoiceModel.<init>(VoiceModel.java:19)
Caused by: java.lang.RuntimeException: an upstream reader returned an error: No such file or directory (os error 2)
                           ... 13 more
Caused by: java.lang.RuntimeException: No such file or directory (os error 2)
                           ... 13 more

Solution
I copied the vvm and dictionary files from the asset directory to the app's data folder.

val file = File(context.filesDir, "voice.vvm")
val assets: AssetManager = context.assets
// Copy the asset into internal storage so that it has a real filesystem path.
// Both streams are closed by use, even if the copy throws.
assets.open("model/5.vvm").use { input ->
    file.outputStream().use { output ->
        input.copyTo(output)
    }
}
model = VoiceModel(file.absolutePath)
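
The snippet above copies a single file. If the dictionary is stored in assets as a directory containing several files (an assumption about the layout), the copy has to recurse. The following is a minimal sketch with a hypothetical copyTree function. It takes the listing and opening operations as lambdas so that it can be exercised without Android; on a device they would be { context.assets.list(it) } and { context.assets.open(it) }.

```kotlin
import java.io.File
import java.io.InputStream

/**
 * Recursively copies `path` into `dest`.
 * `list` returns the child names of a directory, and an empty array (or null) for a file.
 * `open` returns a stream for a file path.
 */
fun copyTree(
    path: String,
    dest: File,
    list: (String) -> Array<String>?,
    open: (String) -> InputStream,
) {
    val children = list(path)
    if (children.isNullOrEmpty()) {
        // No children: treat it as a file (an empty directory would fail in open)
        dest.parentFile?.mkdirs()
        open(path).use { input ->
            dest.outputStream().use { output -> input.copyTo(output) }
        }
    } else {
        dest.mkdirs()
        for (child in children) {
            copyTree("$path/$child", File(dest, child), list, open)
        }
    }
}
```

The absolute path of the destination directory can then be passed to OpenJtalk in the same way the copied vvm path is passed to VoiceModel.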

Where are the non-sample vvm files?

https://github.com/VOICEVOX/voicevox_fat_resource/tree/main/core/model
They were here.
Alternatively, download them with the downloader.

sample.vvm loads, but the character vvm files don't

At first, without really understanding what I was doing, I built core by following this README, but Zundamon's vvm file can only be used with the prebuilt core.
https://github.com/VOICEVOX/voicevox_core/tree/0848630d81ae3e917c6ff2038f0b15bbd4270702/crates/voicevox_core_java_api

Error

FATAL EXCEPTION: main
Caused by: java.lang.reflect.InvocationTargetException
Caused by: InvalidModelDataException: `/data/user/0/com.example.shufflevoiceapp/files/voice.vvm`の読み込みに失敗しました: モデルデータを読むことができませんでした
Caused by: java.lang.RuntimeException: Failed to create session: Error calling ONNX Runtime C function: Failed to load model because protobuf parsing failed.

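Solution
There was no need to build in the first place. Use the prebuilt aar.
https://github.com/VOICEVOX/voicevox_core/issues/715#issuecomment-1876688539
According to that comment, a core you build yourself cannot load the provided vvm files.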
