Introduction
On 2024/11/19, a new speech synthesis application called AivisSpeech was released.
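🎉 Released today 🎉
[Completely free! A new era of AI speech synthesis is here]
"AivisSpeech", a cutting-edge speech synthesis application that makes it easy to create emotionally expressive voices, is now available!
💠 Free, overwhelmingly high-quality speech synthesis!
💠 Runs comfortably on your own PC!
💠 You can use speech synthesis models you built yourself!
Try it now 👉 https://t.co/q0jONJUc2J pic.twitter.com/f8nLiP3Xwj
Aivis Project (@aivis_project), November 19, 2024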
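AivisSpeech is built on VOICEVOX, an open-source speech synthesis application,
and extends it with features such as the ability to use your own speech synthesis models, as mentioned in the post above.
AivisSpeech itself is distributed as a GUI application,
but the AivisSpeech Engine provides an HTTP API.
The source code is published under the LGPL-3.0 license.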
https://github.com/Aivis-Project/AivisSpeech-Engine
The AivisSpeech Engine API endpoints are VOICEVOX-compatible with a few exceptions,
so the engine can be used alongside VOICEVOX in a multi-engine setup.
I tried speech synthesis through the AivisSpeech Engine HTTP API,
and this article walks through the whole process.
Prerequisites
For this article I set up the Docker image for NVIDIA GPUs on WSL2.
Environment
- Host OS: Windows 11
- WSL2: Ubuntu 22.04.4 LTS
- Docker Desktop: 4.36.0 (175267)
- AivisSpeech Engine: latest (1.0.0)
Host machine specs
- CPU: Intel Core i7-10700 @ 2.90GHz
- GPU: NVIDIA GeForce RTX 3070
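Running the engine
Preparation: installing the CUDA Toolkit
To use an NVIDIA GPU on WSL2, you need to install the CUDA Toolkit.
You can install it by running, inside WSL, the series of commands listed at the following URL.
Pulling the Docker image and starting the container
The following commands pull the AivisSpeech Engine Docker image and start the container.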
$ docker pull ghcr.io/aivis-project/aivisspeech-engine:nvidia-latest
$ docker run --rm --gpus all -p '10101:10101' \
-v ~/.local/share/AivisSpeech-Engine:/home/user/.local/share/AivisSpeech-Engine-Dev \
ghcr.io/aivis-project/aivisspeech-engine:nvidia-latest
On a successful start, log output like the following is printed.
$ docker run --rm --gpus all -p '10101:10101' -v ~/.local/share/AivisSpeech-Engine:/home/user/.local/share/AivisSpeech-Engine-Dev ghcr.io/aivis-project/aivisspeech-engine:nvidia-latest
+ exec gosu user /opt/python/bin/poetry run python ./run.py --use_gpu --host 0.0.0.0
[2024/12/04 05:20:59] INFO: AivisSpeech Engine version latest
[2024/12/04 05:20:59] INFO: Engine root directory: /opt/aivisspeech-engine
[2024/12/04 05:20:59] INFO: User data directory: /home/user/.local/share/AivisSpeech-Engine-Dev
[2024/12/04 05:20:59] INFO: Models directory: /home/user/.local/share/AivisSpeech-Engine-Dev/Models
[2024/12/04 05:21:00] INFO: Installed AIVM models:
[2024/12/04 05:21:00] INFO: - Anneli (a59cb814-0083-4369-8542-f51a29e72af7)
[2024/12/04 05:21:00] INFO: Using GPU (NVIDIA CUDA) for inference.
[2024/12/04 05:21:00] INFO: Loading BERT model and tokenizer...
[2024/12/04 05:21:05] INFO: BERT model and tokenizer loaded. (5.47s)
[2024/12/04 05:21:05] INFO: Compiled user dictionary applied.
[2024/12/04 05:21:06] INFO: Started server process [1]
[2024/12/04 05:21:06] INFO: Waiting for application startup.
[2024/12/04 05:21:06] INFO: Application startup complete.
[2024/12/04 05:21:06] INFO: Uvicorn running on http://0.0.0.0:10101 (Press CTRL+C to quit)
reading /home/user/.local/share/AivisSpeech-Engine-Dev/user.dict_csv-d1d730f8-aa02-4a62-9529-4876c5fd85b6.tmp ... 900516
emitting double-array: 100% |###########################################|
[2024/12/04 05:21:09] INFO: User dictionary updated. (4.14s)
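In the log above, several seconds pass between the container starting and the "Uvicorn running" line, so a script that calls the engine right after docker run may need to wait. The following is a minimal sketch of such a wait loop, assuming the engine is reachable at http://127.0.0.1:10101 and that a successful GET /speakers is a good enough readiness signal. The function names engine_responds and wait_until_ready are made up for this illustration and are not part of AivisSpeech.

```python
import time
import urllib.error
import urllib.request

# Assumption: the port published with `docker run -p '10101:10101'`.
ENGINE_URL = "http://127.0.0.1:10101"


def engine_responds(base_url: str = ENGINE_URL) -> bool:
    """Return True if GET /speakers answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/speakers", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused/reset while the server is still starting up.
        return False


def wait_until_ready(probe=engine_responds, timeout=120.0, interval=2.0,
                     sleep=time.sleep) -> bool:
    """Call probe() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling wait_until_ready() with no arguments polls the local engine for up to two minutes. The probe and sleep parameters are injectable only so that the loop can be exercised without a running engine.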
Troubleshooting
In my environment, the following error occurred the first time I ran docker run.
$ docker run --rm --gpus all -p '10101:10101' -v ~/.local/share/AivisSpeech-Engine:/home/user/.local/share/AivisSpeech-Engine-Dev ghcr.io/aivis-project/aivisspeech-engine:nvidia-latest
+ exec gosu user /opt/python/bin/poetry run python ./run.py --use_gpu --host 0.0.0.0
Traceback (most recent call last):
File "/opt/aivisspeech-engine/./run.py", line 27, in <module>
from voicevox_engine.aivm_manager import AivmManager
File "/opt/aivisspeech-engine/voicevox_engine/aivm_manager.py", line 20, in <module>
from voicevox_engine.logging import logger
File "/opt/aivisspeech-engine/voicevox_engine/logging.py", line 14, in <module>
ENGINE_LOG_DIR.mkdir(parents=True, exist_ok=True)
File "/opt/python/lib/python3.11/pathlib.py", line 1116, in mkdir
os.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/home/user/.local/share/AivisSpeech-Engine-Dev/Logs'
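The -v option of docker run bind-mounts the host directory ~/.local/share/AivisSpeech-Engine into the container.
That directory was created on the host automatically when the command ran, and it ended up owned by root,
so the Docker container could not write to it. This appears to have been the cause.
I fixed it by recreating the host-side mount directory as a regular user.
(Changing the ownership directly should work as well.)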
$ ls -ld ~/.local/share/AivisSpeech-Engine
drwxr-xr-x 2 root root 4096 Dec 4 14:35 /home/aqua/.local/share/AivisSpeech-Engine
$ rm -rf ~/.local/share/AivisSpeech-Engine
$ mkdir ~/.local/share/AivisSpeech-Engine
$ ls -ld ~/.local/share/AivisSpeech-Engine
drwxr-xr-x 2 aqua aqua 4096 Dec 4 12:57 /home/aqua/.local/share/AivisSpeech-Engine
# aqua is the user name on the host machine
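One way to avoid this problem up front is to create the host directory as your own user before the first docker run, so that Docker does not create it as root. Below is a minimal sketch of that idea. The helper name prepare_mount_dir is hypothetical, and the check only covers the host user; it assumes, as appeared to be the case in the environment above, that a directory writable by the host user is also writable from inside the container.

```python
import os
from pathlib import Path


def prepare_mount_dir(path: Path) -> Path:
    """Create the bind-mount directory as the current user if it is missing.

    Raises PermissionError when the directory exists but is not writable,
    which is the state a root-owned directory created by Docker ends up in.
    """
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        owner_uid = path.stat().st_uid
        raise PermissionError(
            f"{path} is owned by uid {owner_uid} and is not writable"
        )
    return path
```

For the setup in this article the call would be prepare_mount_dir(Path.home() / ".local/share/AivisSpeech-Engine"), run once before starting the container.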
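Using the API
Checking the API documentation
After starting the engine, open http://127.0.0.1:10101/docs in a browser to view the API documentation in Swagger UI.
Listing the available speakers
Sending a GET request to /speakers returns the list of available speakers, as shown below.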
curl http://127.0.0.1:10101/speakers
[
{
"name": "Anneli",
"speaker_uuid": "e756b8e4-b606-4e15-99b1-3f9c6a1b2317",
"styles": [
{
"name": "ノーマル",
"id": 888753760,
"type": "talk"
},
{
"name": "通常",
"id": 888753761,
"type": "talk"
},
{
"name": "テンション高め",
"id": 888753762,
"type": "talk"
},
{
"name": "落ち着き",
"id": 888753763,
"type": "talk"
},
{
"name": "上機嫌",
"id": 888753764,
"type": "talk"
},
{
"name": "怒り・悲しみ",
"id": 888753765,
"type": "talk"
}
],
"version": "1.0.0",
"supported_features": {
"permitted_synthesis_morphing": "NOTHING"
}
}
]
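In VOICEVOX, speaker IDs were assigned starting from 0,
but AivisSpeech Engine appears to assign irregular IDs. (The value passed as the speaker parameter in the requests below is one of the style id values from this list.)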
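Since the IDs cannot be guessed, it can be convenient to look them up by name from the /speakers response instead of hard-coding them. The following is a minimal sketch under that assumption; fetch_speakers and find_style_id are illustrative names, and the code relies only on the name, styles and id fields visible in the response above.

```python
import json
import urllib.request

# Assumption: the engine started above, listening on the published port.
ENGINE_URL = "http://127.0.0.1:10101"


def fetch_speakers(base_url: str = ENGINE_URL) -> list:
    """GET /speakers and return the decoded JSON list."""
    with urllib.request.urlopen(f"{base_url}/speakers") as resp:
        return json.load(resp)


def find_style_id(speakers: list, speaker_name: str, style_name: str) -> int:
    """Return the id of the named style of the named speaker."""
    for speaker in speakers:
        if speaker["name"] != speaker_name:
            continue
        for style in speaker["styles"]:
            if style["name"] == style_name:
                return style["id"]
    raise KeyError(f"{speaker_name} / {style_name} not found")
```

With the response shown above, find_style_id(fetch_speakers(), "Anneli", "ノーマル") returns 888753760.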
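Requesting speech synthesis
Speech synthesis is done in two steps: generate a query with /audio_query, then POST that query to /synthesis.
The example below synthesizes the text こんにちは ("hello") with the ノーマル style of the speaker Anneli.
Generating the query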
curl -X 'POST' \
'http://127.0.0.1:10101/audio_query?text=%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF&speaker=888753760' \
-H 'accept: application/json' \
-d ''
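The text parameter in the URL above is こんにちは percent-encoded as UTF-8. As a small sketch of how that URL can be produced programmatically, the helper below (audio_query_url, a name chosen for this example) uses urlencode from the Python standard library.

```python
from urllib.parse import urlencode

# Assumption: the engine started above, listening on the published port.
ENGINE_URL = "http://127.0.0.1:10101"


def audio_query_url(text: str, style_id: int, base_url: str = ENGINE_URL) -> str:
    """Build the /audio_query URL with the text percent-encoded as UTF-8."""
    return f"{base_url}/audio_query?" + urlencode({"text": text, "speaker": style_id})


# Prints the same URL that is passed to curl above.
print(audio_query_url("こんにちは", 888753760))
```

Note that urlencode encodes spaces as +, so URLs for texts containing spaces will look slightly different from ones produced with %20.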
Generated query
{
"accent_phrases": [
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "チ",
"consonant": "ch",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1,
"intonationScale": 1,
"tempoDynamicsScale": 1,
"pitchScale": 0,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"pauseLength": null,
"pauseLengthScale": 1,
"outputSamplingRate": 44100,
"outputStereo": false,
"kana": "こんにちは"
}
Changing the parameters in the query lets you adjust things such as intonation.
(I ran it without changes this time. See the API documentation for details on each parameter.)
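As an illustration of such an adjustment, the sketch below copies a query and overrides some of its top-level parameters before it is sent to /synthesis. The helper adjust_query is hypothetical, and the values in the usage note are arbitrary examples rather than recommended settings; valid ranges should be checked in the API documentation.

```python
import copy


def adjust_query(query: dict, **overrides) -> dict:
    """Return a copy of an audio query with top-level parameters replaced.

    Only keys that already exist in the query are accepted, so a typo such
    as `speedscale` raises KeyError instead of being silently ignored.
    """
    adjusted = copy.deepcopy(query)
    for key, value in overrides.items():
        if key not in adjusted:
            raise KeyError(f"unknown query parameter: {key}")
        adjusted[key] = value
    return adjusted
```

For example, adjust_query(query, speedScale=1.2, intonationScale=1.1) returns a new query with those two fields changed and leaves the original untouched.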
Running the synthesis
The synthesis result is saved to a file named hello.wav.
curl -X 'POST' \
'http://127.0.0.1:10101/synthesis?speaker=888753760&enable_interrogative_upspeak=true' \
-H 'accept: audio/wav' \
-H 'Content-Type: application/json' \
-o 'hello.wav' \
-d '{
"accent_phrases": [
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "チ",
"consonant": "ch",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1,
"intonationScale": 1,
"tempoDynamicsScale": 1,
"pitchScale": 0,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"pauseLength": null,
"pauseLengthScale": 1,
"outputSamplingRate": 44100,
"outputStereo": false,
"kana": "こんにちは"
}'
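Pasting the whole query into the curl command by hand gets unwieldy for longer texts. The following is a minimal sketch that chains the two requests in Python using only the standard library. It assumes the same endpoints, query parameters and headers as the curl commands in this article; the function names are made up for this example.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: the engine started above, listening on the published port.
ENGINE_URL = "http://127.0.0.1:10101"


def audio_query_request(text: str, style_id: int, base_url: str = ENGINE_URL):
    """Build the POST /audio_query request (empty body, like `curl -d ''`)."""
    params = urlencode({"text": text, "speaker": style_id})
    return urllib.request.Request(
        f"{base_url}/audio_query?{params}", data=b"", method="POST"
    )


def synthesis_request(query: dict, style_id: int, base_url: str = ENGINE_URL):
    """Build the POST /synthesis request with the query as the JSON body."""
    params = urlencode({"speaker": style_id, "enable_interrogative_upspeak": "true"})
    return urllib.request.Request(
        f"{base_url}/synthesis?{params}",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "audio/wav"},
        method="POST",
    )


def synthesize_to_file(text: str, style_id: int, out_path: str) -> None:
    """Generate a query for `text`, synthesize it, and write the WAV bytes."""
    with urllib.request.urlopen(audio_query_request(text, style_id)) as resp:
        query = json.load(resp)
    with urllib.request.urlopen(synthesis_request(query, style_id)) as resp:
        wav_bytes = resp.read()
    with open(out_path, "wb") as f:
        f.write(wav_bytes)
```

With the engine running, synthesize_to_file("こんにちは", 888753760, "hello.wav") should be equivalent to the two curl commands above.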
Synthesis result
On the server side, information like the following is logged.
[2024/12/04 06:17:00] INFO: 172.17.0.1:34820 - "POST /audio_query?text=%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF&speaker=888753760 HTTP/1.1" 200 OK
[2024/12/04 06:17:20] INFO: Model: Anneli / Version 1.0.0
[2024/12/04 06:17:20] INFO: Speaker: Anneli / Style: ノーマル
[2024/12/04 06:17:20] INFO: Running inference...
[2024/12/04 06:17:20] INFO: Text: こんにちは
[2024/12/04 06:17:20] INFO: Speed: 1.00 (Input: 1.00)
[2024/12/04 06:17:20] INFO: Style Weight: 1.00 (Input: 1.00)
[2024/12/04 06:17:20] INFO: Tempo Dynamics: 0.20 (Input: 1.00)
[2024/12/04 06:17:20] INFO: Pitch: 1.00 (Input: 0.00)
[2024/12/04 06:17:20] INFO: Volume: 1.00
[2024/12/04 06:17:20] INFO: Pre-Silence: 0.10
[2024/12/04 06:17:20] INFO: Post-Silence: 0.10
[2024/12/04 06:17:21] INFO: Inference done. Elapsed time: 0.43 sec.
[2024/12/04 06:17:21] INFO: 172.17.0.1:50148 - "POST /synthesis?speaker=888753760&enable_interrogative_upspeak=true HTTP/1.1" 200 OK
As a result, the following audio was generated.
https://github.com/user-attachments/assets/d181116f-43a2-4aca-9681-e16c07863765
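To confirm that the saved file matches the outputSamplingRate and outputStereo values in the query, the header of hello.wav can be inspected. The sketch below uses the wave module from the Python standard library and assumes the engine writes plain PCM WAV, which is the format that module reads; describe_wav is a name chosen for this example.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the header of a WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_sec": frames / rate,
        }
```

For the query used above, describe_wav("hello.wav") would be expected to report a sample rate of 44100 and a single channel.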
Preloading a model
Sending a POST request to /initialize_speaker loads the model for the specified speaker ahead of time.
Doing this appears to shorten the time taken by the first synthesis request.
curl -X 'POST' \
'http://127.0.0.1:10101/initialize_speaker?speaker=888753760&skip_reinit=false' \
-H 'accept: */*' \
-d ''
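If several styles are going to be used, the same preload request can be issued for each of them at startup. The following is a minimal sketch of that, assuming the endpoint and query parameters shown in the curl command above; initialize_speaker_request and preload are illustrative names.

```python
import urllib.request
from urllib.parse import urlencode

# Assumption: the engine started above, listening on the published port.
ENGINE_URL = "http://127.0.0.1:10101"


def initialize_speaker_request(style_id: int, skip_reinit: bool = False,
                               base_url: str = ENGINE_URL):
    """Build the POST /initialize_speaker request for one style id."""
    params = urlencode({"speaker": style_id, "skip_reinit": str(skip_reinit).lower()})
    return urllib.request.Request(
        f"{base_url}/initialize_speaker?{params}", data=b"", method="POST"
    )


def preload(style_ids) -> None:
    """Send one preload request per style id, in order."""
    for style_id in style_ids:
        with urllib.request.urlopen(initialize_speaker_request(style_id)) as resp:
            resp.read()
```

For example, preload([888753760]) sends the same request as the curl command above.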
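Pronunciation of alphabetic text
VOICEVOX sometimes fails to pronounce English words properly (it reads them out letter by letter),
so getting them read correctly required workarounds such as registering them in the dictionary beforehand or converting them to katakana before synthesis.
AivisSpeech Engine appears to improve on this.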
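Even words that are not in the dictionary on their own, like "Act-One" and "LivePortrait" in this video, get read out nicely in katakana, which is something I paid particular attention to.
It should be especially useful when you want a technical article read aloud in one go...! #AivisSpeech https://t.co/ThyfmfFaX4
Torishima / INTP (@izutorishima), November 19, 2024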
So I had ChatGPT generate the following text and had AivisSpeech Engine read it aloud.
Pythonは、データサイエンスの分野で圧倒的な支持を得ているプログラミング言語です。
この言語の強みは、PandasやNumPyなどのライブラリを活用することで、大量のデータを効率的に処理できる点にあります。
たとえば、数百万行のデータを「DataFrame」として扱い、数行のコードで統計的な分析を行うことが可能です。
また、MatplotlibやSeabornを使えば、視覚的なデータの可視化も容易に実現できます。
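The query that /audio_query returns for a text like this contains one entry per mora, so the reading the engine chose for each English word can be checked by joining the mora texts. Below is a minimal sketch of that, relying only on the accent_phrases, moras, text and pause_mora fields that appear in the queries in this article; reading_from_query is a name made up for this example.

```python
def reading_from_query(query: dict) -> str:
    """Join the mora texts of every accent phrase into one katakana string."""
    parts = []
    for phrase in query["accent_phrases"]:
        parts.extend(mora["text"] for mora in phrase["moras"])
        # pause_mora is null in the queries shown here, but handle it if set.
        pause = phrase.get("pause_mora")
        if pause is not None:
            parts.append(pause["text"])
    return "".join(parts)
```

For the query generated from the text above, the first accent phrase yields パイソンワ followed by a comma, which shows that Python is read as パイソン rather than letter by letter.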
Generated query
{
"accent_phrases": [
{
"moras": [
{
"text": "パ",
"consonant": "p",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ソ",
"consonant": "s",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "サ",
"consonant": "s",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ッ",
"consonant": null,
"consonant_length": null,
"vowel": "cl",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ジ",
"consonant": "j",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 8,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "プ",
"consonant": "p",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ロ",
"consonant": "r",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "グ",
"consonant": "g",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ミ",
"consonant": "m",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "グ",
"consonant": "g",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ゴ",
"consonant": "g",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": ".",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ゴ",
"consonant": "g",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ミ",
"consonant": "m",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "パ",
"consonant": "p",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ダ",
"consonant": "d",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "パ",
"consonant": "p",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ド",
"consonant": "d",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 8,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リョ",
"consonant": "ry",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 6,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 7,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ショ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "マ",
"consonant": "m",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": ".",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "バ",
"consonant": "b",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ヒャ",
"consonant": "hy",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ク",
"consonant": "k",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "マ",
"consonant": "m",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ギョ",
"consonant": "gy",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "'",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 8,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "フ",
"consonant": "f",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "レ",
"consonant": "r",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "'",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ド",
"consonant": "d",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 6,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ケ",
"consonant": "k",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 7,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "セ",
"consonant": "s",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ガ",
"consonant": "g",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 6,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": ".",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "マ",
"consonant": "m",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "マ",
"consonant": "m",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ッ",
"consonant": null,
"consonant_length": null,
"vowel": "cl",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "プ",
"consonant": "p",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ロ",
"consonant": "r",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 9,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ボ",
"consonant": "b",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "バ",
"consonant": "b",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": ",",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ク",
"consonant": "k",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 7,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "モ",
"consonant": "m",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0,
"pitch": 0
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ジ",
"consonant": "j",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0,
"pitch": 0
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0,
"vowel": "i",
"vowel_length": 0,
"pitch": 0
},
{
"text": "マ",
"consonant": "m",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": ".",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 7,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1,
"intonationScale": 1,
"tempoDynamicsScale": 1,
"pitchScale": 0,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"pauseLength": null,
"pauseLengthScale": 1,
"outputSamplingRate": 44100,
"outputStereo": false,
"kana": "Pythonは、データサイエンスの分野で圧倒的な支持を得ているプログラミング言語です。この言語の強みは、PandasやNumPyなどのライブラリを活用することで、大量のデータを効率的に処理できる点にあります。たとえば、数百万行のデータを「DataFrame」として扱い、数行のコードで統計的な分析を行うことが可能です。また、MatplotlibやSeabornを使えば、視覚的なデータの可視化も容易に実現できます。"
}
Synthesis result:
https://github.com/user-attachments/assets/5debf3a3-8377-4964-b8ca-3f58230a75e7
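In this result, every English word in the text is read aloud without problems.
For example, in the part of the query for DataFrame, the word used to be spelled out letter by letter as ディイエエティイエエエフアアルエエエムイイ,
whereas it is now read naturally as デエタフレエム.
The corresponding accent phrase in the query is as follows: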
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0,
"vowel": "a",
"vowel_length": 0,
"pitch": 0
},
{
"text": "フ",
"consonant": "f",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "レ",
"consonant": "r",
"consonant_length": 0,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0,
"pitch": 0
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0,
"vowel": "u",
"vowel_length": 0,
"pitch": 0
},
{
"text": "'",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0,
"pitch": 0
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
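To check readings like this without scanning the whole JSON, the mora texts of each accent phrase can be joined into a single string. The following is a minimal sketch: `phrase_readings` and `sample_query` are hypothetical names introduced here for illustration, and the sample keeps only the `text` and `vowel` fields of the DataFrame accent phrase shown above. It assumes that pause moras are marked with `"vowel": "pau"`, as they are in the queries in this article.

```python
def phrase_readings(audio_query):
    """Join mora texts into one katakana reading per accent phrase.

    Moras whose vowel is "pau" are skipped, because in the queries shown
    in this article they carry punctuation rather than pronunciation.
    """
    return [
        "".join(m["text"] for m in phrase["moras"] if m["vowel"] != "pau")
        for phrase in audio_query["accent_phrases"]
    ]


# The DataFrame accent phrase from the query above, reduced to the
# fields this sketch actually reads (hypothetical sample data layout).
sample_query = {
    "accent_phrases": [
        {
            "moras": [
                {"text": "デ", "vowel": "e"},
                {"text": "エ", "vowel": "e"},
                {"text": "タ", "vowel": "a"},
                {"text": "フ", "vowel": "u"},
                {"text": "レ", "vowel": "e"},
                {"text": "エ", "vowel": "e"},
                {"text": "ム", "vowel": "u"},
                {"text": "'", "vowel": "pau"},
            ],
            "accent": 4,
        }
    ]
}

readings = phrase_readings(sample_query)
print(readings)  # ['デエタフレエム']
```

Applying the same function to the VOICEVOX query in the next section makes the difference easy to see, since the letter-by-letter reading appears as one long run of vowels.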
(Reference) Query generated by VOICEVOX
{
"accent_phrases": [
{
"moras": [
{
"text": "ピ",
"consonant": "p",
"consonant_length": 0.11647210270166397,
"vowel": "i",
"vowel_length": 0.10434596985578537,
"pitch": 5.449944496154785
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.09888916462659836,
"pitch": 5.5960283279418945
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0.06848771125078201,
"vowel": "a",
"vowel_length": 0.09778345376253128,
"pitch": 5.6040849685668945
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.08036813884973526,
"pitch": 5.589580535888672
},
{
"text": "ティ",
"consonant": "t",
"consonant_length": 0.08351730555295944,
"vowel": "i",
"vowel_length": 0.10825805366039276,
"pitch": 5.695505142211914
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.11330384761095047,
"pitch": 5.781085014343262
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.12365090101957321,
"pitch": 5.901239395141602
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.086192287504673,
"pitch": 5.946781158447266
},
{
"text": "チ",
"consonant": "ch",
"consonant_length": 0.10664952546358109,
"vowel": "i",
"vowel_length": 0.0770849660038948,
"pitch": 6.003418922424316
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.11828164011240005,
"pitch": 6.050207614898682
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12116283923387527,
"pitch": 6.090602874755859
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.09290414303541183,
"pitch": 6.113485336303711
},
{
"text": "ヌ",
"consonant": "n",
"consonant_length": 0.06207815557718277,
"vowel": "u",
"vowel_length": 0.08421283215284348,
"pitch": 6.0752058029174805
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0.05523707717657089,
"vowel": "a",
"vowel_length": 0.15830503404140472,
"pitch": 5.773069858551025
}
],
"accent": 14,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.31535759568214417,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.0791788324713707,
"vowel": "e",
"vowel_length": 0.10064834356307983,
"pitch": 5.590644359588623
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11976485699415207,
"pitch": 5.983151435852051
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.05507245659828186,
"vowel": "a",
"vowel_length": 0.08500064164400101,
"pitch": 6.126258850097656
},
{
"text": "サ",
"consonant": "s",
"consonant_length": 0.09527775645256042,
"vowel": "a",
"vowel_length": 0.12112396210432053,
"pitch": 6.208809852600098
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.09715259075164795,
"pitch": 6.204296588897705
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.13706454634666443,
"pitch": 6.191609859466553
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.07458440214395523,
"pitch": 6.078517913818359
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.052474744617938995,
"vowel": "u",
"vowel_length": 0.061277613043785095,
"pitch": 5.852043151855469
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05265315622091293,
"vowel": "o",
"vowel_length": 0.10409725457429886,
"pitch": 5.609150409698486
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0.07080589979887009,
"vowel": "u",
"vowel_length": 0.11524605005979538,
"pitch": 5.600069046020508
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.08382975310087204,
"pitch": 5.960224628448486
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0.04291022568941116,
"vowel": "a",
"vowel_length": 0.08775778114795685,
"pitch": 5.932217597961426
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05490022897720337,
"vowel": "e",
"vowel_length": 0.07744181156158447,
"pitch": 5.629036903381348
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.1484280377626419,
"pitch": 5.438242435455322
},
{
"text": "ッ",
"consonant": null,
"consonant_length": null,
"vowel": "cl",
"vowel_length": 0.07073838263750076,
"pitch": 0
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.06211525946855545,
"vowel": "o",
"vowel_length": 0.08425780385732651,
"pitch": 5.901317596435547
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.09729136526584625,
"pitch": 5.969064712524414
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.05663037300109863,
"vowel": "e",
"vowel_length": 0.08706774562597275,
"pitch": 6.025557994842529
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.05723033845424652,
"vowel": "i",
"vowel_length": 0.047131218016147614,
"pitch": 6.072211265563965
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.04929392784833908,
"vowel": "a",
"vowel_length": 0.09504448622465134,
"pitch": 6.081124782562256
}
],
"accent": 7,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0.07959704101085663,
"vowel": "i",
"vowel_length": 0.0804038718342781,
"pitch": 6.144299507141113
},
{
"text": "ジ",
"consonant": "j",
"consonant_length": 0.07773097604513168,
"vowel": "i",
"vowel_length": 0.07793501764535904,
"pitch": 6.166547775268555
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.1248132511973381,
"pitch": 6.084077835083008
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10704737156629562,
"pitch": 5.938371658325195
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.06342015415430069,
"vowel": "e",
"vowel_length": 0.10860324651002884,
"pitch": 5.799697399139404
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.07069236785173416,
"pitch": 5.629647731781006
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.04299464076757431,
"vowel": "u",
"vowel_length": 0.08757925778627396,
"pitch": 5.4380388259887695
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "プ",
"consonant": "p",
"consonant_length": 0.06904631108045578,
"vowel": "u",
"vowel_length": 0.0594872310757637,
"pitch": 5.409878730773926
},
{
"text": "ロ",
"consonant": "r",
"consonant_length": 0.03905259817838669,
"vowel": "o",
"vowel_length": 0.0911882072687149,
"pitch": 5.505580902099609
},
{
"text": "グ",
"consonant": "g",
"consonant_length": 0.06128575652837753,
"vowel": "u",
"vowel_length": 0.0694003701210022,
"pitch": 5.707350254058838
},
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0.04014255106449127,
"vowel": "a",
"vowel_length": 0.10773012787103653,
"pitch": 5.847405433654785
},
{
"text": "ミ",
"consonant": "m",
"consonant_length": 0.0762600377202034,
"vowel": "i",
"vowel_length": 0.11173645406961441,
"pitch": 5.948212623596191
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.0764211043715477,
"pitch": 6.020254135131836
},
{
"text": "グ",
"consonant": "g",
"consonant_length": 0.04909630864858627,
"vowel": "u",
"vowel_length": 0.07857421785593033,
"pitch": 6.063906669616699
},
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0.06576967239379883,
"vowel": "e",
"vowel_length": 0.15068034827709198,
"pitch": 6.157106399536133
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.06534631550312042,
"pitch": 6.226661682128906
},
{
"text": "ゴ",
"consonant": "g",
"consonant_length": 0.044613003730773926,
"vowel": "o",
"vowel_length": 0.07870868593454361,
"pitch": 6.147334575653076
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05973442643880844,
"vowel": "e",
"vowel_length": 0.10202005505561829,
"pitch": 5.862994194030762
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.06595473736524582,
"vowel": "U",
"vowel_length": 0.09269415587186813,
"pitch": 0
}
],
"accent": 8,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.36131060123443604,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.09516528993844986,
"vowel": "o",
"vowel_length": 0.06781398504972458,
"pitch": 5.505016326904297
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05592925101518631,
"vowel": "o",
"vowel_length": 0.09946737438440323,
"pitch": 5.748182773590088
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0.06155547499656677,
"vowel": "e",
"vowel_length": 0.14945416152477264,
"pitch": 6.0509209632873535
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.07178374379873276,
"pitch": 6.172045707702637
},
{
"text": "ゴ",
"consonant": "g",
"consonant_length": 0.04337330907583237,
"vowel": "o",
"vowel_length": 0.07962529361248016,
"pitch": 6.088384628295898
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05300658196210861,
"vowel": "o",
"vowel_length": 0.09286292642354965,
"pitch": 5.735976219177246
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.09484250098466873,
"vowel": "u",
"vowel_length": 0.06796897202730179,
"pitch": 5.496433258056641
},
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0.056066691875457764,
"vowel": "o",
"vowel_length": 0.08405620604753494,
"pitch": 5.579542636871338
},
{
"text": "ミ",
"consonant": "m",
"consonant_length": 0.06550165265798569,
"vowel": "i",
"vowel_length": 0.07494194060564041,
"pitch": 5.686807155609131
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0.07060857862234116,
"vowel": "a",
"vowel_length": 0.16270872950553894,
"pitch": 5.654221057891846
}
],
"accent": 3,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.30311113595962524,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "ピ",
"consonant": "p",
"consonant_length": 0.10981322079896927,
"vowel": "i",
"vowel_length": 0.10163123905658722,
"pitch": 5.597873687744141
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.1164672002196312,
"pitch": 5.80003023147583
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1419031172990799,
"pitch": 5.9253668785095215
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10216100513935089,
"pitch": 5.980390548706055
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10106061398983002,
"pitch": 6.005149841308594
},
{
"text": "ヌ",
"consonant": "n",
"consonant_length": 0.0633179098367691,
"vowel": "u",
"vowel_length": 0.07938491553068161,
"pitch": 5.989894866943359
},
{
"text": "ディ",
"consonant": "d",
"consonant_length": 0.06564389914274216,
"vowel": "i",
"vowel_length": 0.11610249429941177,
"pitch": 5.986366271972656
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.12271050363779068,
"pitch": 6.056523323059082
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.13144049048423767,
"pitch": 6.112059593200684
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.096237413585186,
"pitch": 6.1477885246276855
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10501740127801895,
"pitch": 6.1619768142700195
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.08188342303037643,
"vowel": "u",
"vowel_length": 0.07447949796915054,
"pitch": 6.10214376449585
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0.06230451911687851,
"vowel": "a",
"vowel_length": 0.10508281737565994,
"pitch": 5.840000629425049
}
],
"accent": 13,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.08169642835855484,
"pitch": 5.53563928604126
},
{
"text": "ヌ",
"consonant": "n",
"consonant_length": 0.06293167918920517,
"vowel": "u",
"vowel_length": 0.09680631756782532,
"pitch": 5.568340301513672
},
{
"text": "ユ",
"consonant": "y",
"consonant_length": 0.08312869071960449,
"vowel": "u",
"vowel_length": 0.07910134643316269,
"pitch": 5.960270404815674
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0.09418963640928268,
"pitch": 6.079494476318359
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1032748743891716,
"pitch": 6.142519950866699
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0.06540684401988983,
"vowel": "u",
"vowel_length": 0.09105082601308823,
"pitch": 6.1659955978393555
},
{
"text": "ピ",
"consonant": "p",
"consonant_length": 0.09199593216180801,
"vowel": "i",
"vowel_length": 0.10702002793550491,
"pitch": 6.20999813079834
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.0886157974600792,
"pitch": 6.230593681335449
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0.06495178490877151,
"vowel": "a",
"vowel_length": 0.09927791357040405,
"pitch": 6.178698539733887
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.0695655420422554,
"pitch": 6.0586700439453125
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.056117571890354156,
"vowel": "a",
"vowel_length": 0.09347119182348251,
"pitch": 5.861547470092773
},
{
"text": "ド",
"consonant": "d",
"consonant_length": 0.051037050783634186,
"vowel": "o",
"vowel_length": 0.07053395360708237,
"pitch": 5.766819000244141
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05312969535589218,
"vowel": "o",
"vowel_length": 0.08525709062814713,
"pitch": 5.638947486877441
}
],
"accent": 13,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0.040184758603572845,
"vowel": "a",
"vowel_length": 0.11606501787900925,
"pitch": 5.69476318359375
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.07164096087217331,
"pitch": 5.988162994384766
},
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0.06024050712585449,
"vowel": "u",
"vowel_length": 0.07096447795629501,
"pitch": 5.826568603515625
},
{
"text": "ラ",
"consonant": "r",
"consonant_length": 0.03622134029865265,
"vowel": "a",
"vowel_length": 0.1108507588505745,
"pitch": 5.683210372924805
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0.044067420065402985,
"vowel": "i",
"vowel_length": 0.07653085142374039,
"pitch": 5.61351203918457
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12777049839496613,
"pitch": 5.510312080383301
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.059260688722133636,
"vowel": "a",
"vowel_length": 0.09181319922208786,
"pitch": 5.363773822784424
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.0855044573545456,
"vowel": "u",
"vowel_length": 0.0635102242231369,
"pitch": 5.595673561096191
},
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0.06259886175394058,
"vowel": "o",
"vowel_length": 0.07523084431886673,
"pitch": 5.753589630126953
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.08637432008981705,
"pitch": 5.854658126831055
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.08577578514814377,
"vowel": "u",
"vowel_length": 0.06377749890089035,
"pitch": 5.908664703369141
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.03849299997091293,
"vowel": "u",
"vowel_length": 0.0868462398648262,
"pitch": 5.9237494468688965
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.058107271790504456,
"vowel": "o",
"vowel_length": 0.07558708637952805,
"pitch": 6.022544860839844
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.063152976334095,
"vowel": "o",
"vowel_length": 0.08366978913545609,
"pitch": 6.088356018066406
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05032471567392349,
"vowel": "e",
"vowel_length": 0.1577090173959732,
"pitch": 5.825331687927246
}
],
"accent": 2,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.2815333306789398,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.0918598398566246,
"vowel": "a",
"vowel_length": 0.13152170181274414,
"pitch": 5.567140579223633
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.08161205798387527,
"pitch": 5.806894302368164
},
{
"text": "リョ",
"consonant": "ry",
"consonant_length": 0.07669024914503098,
"vowel": "o",
"vowel_length": 0.07279934734106064,
"pitch": 5.973067283630371
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.07849704474210739,
"pitch": 6.037090301513672
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05377703160047531,
"vowel": "o",
"vowel_length": 0.09243559092283249,
"pitch": 6.064888000488281
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.0554865226149559,
"vowel": "e",
"vowel_length": 0.08222859352827072,
"pitch": 6.04914665222168
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11807743459939957,
"pitch": 6.169621467590332
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.06642961502075195,
"vowel": "a",
"vowel_length": 0.09589282423257828,
"pitch": 6.0314788818359375
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.11849305778741837,
"pitch": 5.717338562011719
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.07630421966314316,
"vowel": "o",
"vowel_length": 0.08465094119310379,
"pitch": 5.647669792175293
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.0902937725186348,
"pitch": 5.809906959533691
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0.033055104315280914,
"vowel": "i",
"vowel_length": 0.08749037235975266,
"pitch": 5.916666507720947
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.06321781128644943,
"vowel": "u",
"vowel_length": 0.04883772134780884,
"pitch": 6.011509895324707
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.05526966601610184,
"vowel": "e",
"vowel_length": 0.10295989364385605,
"pitch": 6.051679611206055
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.06465547531843185,
"vowel": "i",
"vowel_length": 0.05521322041749954,
"pitch": 6.051602840423584
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0.05144401639699936,
"vowel": "i",
"vowel_length": 0.07746019214391708,
"pitch": 5.965667247772217
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ショ",
"consonant": "sh",
"consonant_length": 0.08435368537902832,
"vowel": "o",
"vowel_length": 0.08475746214389801,
"pitch": 6.014983654022217
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0.03965490311384201,
"vowel": "i",
"vowel_length": 0.06808914989233017,
"pitch": 5.954586505889893
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05779454857110977,
"vowel": "e",
"vowel_length": 0.09093142300844193,
"pitch": 5.681344985961914
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.07284275442361832,
"vowel": "i",
"vowel_length": 0.05097901076078415,
"pitch": 5.884847640991211
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.04159467667341232,
"vowel": "u",
"vowel_length": 0.08827918022871017,
"pitch": 5.833207607269287
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.06972762197256088,
"vowel": "e",
"vowel_length": 0.14346221089363098,
"pitch": 5.724767684936523
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.07059536129236221,
"pitch": 5.789414882659912
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0.05226953327655792,
"vowel": "i",
"vowel_length": 0.08222214132547379,
"pitch": 5.880636215209961
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.1325875073671341,
"pitch": 5.899967193603516
},
{
"text": "リ",
"consonant": "r",
"consonant_length": 0.040087416768074036,
"vowel": "i",
"vowel_length": 0.06954852491617203,
"pitch": 5.934174537658691
},
{
"text": "マ",
"consonant": "m",
"consonant_length": 0.06134319305419922,
"vowel": "a",
"vowel_length": 0.10631085187196732,
"pitch": 6.031805038452148
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.0679827630519867,
"vowel": "U",
"vowel_length": 0.08781219273805618,
"pitch": 0
}
],
"accent": 3,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.37237635254859924,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.08162282407283783,
"vowel": "a",
"vowel_length": 0.10334759205579758,
"pitch": 5.496123313903809
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.07651012390851974,
"vowel": "o",
"vowel_length": 0.12081205099821091,
"pitch": 6.068231105804443
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10037042200565338,
"pitch": 6.100641250610352
},
{
"text": "バ",
"consonant": "b",
"consonant_length": 0.05835694074630737,
"vowel": "a",
"vowel_length": 0.16030515730381012,
"pitch": 5.81519889831543
}
],
"accent": 2,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.3114999830722809,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.13666947185993195,
"vowel": "u",
"vowel_length": 0.07120925933122635,
"pitch": 5.711211204528809
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0.08350200206041336,
"pitch": 6.03699254989624
},
{
"text": "ヒャ",
"consonant": "hy",
"consonant_length": 0.08444409817457199,
"vowel": "a",
"vowel_length": 0.09995033591985703,
"pitch": 6.122326850891113
},
{
"text": "ク",
"consonant": "k",
"consonant_length": 0.06465062499046326,
"vowel": "u",
"vowel_length": 0.05612628906965256,
"pitch": 5.931809425354004
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "マ",
"consonant": "m",
"consonant_length": 0.06436198949813843,
"vowel": "a",
"vowel_length": 0.1505230814218521,
"pitch": 5.8308210372924805
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.08035620301961899,
"pitch": 6.02802848815918
},
{
"text": "ギョ",
"consonant": "gy",
"consonant_length": 0.06927255541086197,
"vowel": "o",
"vowel_length": 0.07102874666452408,
"pitch": 6.128126621246338
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.08002924174070358,
"pitch": 6.166011333465576
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05385180562734604,
"vowel": "o",
"vowel_length": 0.09280390292406082,
"pitch": 6.009934425354004
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.055982060730457306,
"vowel": "e",
"vowel_length": 0.08279933035373688,
"pitch": 5.893813133239746
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11860852688550949,
"pitch": 6.0262451171875
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.06447722762823105,
"vowel": "a",
"vowel_length": 0.09222430735826492,
"pitch": 5.816139221191406
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.1649218052625656,
"pitch": 5.475996971130371
}
],
"accent": 1,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.3254386782646179,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "ディ",
"consonant": "d",
"consonant_length": 0.0680452212691307,
"vowel": "i",
"vowel_length": 0.1068062037229538,
"pitch": 5.402994155883789
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.12195143848657608,
"pitch": 5.49931526184082
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.12862417101860046,
"pitch": 5.546713829040527
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1199302151799202,
"pitch": 5.484532356262207
},
{
"text": "ティ",
"consonant": "t",
"consonant_length": 0.08103083074092865,
"vowel": "i",
"vowel_length": 0.11284735053777695,
"pitch": 5.34913969039917
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.11909497529268265,
"pitch": 5.298459529876709
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.14273525774478912,
"pitch": 5.295376777648926
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10519509762525558,
"pitch": 5.322658061981201
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11244089901447296,
"pitch": 5.397887706756592
},
{
"text": "フ",
"consonant": "f",
"consonant_length": 0.0815981850028038,
"vowel": "u",
"vowel_length": 0.07383427768945694,
"pitch": 5.464523792266846
},
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.11367859691381454,
"pitch": 5.537107467651367
},
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.12357815355062485,
"pitch": 5.6989288330078125
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.04975270479917526,
"vowel": "u",
"vowel_length": 0.09683375060558319,
"pitch": 5.804760456085205
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.12751029431819916,
"pitch": 5.902454376220703
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11559361219406128,
"pitch": 5.964119911193848
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.0972735658288002,
"pitch": 5.996393203735352
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0.07872456312179565,
"vowel": "u",
"vowel_length": 0.12620626389980316,
"pitch": 5.9651947021484375
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.11973222345113754,
"pitch": 5.82058048248291
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.13203099370002747,
"pitch": 5.682436943054199
}
],
"accent": 19,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.4024682641029358,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.09510844200849533,
"vowel": "o",
"vowel_length": 0.10374713689088821,
"pitch": 5.656457424163818
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0.02902268059551716,
"vowel": "I",
"vowel_length": 0.06720047444105148,
"pitch": 0
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.07240323722362518,
"vowel": "e",
"vowel_length": 0.08337840437889099,
"pitch": 5.890864849090576
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.11240630596876144,
"pitch": 5.839886665344238
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.06770000606775284,
"vowel": "U",
"vowel_length": 0.05765476077795029,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.08871427923440933,
"vowel": "a",
"vowel_length": 0.10973168909549713,
"pitch": 5.9015727043151855
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.10946480184793472,
"pitch": 5.884652614593506
}
],
"accent": 4,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.34261441230773926,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.15427082777023315,
"vowel": "u",
"vowel_length": 0.08693871647119522,
"pitch": 5.603739261627197
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0.10004929453134537,
"pitch": 5.89428186416626
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.08246496319770813,
"vowel": "o",
"vowel_length": 0.08348247408866882,
"pitch": 6.07985258102417
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.08473152667284012,
"pitch": 6.094343185424805
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.0549691841006279,
"vowel": "o",
"vowel_length": 0.10089285671710968,
"pitch": 6.083752155303955
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.07581844180822372,
"vowel": "o",
"vowel_length": 0.08187878876924515,
"pitch": 6.148675918579102
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.08600953966379166,
"pitch": 6.159041404724121
},
{
"text": "ド",
"consonant": "d",
"consonant_length": 0.050996892154216766,
"vowel": "o",
"vowel_length": 0.07150810956954956,
"pitch": 6.022956848144531
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.04721209406852722,
"vowel": "e",
"vowel_length": 0.09043016284704208,
"pitch": 5.6642866134643555
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.07408378273248672,
"vowel": "o",
"vowel_length": 0.09296489506959915,
"pitch": 5.582301139831543
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.11299329251050949,
"pitch": 5.7562761306762695
},
{
"text": "ケ",
"consonant": "k",
"consonant_length": 0.07210972905158997,
"vowel": "e",
"vowel_length": 0.08987338095903397,
"pitch": 6.031528949737549
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.09458611160516739,
"pitch": 6.047226905822754
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.04981071501970291,
"vowel": "e",
"vowel_length": 0.08100578933954239,
"pitch": 5.972437858581543
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.06171040236949921,
"vowel": "i",
"vowel_length": 0.05161810666322708,
"pitch": 5.9557271003723145
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.05521527677774429,
"vowel": "a",
"vowel_length": 0.10474193841218948,
"pitch": 5.900506973266602
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ブ",
"consonant": "b",
"consonant_length": 0.061573319137096405,
"vowel": "u",
"vowel_length": 0.09780359268188477,
"pitch": 5.761079788208008
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.07809194177389145,
"pitch": 5.871772766113281
},
{
"text": "セ",
"consonant": "s",
"consonant_length": 0.058039918541908264,
"vowel": "e",
"vowel_length": 0.0917612686753273,
"pitch": 5.974715232849121
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.07487434893846512,
"vowel": "i",
"vowel_length": 0.09070887416601181,
"pitch": 6.020077228546143
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.10621538013219833,
"pitch": 6.0301008224487305
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.11164122074842453,
"pitch": 5.978598117828369
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.06790011376142502,
"vowel": "o",
"vowel_length": 0.07781907171010971,
"pitch": 6.036202430725098
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.06320025771856308,
"vowel": "a",
"vowel_length": 0.10387641936540604,
"pitch": 6.078321933746338
},
{
"text": "ウ",
"consonant": null,
"consonant_length": null,
"vowel": "u",
"vowel_length": 0.08072009682655334,
"pitch": 6.104283809661865
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.05676496773958206,
"vowel": "o",
"vowel_length": 0.07939944416284561,
"pitch": 6.135366916656494
},
{
"text": "ト",
"consonant": "t",
"consonant_length": 0.05339302867650986,
"vowel": "o",
"vowel_length": 0.06621221452951431,
"pitch": 6.182483196258545
},
{
"text": "ガ",
"consonant": "g",
"consonant_length": 0.053399719297885895,
"vowel": "a",
"vowel_length": 0.09827200323343277,
"pitch": 5.884228229522705
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.06440519541501999,
"vowel": "a",
"vowel_length": 0.07668773084878922,
"pitch": 5.465328693389893
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.059392645955085754,
"vowel": "o",
"vowel_length": 0.08279576152563095,
"pitch": 5.557544708251953
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.08527524024248123,
"pitch": 5.898283004760742
},
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05418810993432999,
"vowel": "e",
"vowel_length": 0.10034583508968353,
"pitch": 6.013369560241699
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.06657551229000092,
"vowel": "U",
"vowel_length": 0.08997533470392227,
"pitch": 0
}
],
"accent": 4,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.393724650144577,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "マ",
"consonant": "m",
"consonant_length": 0.07852453738451004,
"vowel": "a",
"vowel_length": 0.10260742157697678,
"pitch": 5.4748759269714355
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.06637547165155411,
"vowel": "a",
"vowel_length": 0.17029573023319244,
"pitch": 5.740240573883057
}
],
"accent": 2,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.4005759060382843,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11029182374477386,
"pitch": 5.235879898071289
},
{
"text": "ム",
"consonant": "m",
"consonant_length": 0.06808509677648544,
"vowel": "u",
"vowel_length": 0.10545530170202255,
"pitch": 5.069831848144531
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1250702291727066,
"pitch": 5.065542221069336
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.13449238240718842,
"pitch": 5.202377796173096
},
{
"text": "ティ",
"consonant": "t",
"consonant_length": 0.0806393101811409,
"vowel": "i",
"vowel_length": 0.08477061241865158,
"pitch": 5.345054626464844
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.10038428753614426,
"pitch": 5.283383369445801
},
{
"text": "ピ",
"consonant": "p",
"consonant_length": 0.0941154882311821,
"vowel": "i",
"vowel_length": 0.09805998206138611,
"pitch": 5.310829162597656
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.09804878383874893,
"pitch": 5.293640613555908
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.13016508519649506,
"pitch": 5.3269944190979
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.03694651275873184,
"vowel": "u",
"vowel_length": 0.0853460356593132,
"pitch": 5.374067783355713
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.10468674451112747,
"pitch": 5.470987796783447
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12240449339151382,
"pitch": 5.58406925201416
},
{
"text": "ティ",
"consonant": "t",
"consonant_length": 0.08075124025344849,
"vowel": "i",
"vowel_length": 0.10278261452913284,
"pitch": 5.671698570251465
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.09716808050870895,
"pitch": 5.76731014251709
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1306096762418747,
"pitch": 5.902414321899414
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.04183109849691391,
"vowel": "u",
"vowel_length": 0.0818912461400032,
"pitch": 5.9774556159973145
},
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.13850265741348267,
"pitch": 6.0510663986206055
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.08670908957719803,
"pitch": 6.0619964599609375
},
{
"text": "ビ",
"consonant": "b",
"consonant_length": 0.07306835800409317,
"vowel": "i",
"vowel_length": 0.093415267765522,
"pitch": 5.99208927154541
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.09047356992959976,
"pitch": 5.857359409332275
},
{
"text": "ヤ",
"consonant": "y",
"consonant_length": 0.06511764973402023,
"vowel": "a",
"vowel_length": 0.10088754445314407,
"pitch": 5.528942108154297
}
],
"accent": 21,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.08468779176473618,
"pitch": 4.988835334777832
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.10707486420869827,
"vowel": "u",
"vowel_length": 0.09321016818284988,
"pitch": 5.2716474533081055
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.12697571516036987,
"pitch": 5.438939571380615
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.12282071262598038,
"pitch": 5.687595844268799
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.11754288524389267,
"pitch": 5.82954740524292
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.1028735563158989,
"pitch": 5.887553691864014
},
{
"text": "ビ",
"consonant": "b",
"consonant_length": 0.07227369397878647,
"vowel": "i",
"vowel_length": 0.1123097762465477,
"pitch": 5.865238189697266
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.10485080629587173,
"pitch": 5.891720771789551
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12934981286525726,
"pitch": 5.989195823669434
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12075302749872208,
"pitch": 6.059061050415039
},
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.11441995948553085,
"pitch": 6.1073102951049805
},
{
"text": "ア",
"consonant": null,
"consonant_length": null,
"vowel": "a",
"vowel_length": 0.11967862397432327,
"pitch": 6.132925033569336
},
{
"text": "ル",
"consonant": "r",
"consonant_length": 0.04475487023591995,
"vowel": "u",
"vowel_length": 0.08161141723394394,
"pitch": 6.140974998474121
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10019857436418533,
"pitch": 6.143551349639893
},
{
"text": "ヌ",
"consonant": "n",
"consonant_length": 0.06496862322092056,
"vowel": "u",
"vowel_length": 0.07755625993013382,
"pitch": 6.034786701202393
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.12710608541965485,
"pitch": 5.723551273345947
}
],
"accent": 16,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.07276459783315659,
"vowel": "U",
"vowel_length": 0.06307903677225113,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.08030062168836594,
"vowel": "a",
"vowel_length": 0.12323544174432755,
"pitch": 5.739980697631836
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.09613583236932755,
"pitch": 5.788263320922852
},
{
"text": "バ",
"consonant": "b",
"consonant_length": 0.05542264133691788,
"vowel": "a",
"vowel_length": 0.1560276746749878,
"pitch": 5.637535095214844
}
],
"accent": 3,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.30613061785697937,
"pitch": 0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0.06538461893796921,
"vowel": "I",
"vowel_length": 0.05911170691251755,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.05888736993074417,
"vowel": "a",
"vowel_length": 0.08822593837976456,
"pitch": 5.928423881530762
},
{
"text": "ク",
"consonant": "k",
"consonant_length": 0.05211504548788071,
"vowel": "u",
"vowel_length": 0.048504769802093506,
"pitch": 6.071099281311035
},
{
"text": "テ",
"consonant": "t",
"consonant_length": 0.05428486317396164,
"vowel": "e",
"vowel_length": 0.09488621354103088,
"pitch": 6.035956382751465
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.0612785741686821,
"vowel": "i",
"vowel_length": 0.051526837050914764,
"pitch": 6.003755569458008
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.05272102355957031,
"vowel": "a",
"vowel_length": 0.09976742416620255,
"pitch": 5.922281742095947
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.05786009132862091,
"vowel": "e",
"vowel_length": 0.08060640841722488,
"pitch": 5.838850975036621
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10772784799337387,
"pitch": 6.055526256561279
},
{
"text": "タ",
"consonant": "t",
"consonant_length": 0.05102948099374771,
"vowel": "a",
"vowel_length": 0.07485056668519974,
"pitch": 5.975678443908691
},
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.053867585957050323,
"vowel": "o",
"vowel_length": 0.08666542172431946,
"pitch": 5.62253999710083
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.06830106675624847,
"vowel": "a",
"vowel_length": 0.10001389682292938,
"pitch": 5.4681220054626465
},
{
"text": "シ",
"consonant": "sh",
"consonant_length": 0.03910141438245773,
"vowel": "I",
"vowel_length": 0.06169702857732773,
"pitch": 0
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.06668663769960403,
"vowel": "a",
"vowel_length": 0.08050308376550674,
"pitch": 5.718845367431641
},
{
"text": "モ",
"consonant": "m",
"consonant_length": 0.06347890198230743,
"vowel": "o",
"vowel_length": 0.09668878465890884,
"pitch": 5.7357072830200195
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0.0721202865242958,
"vowel": "o",
"vowel_length": 0.0897517129778862,
"pitch": 5.706452369689941
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.10947959125041962,
"pitch": 5.806802272796631
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.07395301014184952,
"pitch": 5.84182071685791
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0.06408850103616714,
"vowel": "i",
"vowel_length": 0.08724123984575272,
"pitch": 5.840785503387451
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ジ",
"consonant": "j",
"consonant_length": 0.057947076857089996,
"vowel": "i",
"vowel_length": 0.07397115975618362,
"pitch": 5.680322647094727
},
{
"text": "ツ",
"consonant": "ts",
"consonant_length": 0.0843447670340538,
"vowel": "u",
"vowel_length": 0.05556487292051315,
"pitch": 5.711482524871826
},
{
"text": "ゲ",
"consonant": "g",
"consonant_length": 0.0639617070555687,
"vowel": "e",
"vowel_length": 0.1436133086681366,
"pitch": 5.7915754318237305
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.06646636128425598,
"pitch": 5.912050247192383
}
],
"accent": 4,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "デ",
"consonant": "d",
"consonant_length": 0.03959709405899048,
"vowel": "e",
"vowel_length": 0.09000203758478165,
"pitch": 5.889118194580078
},
{
"text": "キ",
"consonant": "k",
"consonant_length": 0.06520184129476547,
"vowel": "i",
"vowel_length": 0.05073026567697525,
"pitch": 6.0071330070495605
},
{
"text": "マ",
"consonant": "m",
"consonant_length": 0.06367786973714828,
"vowel": "a",
"vowel_length": 0.1216253861784935,
"pitch": 6.0617876052856445
},
{
"text": "ス",
"consonant": "s",
"consonant_length": 0.07545843720436096,
"vowel": "U",
"vowel_length": 0.10259313136339188,
"pitch": 0
}
],
"accent": 3,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1,
"pitchScale": 0,
"intonationScale": 1,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"pauseLength": null,
"pauseLengthScale": 1,
"outputSamplingRate": 24000,
"outputStereo": false,
"kana": "ピイワイティイエイチオオエヌワ'、デエタサ'イエンスノ/ブ'ンヤデ/アットオテキナ'/シ'ジオ/エ'テ/イル'/プログラミングゲ'ンゴデ_ス、コノ'/ゲ'ンゴノ/ツヨミ'ワ、ピイエエエヌディイエエエスヤ'/エヌユウエムピイワイナドノ'/ラ'イブラリオ/カツヨオ'/スル'/コト'デ、タイリョオノ'/デ'エタオ/コオリツ'テキニ/ショ'リ/デキ'ル/テンニ'/アリマ'_ス、タト'エバ、スウヒャク'/マン'ギョオノ/デ'エタオ、ディイエエティイエエエフアアルエエエムイイ'、ト'/_シテ'/ア_ツカイ'、スウコオノ'/コ'オドデ/トオケエ'テキナ/ブンセキオ'/オコナウ'/コト'ガ/カノオデ'_ス、マタ'、エムエエティイピイエルオオティイエルアイビイヤ'/エスイイエエビイオオアアルエヌオ'/_ツカエ'バ、_シカク'テキナ/デ'エタノ/カ_シカモ'/ヨオイニ'/ジツゲン'/デキマ'_ス"
}
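The per-mora timing fields in the query above can be summed to get a rough estimate of how long the utterance will be. The following is a minimal sketch, not the engine's own logic. It assumes the top-level list is keyed `accent_phrases`, as in VOICEVOX-compatible audio queries, and that the padding and mora lengths are all divided by `speedScale`. It ignores `pauseLength` and `pauseLengthScale`. The helper names and `sample_query` are made up for illustration.

```python
from typing import Any


def mora_length(mora: dict[str, Any]) -> float:
    """Return consonant_length + vowel_length, treating null (None) as 0."""
    return (mora.get("consonant_length") or 0.0) + (mora.get("vowel_length") or 0.0)


def estimate_duration_seconds(query: dict[str, Any]) -> float:
    """Sum all mora and pause lengths plus pre/post padding, scaled by speedScale.

    Assumption: this mirrors how a VOICEVOX-style engine lays out timing;
    the real synthesized length may differ.
    """
    total = query["prePhonemeLength"] + query["postPhonemeLength"]
    for phrase in query["accent_phrases"]:
        total += sum(mora_length(m) for m in phrase["moras"])
        pause = phrase.get("pause_mora")
        if pause is not None:
            total += mora_length(pause)
    return total / query["speedScale"]


def phrase_texts(query: dict[str, Any]) -> list[str]:
    """Join the mora texts of each accent phrase into one katakana string."""
    return ["".join(m["text"] for m in p["moras"]) for p in query["accent_phrases"]]


if __name__ == "__main__":
    # Hypothetical, heavily shortened query with the same shape as the output above.
    sample_query = {
        "accent_phrases": [
            {
                "moras": [
                    {"text": "マ", "consonant_length": 0.08, "vowel_length": 0.10},
                    {"text": "タ", "consonant_length": 0.07, "vowel_length": 0.17},
                ],
                "accent": 2,
                "pause_mora": {"text": "、", "consonant_length": None, "vowel_length": 0.4},
            }
        ],
        "speedScale": 1,
        "prePhonemeLength": 0.1,
        "postPhonemeLength": 0.1,
    }
    print(phrase_texts(sample_query))
    print(round(estimate_duration_seconds(sample_query), 3))
```

To run this against the real response, save the JSON to a file and pass the result of `json.load` to the same functions.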
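Conclusion
This article walked through the steps of synthesizing speech with the AivisSpeech Engine HTTP API.
Although this walkthrough assumed a local environment, the engine could also be deployed in the cloud and built into a service.
A pay-as-you-go HTTP API service also appears to be under development.
In addition, Aivis Hub, a service for sharing speech synthesis models, is available,
so the number of usable models will likely grow over time.
https://hub.aivis-project.com/
The engine and its related services appear to be under active development, so I look forward to future updates.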