Just as you can use GPT models like ChatGPT, you can try TTS (Text to Speech, generating audio from text) with the same OpenAI SDK.
Incidentally, the reverse direction, STT (Speech to Text), is covered here.
the quick brown fox jumped over the lazy dogs
I had it read the text above aloud. If you specify mp3 or another format, the audio is generated in that format.
It sounds like this.
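The output format is chosen via the `response_format` parameter. As a hedged sketch (the list may have grown since), the formats in the docs at the time of writing were `mp3`, `opus`, `aac`, and `flac`:

```javascript
// A sketch, not an exhaustive list: output formats per the docs at the time.
const formats = ['mp3', 'opus', 'aac', 'flac'];

// Example request body you could pass to openai.audio.speech.create():
const request = {
  model: 'tts-1',
  voice: 'alloy',
  input: 'the quick brown fox jumped over the lazy dogs',
  response_format: 'flac', // any value from `formats`
};

console.log(formats.includes(request.response_format)); // true
```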
The tts-1 model
It looks like two models are available: tts-1 and tts-1-hd.
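Switching between them is just a different `model` string. Here is a minimal sketch (`buildRequest` is a hypothetical helper, not part of the SDK), assuming, per the docs, that tts-1 favors latency and tts-1-hd favors quality:

```javascript
// Hypothetical helper: pick tts-1 (lower latency) or tts-1-hd (higher quality).
const MODELS = { standard: 'tts-1', hd: 'tts-1-hd' };

function buildRequest(quality, voice, input) {
  // quality: 'standard' | 'hd'
  return { model: MODELS[quality], voice, input };
}

console.log(buildRequest('hd', 'alloy', 'hello').model); // tts-1-hd
```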
Japanese support
The following languages are supported, and Japanese is among them.
https://platform.openai.com/docs/guides/text-to-speech/supported-languages
Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.
Six voices are available
At the time of writing, six voices (six variations) are available.
https://platform.openai.com/docs/guides/text-to-speech/voice-options
Experiment with different voices (alloy, echo, fable, onyx, nova, and shimmer) to find one that matches your desired tone and audience. The current voices are optimized for English.
The following code, based on this example, worked for me.
https://github.com/openai/openai-node/blob/master/examples/audio.ts#L14C1-L16C1
const { OpenAI, toFile } = require('openai');
const fs = require('node:fs/promises');
const path = require('node:path');

// Reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const speechFile = path.resolve(__dirname, './out.mp3');

async function main() {
  // Text to speech: generate an mp3 from the input text
  const mp3 = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: 'the quick brown fox jumped over the lazy dogs',
  });
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.writeFile(speechFile, buffer);

  // Speech to text: transcribe the generated audio back with Whisper
  const transcription = await openai.audio.transcriptions.create({
    file: await toFile(buffer, 'out.mp3'),
    model: 'whisper-1',
  });
  console.log(transcription.text);

  // Translate the audio into English with Whisper
  const translation = await openai.audio.translations.create({
    file: await toFile(buffer, 'out.mp3'),
    model: 'whisper-1',
  });
  console.log(translation.text);
}

main();
Japanese with the nova voice
const OpenAI = require('openai');
const fs = require('node:fs/promises');
const path = require('node:path');

// Reads the API key from the OPENAI_API_KEY environment variable
// (never hardcode your secret key in source code)
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const speechFile = path.resolve(__dirname, './out.mp3');

async function main() {
  const mp3 = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'nova',
    // "Hello, I'm at the Electric Town exit in Akihabara."
    input: 'こんにちは、秋葉原の電気街口にいます。',
  });
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.writeFile(speechFile, buffer);
}

main();
It doesn't seem possible to embed Gyazo audio directly, so here is a link instead.