@terisuke (寺田康佑) posted at 2024-07-14

Google TTS produces no audio after deploying a Next.js app to Hosting

Q&A

Closed

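I want to output audio using Google TTS from a Next.js app deployed to Firebase Hosting

I deployed a Next.js app for chatting with an AI to Firebase Hosting. Speech synthesis works with VOICEVOX, but after switching to Google TTS no audio is generated, and only in the Hosting environment.
(Locally, the audio is generated and played correctly.)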

Problem / error

No errors appear in either the console or the Google Cloud console; only the API response count keeps increasing, so I cannot pinpoint where the problem is.

Relevant source code

functions/src/index.ts

import * as functions from 'firebase-functions';
import { googleTts } from '../../src/features/googletts/googletts';
import express from 'express';
import cors from 'cors';

const app = express();

// Add CORS settings
app.use(cors({ origin: true }));

app.get('/googleTtsFunction', async (req, res) => {
  const message = req.query.message as string || 'Hello, world!';
  const ttsType = req.query.ttsType as string || 'en-US-Standard-F';
  console.log(req.query);
  try {
    const { audio } = await googleTts(message, ttsType);
    res.setHeader('Content-Type', 'audio/wav');
    res.send(audio);
  } catch (err) {
    console.error('ERROR:', err);
    res.status(500).send(err);
  }
});

exports.googleTtsFunction = functions.https.onRequest(app);

src/features/googletts/googletts.ts

export async function googleTts(
  message: string,
  ttsType: string
) {
  try {
  // Imports the Google Cloud client library
  const textToSpeech = require('@google-cloud/text-to-speech');
  // Creates a client
  const client = new textToSpeech.TextToSpeechClient();

  // Construct the request
  const request = {
    input: {text: message},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', name: ttsType, ssmlGender: 'FEMALE'},
    // select the type of audio encoding
    audioConfig: {audioEncoding: 'LINEAR16'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);

  return { audio: response.audioContent };
  } catch (error) {
    console.error('Error during TTS request:', error);
    throw new Error('TTS request failed');
  }
}

These are the relevant parts.
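
The request in googletts.ts fixes languageCode to 'en-US' while the voice name comes from ttsType, so the two can disagree when a non-English voice is selected. A minimal sketch of deriving the language code from the voice name, assuming names follow the pattern seen in en-US-Standard-F (language, region, then voice type). languageCodeFromVoice is a hypothetical helper, not part of the repository or the client library.

```typescript
// Hypothetical helper: derive "en-US" from a voice name such as "en-US-Standard-F".
// Assumes the name starts with "<language>-<REGION>-"; otherwise the fallback is used.
function languageCodeFromVoice(voiceName: string, fallback: string = "en-US"): string {
  const match = /^([a-z]{2,3})-([A-Z]{2})-/.exec(voiceName);
  return match ? `${match[1]}-${match[2]}` : fallback;
}

// Possible use inside googleTts (sketch only):
// voice: { languageCode: languageCodeFromVoice(ttsType), name: ttsType }
```

Voice names that do not follow this pattern fall back to the default, so the helper never throws.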

What I tried

In src/pages/index.tsx, I added a mechanism that compares the data in local storage with the env file, updates it dynamically, and reloads.
(around line 241)

src/pages/index.tsx

useEffect(() => {
    const storedChatVRMParams = window.localStorage.getItem("chatVRMParams");
    if (storedChatVRMParams) {
      const params = JSON.parse(storedChatVRMParams);
      const updatedParams = {
        ...params,
        gsviTtsServerUrl: process.env.NEXT_PUBLIC_TTS_URL || "http://127.0.0.1:5000/tts",
      };
      window.localStorage.setItem("chatVRMParams", JSON.stringify(updatedParams));
      setGSVITTSServerUrl(updatedParams.gsviTtsServerUrl);
    }
  }, [process.env.NEXT_PUBLIC_TTS_URL]);

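Ran a curl command to check whether the URL in NEXT_PUBLIC_TTS_URL actually returns an audio file → success

Used the developer tools to check whether NEXT_PUBLIC_TTS_URL is reflected in gsviTtsServerUrl → it is

Checked whether the API is public → it is public, no problem

Hypothesis

The two files below change the facial expression in sync with the output audio, but neither the expression nor the lips move → perhaps no audio is being output at all?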

src/features/vrmViewer/model.ts

  /**
   * Play the audio and perform lip sync
   */
  public async speak(buffer: ArrayBuffer, screenplay: Screenplay) {
    this.emoteController?.playEmotion(screenplay.expression);
    await new Promise((resolve) => {
      this._lipSync?.playFromArrayBuffer(buffer, () => {
        resolve(true);
      });
    });
  }
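
The hypothesis can be checked directly on the buffer passed to speak. The sketch below assumes the server keeps returning LINEAR16, which Google Cloud Text-to-Speech delivers with a WAV header, so a valid response should begin with a RIFF/WAVE signature. looksLikeWav is a hypothetical guard, not part of the repository.

```typescript
// Hypothetical guard: true when the buffer starts with a RIFF/WAVE header.
function looksLikeWav(buffer: ArrayBuffer): boolean {
  // A RIFF header needs at least 12 bytes: "RIFF", a 4-byte size, then "WAVE".
  if (buffer.byteLength < 12) return false;
  const bytes = new Uint8Array(buffer, 0, 12);
  const tag = (start: number): string =>
    String.fromCharCode(bytes[start], bytes[start + 1], bytes[start + 2], bytes[start + 3]);
  return tag(0) === "RIFF" && tag(8) === "WAVE";
}
```

Logging the result of looksLikeWav(buffer) before calling speak would show whether the client received audio or some other payload.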

src/features/emoteController/expressionController.ts

  public playEmotion(preset: VRMExpressionPresetName) {
    if (this._currentEmotion != "neutral") {
      this._expressionManager?.setValue(this._currentEmotion, 0);
    }

    if (preset == "neutral") {
      this._autoBlink?.setEnable(true);
      this._currentEmotion = preset;
      return;
    }

    const t = this._autoBlink?.setEnable(false) || 0;
    this._currentEmotion = preset;

    // Open the eyes
    this._expressionManager?.setValue(VRMExpressionPresetName.Blink, 0);

    // Set the expression
    this._expressionManager?.setValue(preset, 1);

    // Get the expression hold time, then return to neutral after it elapses
    const delayTime = this.getDelayTime(preset) * 1000;
    setTimeout(() => {
      this._expressionManager?.setValue(preset, 0);
      this._expressionManager?.setValue("neutral", 1);
      this._currentEmotion = "neutral";
    }, delayTime);
  }

src/features/emoteController/expressionController.ts

  public lipSync(preset: VRMExpressionPresetName, value: number) {
    if (this._currentLipSync) {
      this._expressionManager?.setValue(this._currentLipSync.preset, 0);
    }
    this._currentLipSync = {
      preset,
      value,
    };
  }

  public update(delta: number) {
    if (this._autoBlink) {
      this._autoBlink.update(delta);
    }

    if (this._currentLipSync) {
      const weight =
        this._currentEmotion === "neutral"
          ? this._currentLipSync.value * 0.5
          : this._currentLipSync.value * 0.25;
      this._expressionManager?.setValue(this._currentLipSync.preset, weight);
    }
  }

References

Deployed app
https://aipartner-426616.web.app/

GitHub
https://github.com/terisuke/AI-patner

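Addendum

npm run dev → audio plays normally
npm run start → audio plays normally
firebase deploy → no audio

I am currently exporting the logs to a HAR file and analyzing them.

Screenshot of the Network tab with npm run start

Screenshot of the Network tab with firebase deploy

The fact that no 308 redirect occurs seems to be related...?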


@terisuke

Questioner

Thank you!
When I checked the responses shown in the Network tab,
the TTS API and the other APIs were all returning what looks like the index HTML, as in the image.
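
Responses like these are easy to miss because an HTML page returned with a successful status raises no error on its own. A minimal sketch of a client-side check that rejects non-audio responses before they reach the player; readAudioOrThrow and ResponseLike are hypothetical names, and ResponseLike is only a structural subset of the Fetch API Response.

```typescript
// Structural subset of the Fetch API Response, so the sketch has no runtime dependencies.
interface ResponseLike {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  arrayBuffer(): Promise<ArrayBuffer>;
}

// True when the Content-Type header names an audio type such as audio/wav.
function isAudioContentType(contentType: string | null): boolean {
  return contentType !== null && contentType.trim().toLowerCase().startsWith("audio/");
}

// Hypothetical wrapper: throw on non-audio responses instead of passing them on.
async function readAudioOrThrow(res: ResponseLike): Promise<ArrayBuffer> {
  const contentType = res.headers.get("content-type");
  if (!res.ok || !isAudioContentType(contentType)) {
    throw new Error(`Expected audio, got status ${res.status} with content-type ${contentType}`);
  }
  return res.arrayBuffer();
}
```

It could be used as `const buffer = await readAudioOrThrow(await fetch(url));`, which would turn an HTML response into a visible error in the console.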

So I checked the routing and found that firebase.json contained

firebase.json

"rewrites": [
        {
          "source": "**",
          "destination": "/index.html"
        }
      ]

which tells Hosting to rewrite every request to index.html.

After deleting this and making the TTS-related files call the Functions API, audio at least started playing from the page!
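
Firebase Hosting applies the first rewrite rule whose pattern matches the request path, so a lone `**` rule captures every path that is not an existing static file, API paths included. The following is a simplified model of that resolution, not Firebase's implementation: the matcher only understands `**` and exact paths, and the rule that routes to googleTtsFunction is an assumed alternative configuration, not something taken from the repository.

```typescript
// Simplified model of a Hosting rewrite rule (only the fields used here).
type Rewrite = { source: string; destination?: string; function?: string };

// Simplified matcher: "**" matches any path; anything else must match exactly.
function matches(source: string, path: string): boolean {
  return source === "**" || source === path;
}

// Return the first matching rule, mirroring first-match-wins ordering.
function resolveRewrite(rules: Rewrite[], path: string): Rewrite | undefined {
  return rules.find((rule) => matches(rule.source, path));
}

// The configuration from the question: everything goes to index.html.
const catchAllOnly: Rewrite[] = [{ source: "**", destination: "/index.html" }];

// Assumed alternative: route the function path before the catch-all.
const functionFirst: Rewrite[] = [
  { source: "/googleTtsFunction", function: "googleTtsFunction" },
  { source: "**", destination: "/index.html" },
];

console.log(resolveRewrite(catchAllOnly, "/googleTtsFunction")); // the index.html rule
console.log(resolveRewrite(functionFirst, "/googleTtsFunction")); // the function rule
```

In this model the order of the rules decides the outcome, which is why a specific rule would need to come before the catch-all if the catch-all is kept.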

functions/src/index.ts

const message = req.query.message as string || 'Hello, world!';
const ttsType = req.query.ttsType as string || 'en-US-Standard-B';

With this setting, the painful part is that it only ever says 'Hello, world!' in a male voice...
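
Given the handler shown, both values fall back to their defaults whenever the request arrives without message and ttsType query parameters, which is consistent with always hearing 'Hello, world!'. A minimal sketch of building the request URL with both parameters set; buildTtsUrl is a hypothetical helper and the base URL in the usage comment is a placeholder, not the deployed endpoint.

```typescript
// Hypothetical helper: append message and ttsType as query parameters.
// URLSearchParams percent-encodes the values, so non-ASCII text is safe to pass.
function buildTtsUrl(baseUrl: string, message: string, ttsType: string): string {
  const url = new URL(baseUrl);
  url.searchParams.set("message", message);
  url.searchParams.set("ttsType", ttsType);
  return url.toString();
}

// Usage with a placeholder endpoint:
// const url = buildTtsUrl("https://example.com/googleTtsFunction", "こんにちは", "ja-JP-Standard-A");
// const res = await fetch(url);
```

The parameter names match what the handler reads from req.query, so the defaults would apply only when a value is actually empty.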

In any case, the audio output issue is solved, so I will close this question for now!

Thank you very much!
