@otukebo-ya posted at 2024-12-15

I want to retrieve faster-whisper results asynchronously

Q&A

Closed

Python, async, generator, faster-whisper, aiortc

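What I want to solve

I am using Python's aiortc for WebRTC communication and am trying to transcribe the audio in real time with faster-whisper, running asynchronously.

Extracting the text from the processing result, which is a generator, takes time and blocks other processing, so the main WebRTC video and audio communication is delayed.

Is there a way to retrieve the results without delaying other processing, or a way to retrieve them faster?

I have little programming experience, so the code and explanation may be clumsy, but I would appreciate your help.

Environment

Windows 11
Python version: 3.12.8

The problematic part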

import asyncio
import time

async def transcribe_audio(self):
    while True:
        audio_data_np = await self.audio_queue.get()

        # Time the transcribe call, which runs in a worker thread.
        a = time.time()
        segments = await asyncio.to_thread(self.model_wrapper.transcribe, audio_data_np)
        b = time.time()
        print(f"faster-whisper processing time: {b - a}")

        # Time the iteration over the returned generator. This loop runs on the
        # event loop, and the log below shows it takes far longer than the call above.
        c = time.time()
        for segment in segments:
            print(segment)
            print("Your input: " + segment.text)
            await self.text_queue.put(segment.text)
        d = time.time()
        print(f"Time to extract the text: {d - c}")

# Output:
# Whisper_Wrapper transcribe processing time: 0.002001047134399414
# faster-whisper processing time: 0.003004789352416992
# Segment(id=1, seek=0, start=0.0, end=0.96, text='おはようございます', tokens=[6117, 3065, 17010, 43808], avg_logprob=-0.5223611831665039, compression_ratio=0.75, no_speech_prob=0.1387939453125, words=None, temperature=0.0)
# Your input: おはようございます
# Time to extract the text: 0.5869617462158203
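
The timings above suggest that the `transcribe` call returns almost immediately and that the real work happens while the `for` loop iterates over `segments`, which would be consistent with a lazily evaluated generator. One possible workaround, sketched here under that assumption, is to drain the generator inside the worker thread so that the event loop only ever receives a finished list. `fake_transcribe`, `collect_segments`, `heartbeat` and the sleep durations are hypothetical stand-ins for illustration; they are not part of faster-whisper or of the program in the question.

```python
import asyncio
import time


def fake_transcribe(audio):
    """Hypothetical stand-in for a lazy transcribe call: returns a generator at once."""
    def generate():
        for text in ("hello", "world"):
            time.sleep(0.2)  # simulates the blocking work done during iteration
            yield text
    return generate()


def collect_segments(audio):
    """Call transcribe AND drain the generator, so all blocking work happens here."""
    return list(fake_transcribe(audio))


async def heartbeat(ticks):
    """Stands in for the WebRTC work that must keep running on the event loop."""
    for _ in range(8):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The whole transcription, including the iteration, runs in a worker thread.
    texts = await asyncio.to_thread(collect_segments, b"audio")
    await beat
    return texts, ticks


texts, ticks = asyncio.run(main())
print(texts)
```

In the question's code, the equivalent change would be to pass a small helper that wraps `self.model_wrapper.transcribe` in `list(...)` to `asyncio.to_thread`, then loop over the returned list. The trade-off is that no text becomes available until every segment of that audio chunk has been processed.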

The program above is based on @reriiasu's program.

The WebRTC communication part is based on server.py from the aiortc sample programs, and it starts the faster-whisper processing as follows:

asyncio.create_task(transcriber.transcribe_audio())
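
A task started with `asyncio.create_task` shares the event loop with the WebRTC handlers, so any blocking work inside it still delays them. If each segment should reach the text queue as soon as it is produced, one option is to iterate in a worker thread and hand every item back to the loop with `asyncio.run_coroutine_threadsafe`. This is only a sketch: `fake_segments`, `pump` and the `None` end marker are hypothetical names chosen for illustration, not part of aiortc or faster-whisper.

```python
import asyncio
import time


def fake_segments():
    """Hypothetical lazy generator; each item costs blocking time to produce."""
    for text in ("good", "morning"):
        time.sleep(0.1)
        yield text


def pump(loop, queue):
    """Runs in a worker thread: iterate there and hand each item to the event loop."""
    for text in fake_segments():
        # Schedule queue.put on the event loop and wait until it has completed.
        asyncio.run_coroutine_threadsafe(queue.put(text), loop).result()
    # Hypothetical end marker so the consumer knows the stream is finished.
    asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()


async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Keep a reference to the task; the event loop only holds a weak reference.
    worker = asyncio.create_task(asyncio.to_thread(pump, loop, queue))
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    await worker
    return received


received = asyncio.run(main())
print(received)
```

Compared with collecting everything into a list first, this keeps the per-segment latency low at the cost of a little more cross-thread coordination.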
