More than 3 years have passed since last update.

FastAPIのバックグラウンド処理の多重度を同期・非同期で比較してみたよ

Posted at 2021-01-26

FastAPIには（Starletteには）レスポンスを先に返しておいて重たい処理はバックグラウンドで実行するための機能BackgroundTaskが標準で備わっています。今回、このバックグラウンドタスクの多重度がどれくらいまでいけるのか、同期・非同期で違いはあるのか？について検証してみました。

同期・非同期での呼び出し方の違い

そもそもバックグラウンドタスクとしたものはどのように処理されるのか調べてみました。Starletteの中身を斜め読みするに、非同期処理を登録した場合はメインスレッドにてノンブロッキングで、同期の場合はプールされたスレッドにて処理されるように見えます。雰囲気的に。

background.py

class BackgroundTask:
    def __init__(
        self, func: typing.Callable, *args: typing.Any, **kwargs: typing.Any
    ) -> None:
        self.func = func
        self.args = args
        self.kwargs = kwargs
        self.is_async = asyncio.iscoroutinefunction(func)

    async def __call__(self) -> None:
        if self.is_async:
            await self.func(*self.args, **self.kwargs)
        else:
            await run_in_threadpool(self.func, *self.args, **self.kwargs)

実験用サーバ・クライアントプログラムの作成

サーバプログラムを以下の通り作ります。仕様は以下の通り。

ドキュメントルートにアクセスすると、バックグラウンドタスクを3個追加して、即座にレスポンスを返す
バックグラウンドタスクの実体として、3秒待つだけの処理を用意
クエリパラメータas_asyncに1などを指定すると非同期処理、それ以外は同期処理がバックグラウンドタスクとして使用される

run.py

import time
from datetime import datetime
import threading
import asyncio
from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

async def do_async(counter):
    await asyncio.sleep(3)
    print(f"{datetime.utcnow()} [{threading.get_ident()}]{counter} async end")

def do_sync(counter):
    time.sleep(3)
    print(f"{datetime.utcnow()} [{threading.get_ident()}]{counter} sync end")

@app.get("/")
async def docroot(background_tasks: BackgroundTasks, as_async: int = 0):
    for i in range(3):
        background_tasks.add_task(do_async if as_async else do_sync, i)

    print(f"main thread: {threading.get_ident()}")
    return "ok"

なおサーバプログラムの起動はこんな感じ。事前にUvicornを入れておきましょう。

$ uvicorn run:app

続いてこれを叩くクライアントプログラム。コマンドライン引数の1つめを非同期処理するかとしてas_asyncに、2つめを呼び出しの多重度としてcall_countに設定して、先程のサーバプログラムにリクエストを送ります。

client.py

import sys
import asyncio
import aiohttp

async def send_request(as_async: int, call_count: int):
    async with aiohttp.ClientSession() as session:
        for i in range(call_count):
            async with session.get(f"http://localhost:8000/?as_async={as_async}") as resp:
                print(resp.status)

if __name__ == "__main__":
    as_async = int(sys.argv[1])
    call_count = int(sys.argv[2])
    loop = asyncio.get_event_loop()
    loop.run_until_complete(send_request(as_async, call_count))

同期処理を試す

まずは同期処理から行ってみましょう。お手並み拝見ということで4多重でリクエストを送ってみます。

$ python client.py 0 4

結果は以下の通り。

先ほどの斜め読みの通り、メインスレッド4495089152とは異なるスレッドでバックグラウンドタスクが処理された
同一リクエストで追加したバックグラウンド処理は、同一スレッド内でシーケンシャルに処理されている
各リクエストのバックグラウンドタスクは、同時並行的に処理される（開始・終了時間がほぼ同じ）

main thread: 4495089152
INFO:     127.0.0.1:64633 - "GET /?as_async=0 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64633 - "GET /?as_async=0 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64633 - "GET /?as_async=0 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64633 - "GET /?as_async=0 HTTP/1.1" 200 OK
2021-01-26 02:46:24.017549 [123145503653888]0 sync end
2021-01-26 02:46:24.017926 [123145520443392]0 sync end
2021-01-26 02:46:24.020092 [123145537232896]0 sync end
2021-01-26 02:46:24.020207 [123145554022400]0 sync end
2021-01-26 02:46:27.021055 [123145503653888]1 sync end
2021-01-26 02:46:27.021553 [123145520443392]1 sync end
2021-01-26 02:46:27.024704 [123145537232896]1 sync end
2021-01-26 02:46:27.024814 [123145554022400]1 sync end
2021-01-26 02:46:30.025810 [123145503653888]2 sync end
2021-01-26 02:46:30.025998 [123145520443392]2 sync end
2021-01-26 02:46:30.030431 [123145537232896]2 sync end
2021-01-26 02:46:30.030594 [123145554022400]2 sync end

これだけみると同期処理でいいじゃんという気もしてきますが、スレッドは（効率的かつ）無尽蔵に使えるものではないので、多重度を20に増やしてリトライしてみます。

$ python client.py 0 20

するとどうでしょう、12多重までは並行処理できているものの、その後の処理は3秒後から開始されています。スレッドの空きを待っていたものと思われます。

main thread: 4495089152
INFO:     127.0.0.1:64687 - "GET /?as_async=0 HTTP/1.1" 200 OK

　略

2021-01-26 02:53:13.232444 [123145520443392]0 sync end
2021-01-26 02:53:13.234803 [123145537232896]0 sync end
2021-01-26 02:53:13.235891 [123145503653888]0 sync end
2021-01-26 02:53:13.235952 [123145554022400]0 sync end
2021-01-26 02:53:13.236892 [123145587601408]0 sync end
2021-01-26 02:53:13.236963 [123145570811904]0 sync end
2021-01-26 02:53:13.243115 [123145604390912]0 sync end
2021-01-26 02:53:13.244411 [123145621180416]0 sync end
2021-01-26 02:53:13.244533 [123145637969920]0 sync end
2021-01-26 02:53:13.244556 [123145654759424]0 sync end
2021-01-26 02:53:13.249159 [123145671548928]0 sync end
2021-01-26 02:53:13.252734 [123145688338432]0 sync end
2021-01-26 02:53:16.235923 [123145520443392]0 sync end
2021-01-26 02:53:16.239009 [123145537232896]0 sync end
2021-01-26 02:53:16.239944 [123145503653888]0 sync end
2021-01-26 02:53:16.240097 [123145554022400]0 sync end
2021-01-26 02:53:16.240251 [123145570811904]0 sync end
2021-01-26 02:53:16.240354 [123145587601408]0 sync end
2021-01-26 02:53:16.247533 [123145604390912]0 sync end
2021-01-26 02:53:16.248408 [123145621180416]0 sync end
2021-01-26 02:53:16.249650 [123145637969920]1 sync end

　略

なお今回計測された多重度12については、実行するマシンのスペックやPythonのバージョンにより変わってきますので注意が必要です。Starletteをまたもや斜め読みしたところ、run_in_threadpoolからはasyncioのrun_in_executorを呼んでいて、ここで使われるexecutorはThreadPoolExecutorのインスタンスのようです。

Python 3.8の場合、

Default value of max_workers is changed to min(32, os.cpu_count() + 4).

とあります。今回使用したマシンは物理4コア・論理8コア、os.cpu_count()の結果は8になりますので、8 + 4 = 12スレッドになったということでしょう。ちょっと前までデフォルトでCPUコア数 * 5だと思っていたのですが、Pythonのバージョンにより変わっているみたいです。

非同期処理を試す

続いて非同期処理です。同じようにまずは4多重でリクエストを送ってみます。

$ python client.py 1 4

結果は以下の通り。実行スレッド以外に同期処理との違いはみられませんでした。

想定通り、メインスレッド4495089152と同じスレッドでバックグラウンドタスクが処理された
同一リクエストで追加したバックグラウンド処理は、同一スレッド内でシーケンシャルに処理されている（同期と同じ）
各リクエストのバックグラウンドタスクは、同時並行的に処理される（開始・終了時間がほぼ同じ）（同期と同じ）

main thread: 4495089152
INFO:     127.0.0.1:64795 - "GET /?as_async=1 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64795 - "GET /?as_async=1 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64795 - "GET /?as_async=1 HTTP/1.1" 200 OK
main thread: 4495089152
INFO:     127.0.0.1:64795 - "GET /?as_async=1 HTTP/1.1" 200 OK
2021-01-26 03:02:10.735846 [4495089152]0 async end
2021-01-26 03:02:10.735944 [4495089152]0 async end
2021-01-26 03:02:10.735978 [4495089152]0 async end
2021-01-26 03:02:10.735996 [4495089152]0 async end
2021-01-26 03:02:13.739600 [4495089152]1 async end
2021-01-26 03:02:13.739748 [4495089152]1 async end
2021-01-26 03:02:13.739806 [4495089152]1 async end
2021-01-26 03:02:13.739850 [4495089152]1 async end
2021-01-26 03:02:16.740209 [4495089152]2 async end
2021-01-26 03:02:16.740448 [4495089152]2 async end
2021-01-26 03:02:16.740551 [4495089152]2 async end
2021-01-26 03:02:16.740636 [4495089152]2 async end

続いて20多重を試してみましょう。

$ python client.py 1 20

なんと、以下のように20多重すべてが同一スレッド内で同時並行的に処理されました。非同期あっぱれです。

main thread: 4495089152
INFO:     127.0.0.1:64823 - "GET /?as_async=1 HTTP/1.1" 200 OK
2021-01-26 03:05:06.872799 [4495089152]0 async end
2021-01-26 03:05:06.872928 [4495089152]0 async end
2021-01-26 03:05:06.872950 [4495089152]0 async end
2021-01-26 03:05:06.872967 [4495089152]0 async end
2021-01-26 03:05:06.873612 [4495089152]0 async end
2021-01-26 03:05:06.875237 [4495089152]0 async end
2021-01-26 03:05:06.876944 [4495089152]0 async end
2021-01-26 03:05:06.877932 [4495089152]0 async end
2021-01-26 03:05:06.878991 [4495089152]0 async end
2021-01-26 03:05:06.881396 [4495089152]0 async end
2021-01-26 03:05:06.882299 [4495089152]0 async end
2021-01-26 03:05:06.883206 [4495089152]0 async end
2021-01-26 03:05:06.884182 [4495089152]0 async end
2021-01-26 03:05:06.885099 [4495089152]0 async end
2021-01-26 03:05:06.888296 [4495089152]0 async end
2021-01-26 03:05:06.890936 [4495089152]0 async end
2021-01-26 03:05:06.892672 [4495089152]0 async end
2021-01-26 03:05:06.893403 [4495089152]0 async end
2021-01-26 03:05:06.894526 [4495089152]0 async end
2021-01-26 03:05:06.895386 [4495089152]0 async end

CPU負荷がほとんどない処理なので、1000多重で試してみました。2 async endの最初のリクエスト分から最後のリクエスト分までの差が1秒程度ありますが、おそらくprintの処理時間と思われます。気になる方はprintの影響を受けない形で調べてみてください。（そして結果を教えてください）

2021-01-26 03:11:33.677665 [4495089152]1 async end
2021-01-26 03:11:35.298624 [4495089152]2 async end
2021-01-26 03:11:35.298859 [4495089152]2 async end

　略

2021-01-26 03:11:36.676492 [4495089152]2 async end
2021-01-26 03:11:36.677806 [4495089152]2 async end

まとめ

非同期のバックグラウンドタスクはスレッドをブロックしないので、たくさんの要求に対する並列処理に優れることが実証できました。もちろんCPU負荷の高い処理であれば大きな差は出ないかもしれませんが、裏でAPIコールを行うような処理であればこの特性がいきてくることでしょう。チャットボットとの相性も良さそう。

なお非同期処理について理解されている方であれば、この記事を読んでも「こんなの調べなくても知っとるわい」という話かと思いますが、「こうなるはず」というのを実際に試してみたよ！というのが本記事の趣旨でした。間違っているところ・不足しているところがあればご指摘いただけるとありがたいです。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up