
If you want to make asynchronous requests in Python, just go ahead and use aiohttp

Posted at 2020-12-07 (last updated at 2020-12-07)

Asynchronous requests in Python

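When you want to make asynchronous requests in Python, you use async/await with asyncio together with some HTTP client library. I looked into how much difference there is between using the non-blocking aiohttp and running a blocking library such as requests on multiple threads. People often say that multithreading in Python does not improve performance much because the threads just fight over a single core, but I see this as a ritual I need to go through before saying goodbye to my familiar friend requests, so here it is.
Note that I will not explain asyncio itself in any detail.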

Implementation

I prepared two samples, one that uses requests as the HTTP client and one that uses aiohttp. For the test endpoint I borrowed fake rest api.

requests

First, requests, the underdog.

requests_example.py

from functools import partial
import logging
from timeit import default_timer as timer

import asyncio
import requests


logging.basicConfig(level=logging.DEBUG)

def async_timeit(func):
    async def wrapper(*args, **kwargs):
        s = timer()
        print(f'start: {func.__name__}')
        await func(*args, **kwargs)
        e = timer()
        print(f'end: {func.__name__}')
        print(e - s)
    return wrapper

async def get(id_, loop):
    url = f'https://fakerestapi.azurewebsites.net/api/v1/Authors/{id_}'
    headers={'accept': 'application/json; v=1.0'}

    res = await loop.run_in_executor(
        None,
        partial(requests.get, url, headers=headers)
    )
    return res.json()

@async_timeit
async def main():
    # Inside a coroutine, get_running_loop() is the preferred way to get the loop.
    loop = asyncio.get_running_loop()
    tasks = [get(str(i), loop) for i in range(1, 51)]
    res = await asyncio.gather(*tasks)
    print([ele['id'] for ele in res])

asyncio.run(main())

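Debug logging is enabled, and a decorator measures the execution time.
loop.run_in_executor hands a blocking function to a thread pool (or a process pool) so that it can be awaited from the asyncio event loop. The first argument is a concurrent.futures.Executor; if it is None, the default ThreadPoolExecutor is used.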
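To see what run_in_executor does without touching the network, here is a minimal sketch. blocking_fetch is a hypothetical stand-in for requests.get (it just sleeps), and the delay value is an assumption chosen for illustration. Each result records the name of the thread that ran the call, which shows that the blocking work happens on pool threads rather than on the event loop's thread.

```python
import asyncio
import threading
import time
from functools import partial


def blocking_fetch(id_, delay=0.05):
    """Hypothetical stand-in for requests.get: blocks the calling thread."""
    time.sleep(delay)
    return {'id': id_, 'thread': threading.current_thread().name}


async def get(id_):
    loop = asyncio.get_running_loop()
    # Passing None selects the event loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, partial(blocking_fetch, id_))


async def fetch_all(n):
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(get(i) for i in range(1, n + 1)))


if __name__ == '__main__':
    results = asyncio.run(fetch_all(10))
    print([r['id'] for r in results])
    # Names of the worker threads that actually ran the blocking calls.
    print(sorted({r['thread'] for r in results}))
```

The event loop itself stays free while the workers sleep, but the number of calls that can overlap is capped by the pool size.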

aiohttp

Next, aiohttp, the favorite.

aiohttp_example.py

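import logging
from timeit import default_timer as timer

import asyncio
import aiohttp

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger()

# Same timing decorator as in requests_example.py,
# repeated here so that this file runs on its own.
def async_timeit(func):
    async def wrapper(*args, **kwargs):
        s = timer()
        print(f'start: {func.__name__}')
        await func(*args, **kwargs)
        e = timer()
        print(f'end: {func.__name__}')
        print(e - s)
    return wrapper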

async def get(id_, session):
    url = f'https://fakerestapi.azurewebsites.net/api/v1/Authors/{id_}'
    headers={
        'accept': 'application/json; v=1.0'
    }
    async with session.get(url, headers=headers) as res:
        return await res.json()

async def on_request_start(session, trace_config_ctx, params):
    logger.debug(f'Start request: {params.url.host}:{params.url.port}')

async def on_request_end(session, trace_config_ctx, params):
    logger.debug(f'End request: {params.url.host}:{params.url.port} "{params.method} {params.url.path} {params.response.status}"')

@async_timeit
async def main():
    trace_config = aiohttp.TraceConfig()
    trace_config.on_request_start.append(on_request_start)
    trace_config.on_request_end.append(on_request_end)

    async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
        tasks = [get(str(i), session) for i in range(1, 51)]
        res = await asyncio.gather(*tasks)
    print([ele['id'] for ele in res])

asyncio.run(main())

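The aiohttp client source code has no debug logging built in, so if you want log output you create a TraceConfig instance and wire it up yourself. See the official documentation for the events you can hook.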

Running it

Each version hits the API 50 times.

requests

DEBUG:asyncio:Using selector: KqueueSelector
start: main
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): fakerestapi.azurewebsites.net:443
DEBUG:urllib3.connectionpool:https://fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/4 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:https://fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/1 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:https://fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/3 HTTP/1.1" 200 None

# (snip)

DEBUG:urllib3.connectionpool:https://fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/49 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:https://fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/50 HTTP/1.1" 200 None
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
end: main
8.105084322

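My machine has a 4-core CPU, so it looks like 8 threads are started (this matches the default pool size of min(32, os.cpu_count() + 4) in Python 3.8 and later). Starting one thread per request does make it faster (down to about 2 s under these conditions), but that does not seem very realistic when you want to handle thousands or tens of thousands of requests in a given environment.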
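The effect of the thread count can be sketched by passing an explicit ThreadPoolExecutor to run_in_executor instead of None. This is an illustration under assumptions, not the code used for the measurement above: blocking_fetch is a hypothetical stand-in that sleeps instead of calling requests.get, and the request count, worker counts, and delay are made up for the example.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from timeit import default_timer as timer


def blocking_fetch(id_, delay=0.1):
    """Hypothetical stand-in for a blocking HTTP call."""
    time.sleep(delay)
    return id_


async def run_with_workers(n_requests, max_workers):
    loop = asyncio.get_running_loop()
    start = timer()
    # An explicit pool lets us choose how many blocking calls may overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_fetch, i)
            for i in range(n_requests)
        ]
        ids = await asyncio.gather(*futures)
    return ids, timer() - start


if __name__ == '__main__':
    for workers in (2, 16):
        ids, elapsed = asyncio.run(run_with_workers(16, workers))
        print(f'max_workers={workers}: {len(ids)} calls in {elapsed:.2f}s')
```

With a small pool the calls are processed in batches, so the total time grows with the number of batches; with one worker per call they all overlap. The cost is one OS thread per in-flight request, which is what makes this approach hard to scale to thousands of requests.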

aiohttp

DEBUG:asyncio:Using selector: KqueueSelector
start: main
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443
DEBUG:root:Start request: fakerestapi.azurewebsites.net:443

# (snip)

DEBUG:root:End request: fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/42 200"
DEBUG:root:End request: fakerestapi.azurewebsites.net:443 "GET /api/v1/Authors/32 200"
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
end: main
1.194258011

It runs on a single thread and still finishes in this time. I think I am ready to say goodbye.
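The single-thread behavior can be checked with a small sketch that needs neither aiohttp nor the network. fake_get is a hypothetical stand-in that awaits asyncio.sleep where the real code awaits session.get, and the delay is an assumed value. Every coroutine records the ID of the thread it ran on.

```python
import asyncio
import threading
from timeit import default_timer as timer


async def fake_get(id_, delay=0.1):
    """Hypothetical stand-in for a non-blocking request."""
    # Awaiting hands control back to the event loop, so other
    # coroutines can make progress while this one is waiting.
    await asyncio.sleep(delay)
    return {'id': id_, 'thread_id': threading.get_ident()}


async def fetch_all(n):
    start = timer()
    results = await asyncio.gather(*(fake_get(i) for i in range(1, n + 1)))
    return results, timer() - start


if __name__ == '__main__':
    results, elapsed = asyncio.run(fetch_all(50))
    thread_ids = {r['thread_id'] for r in results}
    print(f'{len(results)} calls on {len(thread_ids)} thread(s) in {elapsed:.2f}s')
```

All 50 waits overlap on the one thread that runs the event loop, so the total time stays close to a single delay rather than 50 of them.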

Summary

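I compared requests (blocking I/O) and aiohttp (non-blocking I/O) for sending asynchronous requests in Python. As expected, aiohttp performed better, and I was also able to make up my mind to part ways with requests, which has served me well, so I will call that a good result.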

The sample code is on GitHub.
