[Memo] Python requests module

Posted at 2019-04-01

Environment

Request

Encode a string before passing it to the data parameter

import requests

# OK: a str that fits in Latin-1 can be sent as-is
requests.post("https://httpbin.org/post", data="a")

# NG: a str containing characters outside Latin-1 raises an error
requests.post("https://httpbin.org/post", data="あ")
# UnicodeEncodeError: 'latin-1' codec can't encode character '\u3042' in position 0: Body ('あ') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.

# OK: encode the str to bytes first
requests.post("https://httpbin.org/post", data="あ".encode("utf-8"))

data – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request.
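
As a rough illustration of the data types listed above, the sketch below builds requests without sending them and inspects the prepared body. `Request.prepare()` is part of requests; the names `URL`, `as_bytes`, and `as_form` are chosen here only for illustration.

```python
import requests

URL = "https://httpbin.org/post"

# bytes are passed through unchanged as the request body
as_bytes = requests.Request("POST", URL, data="あ".encode("utf-8")).prepare()
print(as_bytes.body)                       # b'\xe3\x81\x82'
print(as_bytes.headers["Content-Length"])  # 3

# a dict is form-encoded, and non-ASCII values are percent-encoded as UTF-8
as_form = requests.Request("POST", URL, data={"q": "あ"}).prepare()
print(as_form.body)                        # q=%E3%81%82
print(as_form.headers["Content-Type"])     # application/x-www-form-urlencoded
```

Nothing is sent over the network here, so this is a convenient way to check what a given data value turns into.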

Response

Miscellaneous

Use a Session object when sending multiple requests to the same host

So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).

Comparing 10 requests sent to the same host, using the Session object does indeed take less time.

IPython
In [52]: %time for i in range(10):requests.get("https://httpbin.org/get")                                                                                                                                          
CPU times: user 187 ms, sys: 6.27 ms, total: 193 ms
Wall time: 6.94 s

In [53]: s = requests.Session()                                                                                                                                                                                    
In [54]: %time for i in range(10):s.get("https://httpbin.org/get")                                                                                                                                                 
CPU times: user 58.5 ms, sys: 67 µs, total: 58.6 ms
Wall time: 2.27 s
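
Outside IPython, the same pattern might be wrapped in a small function. This is a minimal sketch: `fetch_status_codes` and its parameters are hypothetical names, and the 10-second timeout is an assumption rather than something taken from the measurement above.

```python
import requests


def fetch_status_codes(url, count, session=None):
    """Send `count` GET requests to `url` through one Session and return the status codes.

    `fetch_status_codes` is a hypothetical helper, not part of requests.
    """
    owns_session = session is None
    s = requests.Session() if owns_session else session
    try:
        # Every call goes through the same Session, so its connection pool
        # can reuse the TCP connection to the host between requests.
        return [s.get(url, timeout=10).status_code for _ in range(count)]
    finally:
        # Only close a Session that was created inside this function.
        if owns_session:
            s.close()
```

Calling `fetch_status_codes("https://httpbin.org/get", 10)` would correspond to the second measurement above. Passing an existing Session lets the caller keep the connection open for later requests.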

logging

See http://docs.python-requests.org/en/master/api/#api-changes for reference.

Enable the urllib3 logger

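import logging
# Attach a handler to the root logger so DEBUG records are actually printed
logging.basicConfig()
# Set the level on the appropriate logger
requests_log = logging.getLogger("urllib3")
requests_log.setLevel(logging.DEBUG)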
IPython
In [37]: r = requests.post("https://httpbin.org/post", data="a", params={"x":1})                                                                                                                                   
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): httpbin.org:443
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "POST /post?x=1 HTTP/1.1" 200 236

Enable http.client debugging

import http.client
# Print the raw request and the response headers to stdout
http.client.HTTPConnection.debuglevel = 1
In [37]: r = requests.post("https://httpbin.org/post", data="a", params={"x":1})                                                                                                                                   
send: b'POST /post?x=1 HTTP/1.1\r\nHost: httpbin.org\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 1\r\n\r\n'
send: b'a'
reply: 'HTTP/1.1 200 OK\r\n'
header: Access-Control-Allow-Credentials header: Access-Control-Allow-Origin header: Content-Encoding header: Content-Type header: Date header: Server header: Content-Length header: Connection 
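
Both settings are global, so it can be handy to switch them on only around the calls being investigated. The following is a minimal sketch under that assumption; `http_debug` is a hypothetical name, not something provided by requests or urllib3.

```python
import contextlib
import http.client
import logging


@contextlib.contextmanager
def http_debug():
    """Temporarily enable urllib3 DEBUG logging and http.client debug output."""
    logger = logging.getLogger("urllib3")
    old_level = logger.level
    old_debuglevel = http.client.HTTPConnection.debuglevel

    # basicConfig() only adds a root handler if none exists yet;
    # that handler stays installed after the block ends.
    logging.basicConfig()
    logger.setLevel(logging.DEBUG)
    http.client.HTTPConnection.debuglevel = 1
    try:
        yield
    finally:
        # Restore the previous settings even if the body raised.
        logger.setLevel(old_level)
        http.client.HTTPConnection.debuglevel = old_debuglevel
```

A request made inside `with http_debug():` would then print output like the transcripts above, while later requests stay quiet.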

How to retry

requests does not retry failed requests by default, so it is best combined with a retry module,
for example the backoff module (https://pypi.org/project/backoff/).

# Retry when the HTTP status code is 429 or 5XX, for up to 5 minutes.

import backoff
import requests


def fatal_code(e):
    """Return True to give up: retry on Too Many Requests (429), but not on other 4XX."""
    if e.response is None:
        # No response was received (e.g. a connection error): give up
        return True
    code = e.response.status_code
    return 400 <= code < 500 and code != 429


@backoff.on_exception(backoff.expo, requests.exceptions.RequestException,
                      jitter=backoff.full_jitter,
                      max_time=300,
                      giveup=fatal_code)
def get_response_text(url):
    r = requests.get(url)
    r.raise_for_status()
    return r.text
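
The giveup predicate can be checked without sending any request. The sketch below repeats `fatal_code` so that it runs on its own; `make_http_error` is a hypothetical helper that builds an exception by hand and is not part of requests.

```python
import requests


def fatal_code(e):
    """Same predicate as above: True means give up."""
    if e.response is None:
        return True
    code = e.response.status_code
    return 400 <= code < 500 and code != 429


def make_http_error(status_code):
    """Hypothetical helper: build an HTTPError carrying a Response with the given status."""
    response = requests.Response()
    response.status_code = status_code
    return requests.exceptions.HTTPError(response=response)


for status in (404, 429, 503):
    print(status, fatal_code(make_http_error(status)))
# 404 True   (give up)
# 429 False  (retry)
# 503 False  (retry)

# An exception without a response, such as a connection error, also gives up
print(fatal_code(requests.exceptions.ConnectionError()))  # True
```

Note the last case: with this predicate, connection errors and timeouts are not retried. If they should be, returning False when `e.response is None` would be one option.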

To share the backoff settings across functions, create a decorator that wraps the backoff decorator.

import functools

import backoff
import requests


def my_backoff(function):
    @functools.wraps(function)
    def wrapped(*args, **kwargs):
        def fatal_code(e):
            """Return True to give up: retry on Too Many Requests (429), but not on other 4XX."""
            if e.response is None:
                return True
            code = e.response.status_code
            return 400 <= code < 500 and code != 429

        return backoff.on_exception(backoff.expo, requests.exceptions.RequestException,
                                    jitter=backoff.full_jitter,
                                    max_time=300,
                                    giveup=fatal_code)(function)(*args, **kwargs)

    return wrapped


@my_backoff
def get_response_text(url):
    pass

For details on backoff algorithms, see the following.
https://codezine.jp/article/detail/10739
https://aws.typepad.com/sajp/2015/03/backoff.html

How to verify behavior

Accessing the following site is a good way to test.
