cProfileモジュールでPythonスクリプトのボトルネックを調べる

Python

Last updated at 2020-12-24Posted at 2020-12-24

この記事はTakumi Akashiro ひとり Advent Calendar 2020の24日目の記事です。

始めに

Python製のツールが遅いけれど、どこを直せば早くなるのか分からない……そんな経験はありませんか？
そんなときに便利なのが、これ、cProfileモジュールです！

とりあえず使ってみる

cProfile.runにstringを渡すとexec(<string>)を計測して、結果が出力されます。

>>> import time
>>> import cProfile

>>> cProfile.run('[time.sleep(1) for x in range(3)]') 
         7 function calls in 3.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.000    3.000 <string>:1(<listcomp>)
        1    0.000    0.000    3.000    3.000 <string>:1(<module>)
        1    0.000    0.000    3.000    3.000 {built-in method builtins.exec}
        3    3.000    1.000    3.000    1.000 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

ぴったし3秒ですね！¹こんな感じで色々情報が出てきます。

要素名	意味
ncall	関数の呼び出し回数
tottime	関数の総時間 (その関数のみの処理時間。子関数は除く。)
percall(1回目)	関数の1回にかかる時間 (その関数のみの処理時間。子関数は除く。)
cumtime	関数の総時間 (累積)
percall(2回目)	関数の1回にかかる時間(累積)

基本的にncallが妙に多くて、tottime()が膨れ上がってる関数を探して、効率化する感じですね。

え、文字列を渡してのが面倒？そうですよね……
実は自分もこの方法は使っておらず、デコレータ化したものを関数にくっつけて計測しています。

というわけでデコレータ化した関数がこちらです。

profiler.py


import cProfile
import pstats

def profile(func):
    LENGTH = 10
    def func_(*args, **kwargs):
        profile = cProfile.Profile()
        profile.enable()
        result = func(*args, **kwargs)
        profile.disable()
        pstats.Stats(profile).sort_stats(2).print_stats(LENGTH)
        return result
    return func_

デコレータを使ってみる

高速化するネタが思いつかなかったので、適当な正規表現を100万回ぶんまわしてみます。

sample.py

import profiler
import re

@profiler.profile
def v_1():
    [re.search(r'\d*', str(x)) for x in range(1000000)]

@profiler.profile
def sample_2():
    reg = re.compile(r'\d*')
    [reg.search(str(x)) for x in range(1000000)]

if __name__ == '__main__':
    sample_1()
    sample_2()

python.exe d:/catbox/sample.py
         4000139 function calls (4000135 primitive calls) in 1.365 seconds

   Ordered by: cumulative time
   List reduced from 53 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.032    0.032    1.365    1.365 d:/catbox/sample_1.py:4(sample_1)
        1    0.361    0.361    1.333    1.333 d:/catbox/sample_1.py:6(<listcomp>)
  1000000    0.286    0.000    0.972    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:196(search)
  1000000    0.272    0.000    0.382    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:287(_compile)
  1000000    0.303    0.000    0.303    0.000 {method 'search' of 're.Pattern' objects}
  1000016    0.110    0.000    0.110    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\sre_compile.py:759(compile)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\sre_parse.py:937(parse)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\enum.py:828(__and__)
        2    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\enum.py:278(__call__)


         1000006 function calls in 0.582 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.033    0.033    0.582    0.582 d:/catbox/sample_1.py:8(sample_2)
        1    0.276    0.276    0.550    0.550 d:/catbox/sample_1.py:11(<listcomp>)
  1000000    0.273    0.000    0.273    0.000 {method 'search' of 're.Pattern' objects}
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:248(compile)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:287(_compile)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

re.compileで事前にコンパイルしていたsample_2の方が0.7秒程度早くなりましたね！
(良いネタが思いつかなかった……)

締め

呼び出し回数って思っている以上に増えがちなので、計測して無駄な処理を減らすだけで、
結構快適に動くようになる時もあります。なんか遅いと思ったらとりあえずかけてみましょう！

~~もう少しマシなネタを出そうと思ってましたが、検証した結果実現できなかった……今年は悲しいクリスマスイブでした。~~

実はtottimeが3.003とかズレたので6回ぐらいか測り直しました。 ↩

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up