2
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

Pythonのheapqにタプルは避けたいという話

Posted at

TL;DR

pythonのheapqにtupleはおそいので、ハッシュ化しましょう。

1. はじめに

はじめまして、snmsnbです。
先日、ABC409に参加した際、heapqでのtupleの遅さを痛感したので備忘録として書いておきます。

2. 計測

ある数$N$を定めて、要素数が$N$に到達するまでheappushを行い、そこから$N$回だけpoppushを繰り返す実験を10回行い平均と分散を取っています。

条件は

  • tupleの要素数を2、3個にする場合
  • heapqの要素が$10^4$、$10^5$、$10^6$にした場合

を組み合わせた12パターンで、atcoderのコードテストで計測しました。使用言語はpypyで、単位は秒です。

要素が2つの場合

heapqの要素\heapqの要素数 $10^4$ $10^5$ $10^6$
(i, j) ave:0.02027
var:0.06974
ave:0.07533
var:0.17644
ave:0.95970
var:0.58200
i*(1 << 20) + j ave:0.01627
var:0.06880
ave:0.03958
var:0.09585
ave:0.33724
var:0.22353
平均時間差 0.00400(-19.7%) 0.03548(-47.1%) 0.62246(-64.9%)

要素数の増加に伴って、実行時間に対する影響が表れているのが分かります。

要素が3つの場合

heapqの要素\heapqの要素数 $10^4$ $10^5$ $10^6$
(i, j, k) ave:0.03369
var:0.09102
ave:0.23662
var:0.20992
ave:5.22803
var:2.06670
i*(1 << 40) + j*(1 << 20) + k ave:0.02335
var:0.08787
ave:0.06169
var:0.17638
ave:0.40682
var:0.33941
平均時間差 0.01034(-30.7%) 0.17493(-73.9%) 4.82121(-92.2%)

(i, j, k)の $N=10^6$ のみ、手動集計です。

要素数が2つの場合に比べて、速度の差がかなり顕著です。特に $N \ge 10^5$ では、オーダーのレベルで実行速度に差が出ています。

3. まとめ

heapqにタプルはおそいので、ハッシュ化したほうがよさそうです。

比較に使用したコード

(i, j)をあつかうheapq
import heapq
import random
import time
from math import sqrt

RANGE = 10**6

def rand():
    return random.randint(0, 100000)

h = []
for _ in range(RANGE):
    i, j = rand(), rand()
    heapq.heappush(h, (i, j))

def main():
    for _ in range(RANGE):
        heapq.heappop(h)
        i, j = rand(), rand()
        heapq.heappush(h, (i, j))

if __name__ == '__main__':
    res = []
    for _ in range(10):
        start = time.time()
        main()
        res.append(time.time() - start)
    ave = sum(res)/len(res)
    var = sqrt(sum((ele-ave)**2 for ele in res))
    print(f'ave:{ave:.5f}<br>var:{var:.5f}')
(i, j, k)をあつかうheapq
import heapq
import random
import time
from math import sqrt

RANGE = 10**6

def rand():
    return random.randint(0, 100000)

h = []
for _ in range(RANGE):
    i, j, k = rand(), rand(), rand()
    heapq.heappush(h, (i, j, k))

def main():
    for _ in range(RANGE):
        heapq.heappop(h)
        i, j, k = rand(), rand(), rand()
        heapq.heappush(h, (i, j, k))

if __name__ == '__main__':
    res = []
    for _ in range(10):
        start = time.time()
        main()
        res.append(time.time() - start)
    ave = sum(res)/len(res)
    var = sqrt(sum((ele-ave)**2 for ele in res))
    print(f'ave:{ave:.5f}<br>var:{var:.5f}')
i*(1 << 20) + jをあつかうheapq
import heapq
import random
import time
from math import sqrt

RANGE = 10**6

def rand():
    return random.randint(0, 100000)

def make_hash(i, j):
    return i * (1 << 20) + j

h = []
for _ in range(RANGE):
    i, j = rand(), rand()
    heapq.heappush(h, make_hash(i, j))

def main():
    for _ in range(RANGE):
        heapq.heappop(h)
        i, j = rand(), rand()
        heapq.heappush(h, make_hash(i, j))

if __name__ == '__main__':
    res = []
    for _ in range(10):
        start = time.time()
        main()
        res.append(time.time() - start)
    ave = sum(res)/len(res)
    var = sqrt(sum((ele-ave)**2 for ele in res))
    print(ave, var)
i*(1 << 40)+ j*(1 << 20) + kをあつかうheapq
import heapq
import random
import time
from math import sqrt

RANGE = 10**6

def rand():
    return random.randint(0, 100000)

def make_hash(i, j, k):
    return i * (1 << 40) + j * (1 << 20) + k

h = []
for _ in range(RANGE):
    i, j, k = rand(), rand(), rand()
    heapq.heappush(h, make_hash(i, j, k))

def main():
    for _ in range(RANGE):
        heapq.heappop(h)
        i, j, k = rand(), rand(), rand()
        heapq.heappush(h, make_hash(i, j, k))

if __name__ == '__main__':
    res = []
    for _ in range(10):
        start = time.time()
        main()
        res.append(time.time() - start)
    ave = sum(res)/len(res)
    var = sqrt(sum((ele-ave)**2 for ele in res))
    print(f'ave:{ave:.5f}<br>var:{var:.5f}')
2
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?