
Comparing file-writing speed between Python 2.7.9 and PyPy 2.5.0

Python

Posted at 2015-02-19, last updated at 2015-02-19

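I use Python for my research, but once the data reaches the gigabyte range, the computations take forever to finish.
So I figured I would switch to PyPy, which comes with a JIT compiler, and get everything done in a flash!
When I ran the same code, the computation part did get dramatically faster,
but file writing with the csv module did not seem to speed up much.
While wondering why, I found this on the PyPy page: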

Missing RPython modules: A few modules of the standard library (like csv and cPickle) are written in C in CPython, but written natively in pure Python in PyPy. Sometimes the JIT is able to do a good job on them, and sometimes not. In most cases (like csv and cPickle), we're slower than CPython, with the notable exception of json and heapq.

Seriously? So at the time I stuck with Python, but Python really is slow...
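To see the split described above (computation on one side, csv writing on the other), one option is to time the two parts separately under each interpreter. The following is only a minimal sketch and not the code used in this post: it is written in Python 3 syntax, the function names `time_compute` and `time_csv_write` are made up for illustration, and it writes to memory rather than to disk, so it leaves actual file I/O out of the measurement.

```python
import csv
import io
import time


def time_compute(n):
    """Time a pure-Python arithmetic loop of n iterations."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i * i
    return time.perf_counter() - start, total


def time_csv_write(n, row):
    """Time writing n copies of row with csv.writer into an in-memory buffer."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    start = time.perf_counter()
    for _ in range(n):
        writer.writerow(row)
    return time.perf_counter() - start, buf.getvalue()


if __name__ == "__main__":
    row = ["which", "is", "faster", "Python", "or", "Pypy", "?"]
    compute_sec, _ = time_compute(100000)
    write_sec, _ = time_csv_write(100000, row)
    print("compute: {0:.4f}s, csv write: {1:.4f}s".format(compute_sec, write_sec))
```

Running the same file with both interpreters and comparing the two numbers shows which part benefits from the JIT and which does not.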

Since PyPy itself is officially described as the slower one here, I got curious about just how slow it is and ran an experiment.

Here it is:

test.py

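# coding:utf-8
# Python 2 script: run it once with python and once with pypy.

import csv
import time

start = time.clock()
test_sentence = ["which", "is", "faster", "Python", "or", "Pypy", "?"]
# Number of rows written in each timed batch.
iter_list = [100, 1000, 10000, 100000, 1000000, 2000000, 5000000, 10000000]
time_list = []

with open("fast_writing.csv", "wb") as f:
    writer = csv.writer(f)
    for i in iter_list:
        loop_start = time.clock()
        for j in xrange(i):
            writer.writerow(test_sentence)
        # Time taken to write this batch of i rows.
        elapsed = time.clock() - loop_start
        time_list.append(elapsed)

# Total time for the whole run, stored as the last column.
total_time = time.clock() - start
time_list.append(total_time)
print "Elapsed time: {0}".format(total_time)

# Save the per-batch timings (plus the total) as a single CSV row.
with open("pythonresult.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerow(time_list)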
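One thing to watch with the script above is that the result file name is fixed, so a second run under the other interpreter overwrites the first. A possible variation, sketched here in Python 3 syntax and not taken from the original experiment, names the result file after the running interpreter using `platform.python_implementation()`. The helper names `run_benchmark` and `result_filename` are assumptions made for this sketch, and the batch sizes are shortened for a quick run.

```python
import csv
import platform
import time


def run_benchmark(row, batch_sizes, data_path):
    """Write each batch of rows to data_path and return the seconds per batch."""
    timings = []
    with open(data_path, "w", newline="") as f:
        writer = csv.writer(f)
        for n in batch_sizes:
            start = time.perf_counter()
            for _ in range(n):
                writer.writerow(row)
            timings.append(time.perf_counter() - start)
    return timings


def result_filename():
    """Build a result file name from the interpreter name, lowercased."""
    return "{0}_result.csv".format(platform.python_implementation().lower())


if __name__ == "__main__":
    row = ["which", "is", "faster", "Python", "or", "Pypy", "?"]
    sizes = [100, 1000, 10000]  # shorter than the original list, for a quick run
    timings = run_benchmark(row, sizes, "fast_writing.csv")
    with open(result_filename(), "w", newline="") as f:
        csv.writer(f).writerow(timings)
```

Because each interpreter writes to its own file, the two runs can be compared afterwards without renaming anything by hand.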

As the number of writes grows, PyPy gradually becomes the faster of the two.
For around 10,000 writes, though, Python does indeed look faster.

I suppose it depends on the conditions. Maybe a quick-and-dirty experiment like this is not enough to show the effect clearly.
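To put numbers on a comparison like this, the timing rows from the two runs can be divided column by column. The sketch below (Python 3 syntax) assumes the PyPy run was saved under a separate name; `pypyresult.csv` is a hypothetical file name, since the script in this post always writes `pythonresult.csv`. The helpers `read_timings` and `speed_ratios` are likewise made up for illustration.

```python
import csv
import os


def read_timings(path):
    """Read the single row of timings written by the benchmark script."""
    with open(path, newline="") as f:
        return [float(x) for x in next(csv.reader(f))]


def speed_ratios(python_times, pypy_times):
    """Return python_time / pypy_time per column; above 1 means PyPy was faster."""
    ratios = []
    for py, pp in zip(python_times, pypy_times):
        # A timing of 0 can occur when a batch is shorter than the clock resolution.
        ratios.append(py / pp if pp > 0 else float("inf"))
    return ratios


if __name__ == "__main__":
    # "pypyresult.csv" is a hypothetical name for the renamed PyPy output.
    if os.path.exists("pythonresult.csv") and os.path.exists("pypyresult.csv"):
        ratios = speed_ratios(read_timings("pythonresult.csv"),
                              read_timings("pypyresult.csv"))
        print(ratios)
```

With the script in this post, the last column of each row is the total run time, so the last ratio compares the two runs as a whole.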

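Still, once the number of writes gets large, PyPy comes out ahead.
PyPy looks like the better choice when processing large matrices.
Apparently numpy is also available for PyPy, so for matrix processing, using PyPy to get the work done quickly seems like a good idea. That said, there are several things to watch out for when using PyPy, and neglecting them makes processing needlessly slow, so take care.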
