
More than 5 years have passed since last update.

Saving Python objects in compressed form


When writing Python objects to a file, serialization with pickle is an efficient approach.

However, pickle serialization merely converts an object into a string, so it is not necessarily efficient in terms of memory or disk usage.
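As a rough illustration of this point, the sketch below compares the size of a plain pickle with the size of the same pickle after bz2 compression. The sample data and variable names are assumptions made for this example and do not come from the original article; the actual ratio depends heavily on the data.

```python
import bz2
import pickle

# Hypothetical sample data: a list of small, similar-looking records.
data = [{'id': i, 'name': 'user%d' % i} for i in range(1000)]

raw = pickle.dumps(data)    # pickled representation only
comp = bz2.compress(raw)    # the same bytes after bz2 compression

print('pickle only :', len(raw), 'bytes')
print('pickle + bz2:', len(comp), 'bytes')
```

Repetitive data such as this tends to compress well, whereas data that is already compressed or close to random may gain little.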

We therefore combine pickle with bz2 compression. The source code is shown below.

compressed_pickle.py
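try:
    import cPickle as pickle  # Python 2: faster C implementation
except ImportError:
    import pickle  # Python 3: cPickle no longer exists as a separate module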
import bz2

def loads(comp):
    return pickle.loads(bz2.decompress(comp))

def dumps(obj):
    # Compress (not decompress) the pickled data so that loads() can reverse it.
    return bz2.compress(pickle.dumps(obj))

def load(fname):
    fin = bz2.BZ2File(fname, 'rb')
    try:
        pkl = fin.read()
    finally:
        fin.close()
    return pickle.loads(pkl)

def dump(obj, fname, level=9):
    # pickle.dump() needs a file object; dumps() returns the pickled data instead.
    pkl = pickle.dumps(obj)
    fout = bz2.BZ2File(fname, 'wb', compresslevel=level)
    try:
        fout.write(pkl)
    finally:
        fout.close()
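The file-based functions above build the whole pickled string in memory before writing it. As a possible alternative, pickle can write into and read from the compressed file object directly. The following is a minimal sketch that assumes Python 3; the names `dump_stream` and `load_stream` are hypothetical and are not part of the original module.

```python
import bz2
import pickle

def dump_stream(obj, fname, level=9):
    # Hypothetical helper: pickle writes straight into the compressed file,
    # so no separate uncompressed pickle string is kept around.
    with bz2.BZ2File(fname, 'wb', compresslevel=level) as fout:
        pickle.dump(obj, fout)

def load_stream(fname):
    # Hypothetical helper: pickle reads from the decompressing file object.
    with bz2.BZ2File(fname, 'rb') as fin:
        return pickle.load(fin)
```

Whether this saves a meaningful amount of memory depends on the object being saved, so it is best treated as an option to measure rather than a guaranteed improvement.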

compressed_pickle can be used as follows.

test.py
import compressed_pickle as pickle

# object
text = 'This implementation is written by @_akisato.'

# write to a file
outname = 'simple_text.pkl.bz2'
pickle.dump(text, outname)

# read from a file
inname = 'simple_text.pkl.bz2'
text2 = pickle.load(inname)

# serialize and compress in memory
comp = pickle.dumps(text)

# decompress and deserialize from memory
text3 = pickle.loads(comp)
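To check that nothing is lost along the way, a round trip can be verified with a few assertions. The sketch below is self-contained and calls `pickle` and `bz2` directly rather than importing compressed_pickle. The sample object and the temporary file name are assumptions made for illustration.

```python
import bz2
import os
import pickle
import shutil
import tempfile

# Hypothetical sample object.
obj = {'text': 'This implementation is written by @_akisato.',
       'values': [1, 2, 3]}

# In-memory round trip: pickle, compress, then reverse both steps.
comp = bz2.compress(pickle.dumps(obj))
restored = pickle.loads(bz2.decompress(comp))
assert restored == obj

# File round trip through a temporary directory.
tmpdir = tempfile.mkdtemp()
try:
    path = os.path.join(tmpdir, 'sample.pkl.bz2')
    with bz2.BZ2File(path, 'wb', compresslevel=9) as fout:
        fout.write(pickle.dumps(obj))
    with bz2.BZ2File(path, 'rb') as fin:
        restored_from_file = pickle.loads(fin.read())
    assert restored_from_file == obj
finally:
    shutil.rmtree(tmpdir)  # remove the temporary directory and file
```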