1
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

csvファイルのpostgres転送時間比較。pandas.to_sql,embulk,copy

Posted at

1000万行のcsvファイルのpostgres転送時間を比較してみた

結論

copyが圧倒的に早い。csvファイルの出力より早い。

方法 時間
pandas.to_sql 22分2秒
embulk(insert_direct) 6分3秒
copy(postgres) 0分12秒
参考)csvファイル出力 0分50秒

環境

cpu:ryzen 7 1700
メモリ:32G
postgres:Ver10
os:win10
記憶装置:HDD

考察

  • postgresの設定デフォルトのままなので、最適化するとちょっと結果が変わるかも
  • pg_bulkloadを使えばembulkでも高速化できそうだがwin10だと大変そう。

転送データの作成

pythonでダミーデータを作成。1000万行

import pandas as pd
import numpy as np
from sqlalchemy import create_engine

df = pd.DataFrame({ 'A' : np.random.randint(0,100,10000000),
                        'B' : np.random.randint(0,100,10000000),
                        'C' : np.random.randint(0,100,10000000),
                        'D' : np.random.randint(0,100,10000000)
                        })
df.to_csv('embulk_test.csv',index=False)

image.png

pandas.to_sqlで転送

engine = create_engine('postgresql://postgres:password@localhost:5432/test')
df.to_sql('pandas_to_sql', engine, if_exists='replace')

image.png

embulkで転送

in:
  type: file
  path_prefix: embulk_test.csv
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    escape: '"'
    trim_if_not_quoted: false
    skip_hander_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: 'A', type: long}
    - {name: 'B', type: long}
    - {name: 'C', type: long}
    - {name: 'D', type: long}

out:
  type: postgresql
  mode: insert_direct
  default_timezone: "Asia/Tokyo"
  host: localhost
  port: 5432
  user: postgres
  password: "password"
  database: test
  table: embulk_table

image.png

1
2
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?