
Generating a large volume of customer data in EC-CUBE 4

Last updated at 2019-11-22 / Posted at 2019-11-11

Environment

・EC-CUBE4.0.3
・PostgreSQL10

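Introduction

When creating large volumes of customer or product/order data in EC-CUBE, I have relied heavily, ever since the EC-CUBE 3 days, on the GenerateDummyDataCommand command written by nanasess, which uses Faker.
GenerateDummyDataCommand is generally sufficient, but this article shows a different approach for quickly creating 500,000 or 1,000,000 customer records.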

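Overview

Creating 500,000 customer records comes down to building a 500,000-row CSV file and running the \COPY command (in the case of PostgreSQL).
Here the CSV file is built with Python, which can write files of tens of thousands of rows quickly.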

Running on Google Colaboratory

The Jupyter Notebook code is available here.
Click "Open in playground", then press the "Copy to Drive" link to run it in your own environment.

Notebook walkthrough

ec-cubecreatecustomer.ipynb

# Imports first
import numpy as np
import pandas as pd
import random, string

Nothing surprising here. The files module, which is used at the end to download the CSV from Google Colab, is also imported after this.

sample_customer_ja = pd.DataFrame([
["髙﨑","棗","タカサキ","ナツメ","takasaki@example.com","2","1993/10/25",
["𠮷田","景子","ヨシダ","ケイコ","yoshida+110@example.com","2","1952/4/30",
["草彅","剛","クサナギ","ツヨシ","kusanagi@example.com","1","1975/8/31",
["坪内","柚月","ツボウチ ユヅキ","ユヅキ","I73WwXd@example.com","2","1991/8/30",
["奥平","克洋","オクダイラ カツヒロ","カツヒロ","viDF89@example.com","","1986/12/29",
["藤代","敏嗣","フジシロ トシツグ","トシツグ","lxm5oVZqy@example.com","1","1967/7/4",

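These are the sample records. They deliberately include awkward test cases: 髙 (the "ladder" variant of 高), 𠮷 (the variant of 吉 with the longer lower stroke), and, for English names, a , (comma).

These samples are drawn with np.random.randint and written into each dtb_customer field with map.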
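The sampling step might look roughly like the sketch below. The notebook's helper functions are not shown in this article, so `random_name01`, the way `zseed` is used as a row index, and the two-column sample table are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened sample table (the real one has more columns).
sample_customer_ja = pd.DataFrame(
    [["髙﨑", "棗"], ["𠮷田", "景子"], ["草彅", "剛"]],
    columns=["name01", "name02"],
)

n_rows = 1000  # number of customers to generate

# One integer seed per row; each seed is an index into the sample table.
# np.random.randint excludes the upper bound, so every seed is a valid index.
customer_data = pd.DataFrame(
    {"zseed": np.random.randint(0, len(sample_customer_ja), size=n_rows)}
)

def random_name01(seed):
    # Assumed helper: look up the family name of the sampled row.
    return sample_customer_ja.at[seed, "name01"]

customer_data["name01"] = customer_data["zseed"].map(random_name01)
print(customer_data.head())
```

Each remaining field would get its own helper in the same way, which is why the listing that follows repeats the same map call per column.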

import time
start = time.time()

customer_data['name01'] = customer_data['zseed'].map(random_name01)
customer_data['name02'] = customer_data['zseed'].map(random_name02)
customer_data['kana01'] = customer_data['zseed'].map(random_kana01)
customer_data['kana02'] = customer_data['zseed'].map(random_kana02)

customer_data['postal_code'] = customer_data['zseed'].map(random_zip)
customer_data['pref_id'] = customer_data['zseed'].map(random_prefid)
customer_data['addr01'] = customer_data['zseed'].map(random_addr01)
customer_data['addr02'] = customer_data['zseed'].map(random_addr02)
customer_data['email'] = customer_data['zseed'].map(random_email)  # random_email is an assumed helper name; the published listing reuses random_addr02 here
customer_data['birth'] = customer_data['zseed'].map(random_birth)

end = time.time()
print(end-start, 'seconds')

Measured this way, 100,000 rows took 3 minutes.
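As a rough guide for the 500,000 and 1,000,000 row targets, the measurement can be extrapolated. This is only a sketch: it assumes the runtime grows linearly with the row count, which the article does not measure.

```python
# Measured in the article: 100,000 rows in 3 minutes.
measured_rows = 100_000
measured_seconds = 3 * 60

def estimate_seconds(rows):
    # Linear extrapolation from the single measured data point (an assumption).
    return measured_seconds * rows / measured_rows

for rows in (500_000, 1_000_000):
    print(rows, "rows:", estimate_seconds(rows) / 60, "minutes")
```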

customer_data = customer_data.drop('zseed', axis=1)
customer_data = customer_data.drop('language_cd', axis=1)

customer_data.to_csv('dtb_customer.csv', encoding="UTF-8", index=False)
files.download('dtb_customer.csv')

The generated data still contains the "seed" column, so the unneeded columns are dropped before the CSV is written out.

COPY command

Finally, run the \COPY command.
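Because the sample data includes values containing a comma, it is worth confirming that the export quotes them. The sketch below uses a made-up two-row frame (the name "Smith, Jr." is a hypothetical example, not taken from the notebook) and writes to an in-memory buffer instead of a file.

```python
import io
import pandas as pd

# Hypothetical miniature of customer_data, including a comma in a name.
customer_data = pd.DataFrame(
    {
        "zseed": [0, 1],
        "name01": ["𠮷田", "Smith, Jr."],
        "email": ["yoshida+110@example.com", "smith@example.com"],
    }
)

# Drop the helper column before export, as in the article.
export = customer_data.drop("zseed", axis=1)

buffer = io.StringIO()
export.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the CSV back to confirm the comma-containing name survived intact.
round_trip = pd.read_csv(io.StringIO(csv_text))
```

By default pandas quotes only the fields that need it, so the comma stays inside a single CSV field.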

ec403_db=# \COPY dtb_customer ( customer_status_id,sex_id,job_id,country_id,pref_id,name01,name02,kana01,kana02,company_name,postal_code,addr01,addr02,email,phone_number,birth,password,salt,secret_key,first_buy_date,last_buy_date,buy_times,buy_total,note,reset_key,reset_expire,point,create_date,update_date,discriminator_type) from '/var/www/html/dtb_customer.csv' CSV HEADER

This finished almost instantly.

With that, a large volume of data has been created in dtb_customer.
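On PostgreSQL 10 the HEADER option only skips the first line of the file, so the column list in \COPY has to be in the same order as the CSV columns. One way to keep the two in sync is to build the command from the DataFrame itself. `build_copy_command` below is a hypothetical helper, and the four columns are a shortened stand-in for the full list.

```python
import pandas as pd

def build_copy_command(table, columns, csv_path):
    # Hypothetical helper: format a psql \COPY meta-command on a single line.
    column_list = ",".join(columns)
    return f"\\COPY {table} ({column_list}) from '{csv_path}' CSV HEADER"

# Shortened stand-in for the real customer_data columns.
customer_data = pd.DataFrame(
    columns=["customer_status_id", "sex_id", "name01", "email"]
)

command = build_copy_command(
    "dtb_customer", customer_data.columns, "/var/www/html/dtb_customer.csv"
)
print(command)
```

The printed line can then be pasted into psql as is.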
