- This is mostly a memo for my own use
- It covers the following steps
  - Installation
  - Installing plugins
  - Auto-generating the config file
  - Loading JSONL data into SQLite
Installation
$ curl -o ~/bin/embulk -L https://bintray.com/artifact/download/embulk/maven/embulk-0.8.13.jar
$ chmod +x ~/bin/embulk
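The later commands call embulk by name, which only works if ~/bin is on PATH. A minimal sketch for checking that; the helper name on_path is my own and not part of embulk:

```shell
# on_path DIR: succeed if DIR is one of the entries in $PATH.
# (Hypothetical helper for this memo, not an embulk command.)
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if on_path "$HOME/bin"; then
  echo "$HOME/bin is on PATH"
else
  echo "$HOME/bin is not on PATH; add it before calling embulk by name"
fi
```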
Create a working directory and set up the embulk bundle
$ mkdir embulktest && cd $_
$ embulk mkbundle bundle
$ echo "gem 'embulk-parser-jsonl'" | tee -a ./bundle/Gemfile
$ echo "gem 'embulk-output-sqlite3'" | tee -a ./bundle/Gemfile
$ (cd bundle/ && embulk bundle)
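Running the tee -a lines a second time appends the same gem line to the Gemfile again. A small sketch of an append that skips lines already present; add_gem is a hypothetical helper name, and it assumes the Gemfile uses exactly the gem 'name' form written above:

```shell
# add_gem GEM GEMFILE: append "gem 'GEM'" only if that exact line is absent.
# (Hypothetical helper; assumes one gem per line in the form used above.)
add_gem() {
  line="gem '$1'"
  grep -qxF "$line" "$2" 2>/dev/null || echo "$line" >> "$2"
}

# Usage (same gems as above):
#   add_gem embulk-parser-jsonl ./bundle/Gemfile
#   add_gem embulk-output-sqlite3 ./bundle/Gemfile
```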
Create seed.yml.liquid along these lines
in:
  type: file
  path_prefix: "{{ env.PWD }}/{{ env.YYYYmmdd }}"
out:
  type: stdout
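To see what the path_prefix template should expand to, the same string can be built in plain shell. This is only a sketch that mirrors the two env values used in the template; it does not run Liquid itself:

```shell
# Build the same value as "{{ env.PWD }}/{{ env.YYYYmmdd }}" in plain shell.
YYYYmmdd=$(date +'%Y%m%d')
path_prefix="$(pwd)/${YYYYmmdd}"
echo "path_prefix: ${path_prefix}"

# List the files the prefix currently matches, if any.
for f in "${path_prefix}"*; do
  if [ -e "$f" ]; then
    echo "matches: $f"
  fi
done
```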
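For the data, I threw together a very rough JSONL file for testing.
I wanted to try embedding environment variables in the YAML, so the file name is the output of date +"%Y%m%d" plus .jsonl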
$ pbpaste | tee ./`date +"%Y%m%d"`.jsonl
{"id":1, "title":"Sample Konfabulator Widget","nameName":"main_window","width":500,"height":500}
{"id":2, "title":"NEXT Widget","nameName":"sub_window","width":100,"height":200}
{"id":3, "title":"tiger","nameName":"so_l","width":250,"height":400}
{"id":4, "title":"Sample Konfabulator Widget","nameName":"main_window","width":500,"height":500}
{"id":5, "title":"NEXT Widget","nameName":"sub_window","width":100,"height":200}
{"id":5, "title":"tiger","nameName":"so_l","width":250,"height":400}
{"id":7, "title":"Sample Konfabulator Widget","nameName":"main_window","width":500,"height":500}
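The sample data has id 5 on two lines. It still loads in this walkthrough, but when that matters, a quick check with standard tools can flag repeated ids before loading. A sketch, assuming "id" is the first key on every line as in the sample; dup_ids is a hypothetical helper name:

```shell
# dup_ids FILE: print each id value that appears on more than one line.
# Assumes every line starts with {"id":<number>, as in the sample data.
dup_ids() {
  sed -n 's/^{"id":\([0-9]*\),.*/\1/p' "$1" | sort -n | uniq -d
}

# Usage: dup_ids "./$(date +'%Y%m%d').jsonl"
```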
$ PWD=`pwd` YYYYmmdd=`date +'%Y%m%d'` embulk guess -b ./bundle/ -g jsonl seed.yml.liquid
$ PWD=`pwd` YYYYmmdd=`date +'%Y%m%d'` embulk guess -b ./bundle/ -g jsonl seed.yml.liquid -o jsonl-test.yml
$ PWD=`pwd` YYYYmmdd=`date +'%Y%m%d'` embulk preview ./jsonl-test.yml -b ./bundle/
...
+---------+----------------------------+-----------------+------------+-------------+
| id:long |               title:string | nameName:string | width:long | height:long |
+---------+----------------------------+-----------------+------------+-------------+
|       1 | Sample Konfabulator Widget |     main_window |        500 |         500 |
|       2 |                NEXT Widget |      sub_window |        100 |         200 |
|       3 |                      tiger |            so_l |        250 |         400 |
|       4 | Sample Konfabulator Widget |     main_window |        500 |         500 |
|       5 |                NEXT Widget |      sub_window |        100 |         200 |
|       5 |                      tiger |            so_l |        250 |         400 |
|       7 | Sample Konfabulator Widget |     main_window |        500 |         500 |
+---------+----------------------------+-----------------+------------+-------------+
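Every embulk call here repeats the same PWD and YYYYmmdd prefix. A small wrapper can set both once per call. This is a sketch only; with_embulk_env is a hypothetical name, not something embulk provides:

```shell
# with_embulk_env CMD [ARGS...]: run CMD with PWD and YYYYmmdd set
# the same way as in the commands above.
# (Hypothetical wrapper for this memo.)
with_embulk_env() {
  PWD="$(pwd)" YYYYmmdd="$(date +'%Y%m%d')" "$@"
}

# Usage:
#   with_embulk_env embulk preview ./jsonl-test.yml -b ./bundle/
```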
The guess looks right, so add the sqlite output settings to jsonl-test.yml
# added section
out:
  type: sqlite3
  database: '/tmp/embulktest.db' # no need to create the DB file or table beforehand
  table: 'embulktest1'
Run
$ PWD=`pwd` YYYYmmdd=`date +'%Y%m%d'` embulk run ./jsonl-test.yml -b ./bundle/
2016-08-22 23:49:21.754 +0900: Embulk v0.8.13
2016-08-22 23:49:23.289 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-sqlite3 (0.0.1)
2016-08-22 23:49:23.334 +0900 [INFO] (0001:transaction): Loaded plugin embulk-parser-jsonl (0.2.0)
2016-08-22 23:49:23.364 +0900 [INFO] (0001:transaction): Listing local files at directory '/Users/sakamotoakira/work/embulktest' filtering filename by prefix '20160822'
2016-08-22 23:49:23.377 +0900 [INFO] (0001:transaction): Loading files [/Users/sakamotoakira/work/embulktest/20160822.jsonl]
2016-08-22 23:49:23.454 +0900 [INFO] (0001:transaction): Using local thread executor with max_threads=8 / output tasks 4 = input tasks 1 * 4
2016-08-22 23:49:23.716 +0900 [INFO] (0001:transaction): {done: 0 / 1, running: 0}
2016-08-22 23:49:23.919 +0900 [INFO] (0001:transaction): {done: 1 / 1, running: 0}
2016-08-22 23:49:23.955 +0900 [INFO] (main): Committed.
2016-08-22 23:49:23.956 +0900 [INFO] (main): Next config diff: {"in":{"last_path":"/Users/sakamotoakira/work/embulktest/20160822.jsonl"},"out":{}}
Even with no DB file in place, everything up to the table gets created (a bit of a surprise)
$ sqlite3 /tmp/embulktest.db "select * from embulktest1;"
1|Sample Konfabulator Widget|main_window|500|500
2|NEXT Widget|sub_window|100|200
3|tiger|so_l|250|400
4|Sample Konfabulator Widget|main_window|500|500
5|NEXT Widget|sub_window|100|200
5|tiger|so_l|250|400
7|Sample Konfabulator Widget|main_window|500|500
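Once the rows are in SQLite, quick aggregation is one query away. A sketch of a per-title summary, assuming the sqlite3 CLI is installed and the table has the title and width columns shown in the preview; summarize_by_title is a hypothetical helper name:

```shell
# summarize_by_title DB: per-title row count and total width from embulktest1.
# Assumes the sqlite3 CLI and the column names shown in the preview.
summarize_by_title() {
  sqlite3 "$1" "SELECT title, COUNT(*), SUM(width) FROM embulktest1 GROUP BY title ORDER BY title;"
}

# Usage: summarize_by_title /tmp/embulktest.db
```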
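Summary
- embulk is easy to get started with from a single jar
- It has features such as guess and preview that are handy while you work
- The sqlite plugin is convenient and lets you cut quite a few corners
- It seems well suited to building throwaway sqlite files for investigation or aggregation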