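Running the Embulk sample plugins
When you start developing plugins with Embulk, try running the sample plugins first.
Note: this article no longer reflects current versions of Embulk. To develop a plugin today, use the embulk new command instead.
Preparation
Use the bundle command to create the directory in which plugins will be developed. In this example the directory is my_bundle.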
java -jar embulk.jar bundle my_bundle
2015-02-20 08:37:04,854 +0900: Embulk v0.4.4
Initializing my_bundle...
Creating my_bundle/.bundle/config
Creating my_bundle/embulk/input/example.rb
Creating my_bundle/embulk/output/example.rb
Creating my_bundle/embulk/filter/example.rb
Creating my_bundle/Gemfile
Fetching: bundler-1.8.2.gem (100%)
Successfully installed bundler-1.8.2
1 gem installed
The Gemfile specifies no dependencies
Resolving dependencies...
Bundle complete! 0 Gemfile dependencies, 1 gem now installed.
Bundled gems are installed into ..
The generated directory contains sample plugins. They are located under my_bundle/embulk.
tree my_bundle/embulk
my_bundle/embulk
|-- filter
|   `-- example.rb
|-- input
|   `-- example.rb
`-- output
    `-- example.rb
3 directories, 3 files
- filter/example.rb : sample filter plugin
- input/example.rb : sample input plugin
- output/example.rb : sample output plugin
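Before running anything, the layout listed above can be checked with a short script. This is only a sketch: the function name missing_sample_plugins is made up for illustration, and the script assumes the bundle was created as my_bundle in the current directory.

```python
from pathlib import Path

# The three sample plugins that the bundle command generated in the listing above.
EXPECTED = [
    "embulk/filter/example.rb",
    "embulk/input/example.rb",
    "embulk/output/example.rb",
]


def missing_sample_plugins(bundle_dir):
    """Return the expected sample plugin paths that do not exist under bundle_dir."""
    root = Path(bundle_dir)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


if __name__ == "__main__":
    # "my_bundle" is the directory name used in this article.
    missing = missing_sample_plugins("my_bundle")
    print("all sample plugins found" if not missing else f"missing: {missing}")
```

An empty result means all three sample files are in place.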
The sections below run these plugins using two configuration files.
Sample run 1
- Input: example plugin
- Filter: none
- Output: stdout
exec: {}
in:
  type: example
out: {type: stdout}
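To make the data flow concrete, here is a plain-Python imitation of what this configuration does. It is not Embulk code and not the source of the example plugin. The row shape (two tasks named file1 and file2, ten rows each, an empty hostname, col0 counting up from 0, col1 fixed at 10.0) is inferred from the run log shown later in this section, and the function names are hypothetical.

```python
import csv
import io


def example_input(task_count=2, rows_per_task=10):
    """Yield rows shaped like the ones in the run log: (file, hostname, col0, col1)."""
    for task_index in range(task_count):
        for i in range(rows_per_task):
            # hostname is None because the log shows an empty hostname column.
            yield (f"file{task_index + 1}", None, i, 10.0)


def to_stdout_lines(rows):
    """Format rows as comma-separated lines, writing None as an empty field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue().splitlines()


if __name__ == "__main__":
    for line in to_stdout_lines(example_input()):
        print(line)
```

The sketch emits the file1 rows before the file2 rows. The real run processes the two tasks on separate threads, so the blocks can appear in a different order, as they do in the log.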
Running the sample
Preview
% java -jar embulk.jar -b my_bundle preview sample1.yml
2015-02-20 08:37:43,896 +0900: Embulk v0.4.4
Example input started.
Example input thread 0...
+-------------+-----------------+-----------+-------------+
| file:string | hostname:string | col0:long | col1:double |
+-------------+-----------------+-----------+-------------+
|       file1 |                 |         0 |        10.0 |
|       file1 |                 |         1 |        10.0 |
|       file1 |                 |         2 |        10.0 |
|       file1 |                 |         3 |        10.0 |
|       file1 |                 |         4 |        10.0 |
|       file1 |                 |         5 |        10.0 |
|       file1 |                 |         6 |        10.0 |
|       file1 |                 |         7 |        10.0 |
|       file1 |                 |         8 |        10.0 |
|       file1 |                 |         9 |        10.0 |
+-------------+-----------------+-----------+-------------+
Run
java -jar embulk.jar -b my_bundle run sample1.yml
2015-02-20 08:38:01,772 +0900: Embulk v0.4.4
Example input started.
2015-02-20 08:38:03.195 +0900 [INFO] (transaction): {done: 0 / 2, running: 0}
Example input thread 1...
Example input thread 0...
file2,,0,10.0
file2,,1,10.0
file2,,2,10.0
file2,,3,10.0
file2,,4,10.0
file2,,5,10.0
file2,,6,10.0
file2,,7,10.0
file2,,8,10.0
file2,,9,10.0
file1,,0,10.0
file1,,1,10.0
file1,,2,10.0
file1,,3,10.0
file1,,4,10.0
file1,,5,10.0
file1,,6,10.0
file1,,7,10.0
file1,,8,10.0
file1,,9,10.0
2015-02-20 08:38:03.345 +0900 [INFO] (transaction): {done: 2 / 2, running: 0}
2015-02-20 08:38:03.345 +0900 [INFO] (transaction): {done: 2 / 2, running: 0}
Example input finished. Commit reports = [{},{}]
2015-02-20 08:38:03.413 +0900 [INFO] (main): Committed.
2015-02-20 08:38:03.414 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}
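Sample run 2
- Input: example plugin
- Filter: example plugin (adds a column)
- Output: example plugin (JSON output)
Configuration file
In the example below, the example filter plugin adds a string column to the data generated by the example input plugin and sets every row of that column to value. The configuration passes filter_column as filter_key, but note that in the preview and run output shown later the added column appears under the name filter_key. The example output plugin then prints each record as JSON.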
exec: {}
in:
  type: example
filters:
  - type: example
    filter_key: "filter_column"
    value: "*********"
out: {type: example}
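The effect of this pipeline on a single record can be imitated in plain Python. This is a sketch, not the implementation of the sample plugins: add_column and to_json_line are hypothetical names, and the added column is called filter_key here only because that is the name that appears in the preview and run output in this article.

```python
import json


def add_column(record, key, value):
    """Return a copy of record with one extra column appended (the filter step)."""
    out = dict(record)
    out[key] = value
    return out


def to_json_line(record):
    """Serialize a record the way the run log prints it: 'record: ' plus compact JSON."""
    return "record: " + json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    # One row as produced by the example input (values taken from the log output).
    row = {"file": "file1", "hostname": None, "col0": 0, "col1": 10.0}
    print(to_json_line(add_column(row, "filter_key", "*********")))
```

Copying the record instead of mutating it keeps the input row unchanged, which makes the filter step easy to test in isolation.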
Preview
Versions before 0.4.4 previewed only the input stage. Starting with 0.4.4, the preview also covers filters (GitHub#55).
[arch@bear02 ~]$ java -jar embulk.jar -b my_bundle preview sample2.yml
2015-02-20 09:52:30,015 +0900: Embulk v0.4.4
Example input started.
Example filter started.
Example input thread 0...
+-------------+-----------------+-----------+-------------+-------------------+
| file:string | hostname:string | col0:long | col1:double | filter_key:string |
+-------------+-----------------+-----------+-------------+-------------------+
|       file1 |                 |         0 |        10.0 |         ********* |
|       file1 |                 |         1 |        10.0 |         ********* |
|       file1 |                 |         2 |        10.0 |         ********* |
|       file1 |                 |         3 |        10.0 |         ********* |
|       file1 |                 |         4 |        10.0 |         ********* |
|       file1 |                 |         5 |        10.0 |         ********* |
|       file1 |                 |         6 |        10.0 |         ********* |
|       file1 |                 |         7 |        10.0 |         ********* |
|       file1 |                 |         8 |        10.0 |         ********* |
|       file1 |                 |         9 |        10.0 |         ********* |
+-------------+-----------------+-----------+-------------+-------------------+
Run
java -jar embulk.jar -b my_bundle run sample2.yml
2015-02-20 09:53:41,050 +0900: Embulk v0.4.4
Example input started.
Example filter started.
Example output started.
2015-02-20 09:53:42.443 +0900 [INFO] (transaction): {done: 0 / 2, running: 0}
Example output thread 1...
Example output thread 0...
Example input thread 1...
Example input thread 0...
record: {"file":"file1","hostname":null,"col0":0,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":1,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":2,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":3,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":4,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":5,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":6,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":7,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":8,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":9,"col1":10.0,"filter_key":"*********"}
2015-02-20 09:53:42.780 +0900 [INFO] (transaction): {done: 1 / 2, running: 1}
record: {"file":"file2","hostname":null,"col0":0,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":1,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":2,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":3,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":4,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":5,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":6,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":7,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":8,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":9,"col1":10.0,"filter_key":"*********"}
2015-02-20 09:53:42.785 +0900 [INFO] (transaction): {done: 2 / 2, running: 0}
Example output finished. Commit reports = [{"records":10},{"records":10}]
Example filter finished.
Example input finished. Commit reports = [{},{}]
2015-02-20 09:53:42.831 +0900 [INFO] (main): Committed.
2015-02-20 09:53:42.832 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}