Running the Embulk sample plugins


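When developing a plugin for Embulk, start by running the sample plugins.

Note: this article no longer matches current versions of Embulk. To develop a plugin, use the embulk new command instead.

Preparation

Use the bundle command to create the directory in which plugins will be developed. In this example the directory is my_bundle.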

java -jar embulk.jar bundle my_bundle
2015-02-20 08:37:04,854 +0900: Embulk v0.4.4
Initializing my_bundle...
  Creating my_bundle/.bundle/config
  Creating my_bundle/embulk/input/example.rb
  Creating my_bundle/embulk/output/example.rb
  Creating my_bundle/embulk/filter/example.rb
  Creating my_bundle/Gemfile
Fetching: bundler-1.8.2.gem (100%)
Successfully installed bundler-1.8.2
1 gem installed
The Gemfile specifies no dependencies
Resolving dependencies...
Bundle complete! 0 Gemfile dependencies, 1 gem now installed.
Bundled gems are installed into ..

The generated directory contains sample plugins, located under my_bundle/embulk.

tree my_bundle/embulk 
my_bundle/embulk
|-- filter
|   `-- example.rb
|-- input
|   `-- example.rb
`-- output
    `-- example.rb

3 directories, 3 files
  • filter/example.rb : sample filter plugin
  • input/example.rb : sample input plugin
  • output/example.rb : sample output plugin
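
As a quick check of this layout, the following Ruby sketch lists the plugin files in a bundle directory, grouped by plugin kind. The helper name list_bundle_plugins is hypothetical and not part of Embulk; it only assumes the embulk/<kind>/<name>.rb layout shown in the tree output above.

```ruby
# Hypothetical helper (not an Embulk API): list plugin names found under
# <bundle_dir>/embulk/<kind>/<name>.rb, grouped by kind.
def list_bundle_plugins(bundle_dir)
  paths = Dir.glob(File.join(bundle_dir, "embulk", "*", "*.rb")).sort
  paths.group_by { |path| File.basename(File.dirname(path)) }
       .transform_values { |group| group.map { |path| File.basename(path, ".rb") } }
end

if __FILE__ == $PROGRAM_NAME
  # "my_bundle" is the directory name used in this article.
  list_bundle_plugins("my_bundle").each do |kind, names|
    puts "#{kind}: #{names.join(', ')}"
  end
end
```

Run from the directory that contains my_bundle, this should print one line per plugin kind (filter, input, output).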

The two configuration files below are used to run these plugins.

Sample run 1

  • Input: example plugin
  • Filter: none
  • Output: stdout

Configuration file (sample1.yml):
exec: {}
in:
  type: example
out: {type: stdout}
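
To make the structure of this configuration explicit, here is a minimal Ruby sketch that parses the same YAML with the standard yaml library and reports which plugin type each stage names. The helper plugin_types is hypothetical, and this is not how Embulk itself loads its configuration.

```ruby
require "yaml"

# The sample configuration from this article, inlined so the sketch is self-contained.
SAMPLE1 = <<~YAML
  exec: {}
  in:
    type: example
  out: {type: stdout}
YAML

# Hypothetical helper (not an Embulk API): plugin type named by each stage.
# A missing "filters" key is treated as an empty filter list.
def plugin_types(config)
  {
    "in" => config.dig("in", "type"),
    "filters" => Array(config["filters"]).map { |filter| filter["type"] },
    "out" => config.dig("out", "type"),
  }
end

p plugin_types(YAML.safe_load(SAMPLE1))
```

The result names example for the input, an empty filter list, and stdout for the output, matching the three bullets above.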

Running the sample

Preview

% java -jar embulk.jar -b my_bundle preview sample1.yml
2015-02-20 08:37:43,896 +0900: Embulk v0.4.4
Example input started.
Example input thread 0...
+-------------+-----------------+-----------+-------------+
| file:string | hostname:string | col0:long | col1:double |
+-------------+-----------------+-----------+-------------+
|       file1 |                 |         0 |        10.0 |
|       file1 |                 |         1 |        10.0 |
|       file1 |                 |         2 |        10.0 |
|       file1 |                 |         3 |        10.0 |
|       file1 |                 |         4 |        10.0 |
|       file1 |                 |         5 |        10.0 |
|       file1 |                 |         6 |        10.0 |
|       file1 |                 |         7 |        10.0 |
|       file1 |                 |         8 |        10.0 |
|       file1 |                 |         9 |        10.0 |
+-------------+-----------------+-----------+-------------+

Run

java -jar embulk.jar -b my_bundle run sample1.yml
2015-02-20 08:38:01,772 +0900: Embulk v0.4.4
Example input started.
2015-02-20 08:38:03.195 +0900 [INFO] (transaction): {done:  0 / 2, running: 0}
Example input thread 1...
Example input thread 0...
file2,,0,10.0
file2,,1,10.0
file2,,2,10.0
file2,,3,10.0
file2,,4,10.0
file2,,5,10.0
file2,,6,10.0
file2,,7,10.0
file2,,8,10.0
file2,,9,10.0
file1,,0,10.0
file1,,1,10.0
file1,,2,10.0
file1,,3,10.0
file1,,4,10.0
file1,,5,10.0
file1,,6,10.0
file1,,7,10.0
file1,,8,10.0
file1,,9,10.0
2015-02-20 08:38:03.345 +0900 [INFO] (transaction): {done:  2 / 2, running: 0}
2015-02-20 08:38:03.345 +0900 [INFO] (transaction): {done:  2 / 2, running: 0}
Example input finished. Commit reports = [{},{}]
2015-02-20 08:38:03.413 +0900 [INFO] (main): Committed.
2015-02-20 08:38:03.414 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}
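
The rows in this log can be reproduced with a small Ruby simulation. This is a sketch inferred from the logged output, not the sample plugin's source: the mapping of task index 0 to file1 and index 1 to file2, and the helper names example_rows and to_line, are assumptions made for illustration.

```ruby
# Assumption: task index 0 maps to "file1" and index 1 to "file2",
# inferred from the log above rather than from the plugin source.
FILES = %w[file1 file2].freeze

# Rows one task would emit, as [file, hostname, col0, col1].
# hostname is nil here because the log shows that column empty.
def example_rows(task_index, hostname = nil)
  (0...10).map { |i| [FILES.fetch(task_index), hostname, i, 10.0] }
end

# Comma-joined line in the style the stdout output prints in the log above.
def to_line(row)
  row.join(",")
end

FILES.each_index do |task_index|
  example_rows(task_index).each { |row| puts to_line(row) }
end
```

The simulation prints the tasks sequentially, file1 first. In the actual run the two tasks were logged in the opposite order, so the ordering between tasks should not be relied on.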

Sample run 2

  • Input: example plugin
  • Filter: example plugin (adds a column)
  • Output: example plugin (JSON output)

Configuration file

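In the example below, the filter adds a column to the data generated by the example input plugin. The configuration gives filter_column as the column name and sets every row's value to the string given in value. The output plugin then prints each record as JSON. Note that the preview and run output below show the added column as filter_key, not filter_column.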

exec: {}
in:
  type: example
filters:
  - type: example
    filter_key: "filter_column"
    value: "*********"
out: {type: example}
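
Conceptually, this pipeline appends one constant-valued column to every record and then prints the record as JSON. The Ruby sketch below illustrates that idea only; add_column and format_record are hypothetical stand-ins, not the sample plugins' code, and the column name filter_key is taken from the logged output later in this article.

```ruby
require "json"

# Hypothetical stand-in for the filter: append one constant-valued column.
def add_column(record, key, value)
  record.merge(key => value)
end

# Hypothetical stand-in for the output: format a record in the
# "record: {...}" style that appears in the run log.
def format_record(record)
  "record: #{JSON.generate(record)}"
end

# One input row, with the column names and values shown in the preview table.
record = { "file" => "file1", "hostname" => nil, "col0" => 0, "col1" => 10.0 }
puts format_record(add_column(record, "filter_key", "*********"))
```

Because merge returns a new hash, the input record is left unchanged, which keeps the filter step free of side effects in this sketch.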

Preview

Versions before 0.4.4 previewed only the input stage; from 0.4.4 onward, the preview also applies filters (GitHub#55).

[arch@bear02 ~]$ java -jar embulk.jar -b my_bundle preview sample2.yml
2015-02-20 09:52:30,015 +0900: Embulk v0.4.4
Example input started.
Example filter started.
Example input thread 0...
+-------------+-----------------+-----------+-------------+-------------------+
| file:string | hostname:string | col0:long | col1:double | filter_key:string |
+-------------+-----------------+-----------+-------------+-------------------+
|       file1 |                 |         0 |        10.0 |         ********* |
|       file1 |                 |         1 |        10.0 |         ********* |
|       file1 |                 |         2 |        10.0 |         ********* |
|       file1 |                 |         3 |        10.0 |         ********* |
|       file1 |                 |         4 |        10.0 |         ********* |
|       file1 |                 |         5 |        10.0 |         ********* |
|       file1 |                 |         6 |        10.0 |         ********* |
|       file1 |                 |         7 |        10.0 |         ********* |
|       file1 |                 |         8 |        10.0 |         ********* |
|       file1 |                 |         9 |        10.0 |         ********* |
+-------------+-----------------+-----------+-------------+-------------------+

Run

java -jar embulk.jar -b my_bundle run sample2.yml
2015-02-20 09:53:41,050 +0900: Embulk v0.4.4
Example input started.
Example filter started.
Example output started.
2015-02-20 09:53:42.443 +0900 [INFO] (transaction): {done:  0 / 2, running: 0}
Example output thread 1...
Example output thread 0...
Example input thread 1...
Example input thread 0...
record: {"file":"file1","hostname":null,"col0":0,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":1,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":2,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":3,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":4,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":5,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":6,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":7,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":8,"col1":10.0,"filter_key":"*********"}
record: {"file":"file1","hostname":null,"col0":9,"col1":10.0,"filter_key":"*********"}
2015-02-20 09:53:42.780 +0900 [INFO] (transaction): {done:  1 / 2, running: 1}
record: {"file":"file2","hostname":null,"col0":0,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":1,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":2,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":3,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":4,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":5,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":6,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":7,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":8,"col1":10.0,"filter_key":"*********"}
record: {"file":"file2","hostname":null,"col0":9,"col1":10.0,"filter_key":"*********"}
2015-02-20 09:53:42.785 +0900 [INFO] (transaction): {done:  2 / 2, running: 0}
Example output finished. Commit reports = [{"records":10},{"records":10}]
Example filter finished.
Example input finished. Commit reports = [{},{}]
2015-02-20 09:53:42.831 +0900 [INFO] (main): Committed.
2015-02-20 09:53:42.832 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}

References