Loading CloudTrail Logs into Elasticsearch with Embulk

Last updated at 2015-04-26, posted at 2015-04-24

Introduction

Using Embulk to load CloudTrail logs into Elasticsearch turned out to be convenient, so here is a summary of how to do it.

Preparation

Install Elasticsearch and Kibana beforehand. For the steps, see links such as the following.
- http://www.elasticsearch.org/download/
- http://www.elasticsearch.org/overview/kibana/installation/
Create the mapping in Elasticsearch. If you plan to analyze the data in Kibana, set not_analyzed on fields as appropriate. This walkthrough uses the following mapping.

{
    "mappings": {
        "log": {
            "properties": {
                "eventName": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "eventSource": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "awsRegion": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "sourceIPAddress": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "errorCode": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "requestID": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "eventID": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "userType": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "userArn": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "userName": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "accesskeyid": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "userAgent": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "eventTime": {
                    "type": "date",
                    "format": "dateOptionalTime"
                },
                "errorMessage": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
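
For reference, one way to register this mapping is through Elasticsearch's HTTP API with curl. The sketch below is not part of the original steps. It assumes the mapping is saved as mapping.json (a hypothetical file name), that the HTTP API listens on localhost:9200, and that the index is named cloudtrail to match the Embulk config used later. The helper names create_index and show_mapping are made up for illustration.

```shell
# Assumptions: Elasticsearch HTTP API on localhost:9200 (override via ES_URL),
# index name "cloudtrail" to match `index: cloudtrail` in the Embulk config.
ES_URL="${ES_URL:-http://localhost:9200}"
INDEX="cloudtrail"

# Create the index together with its mapping.
# $1 is the path to the mapping JSON file (e.g. mapping.json).
create_index() {
  curl -s -XPUT "$ES_URL/$INDEX" \
       -H 'Content-Type: application/json' \
       -d @"$1"
}

# Print the mapping currently registered for the index, for a visual check.
show_mapping() {
  curl -s "$ES_URL/$INDEX/_mapping?pretty"
}
```

Calling `create_index mapping.json` and then `show_mapping` should echo the mapping back if the index was created.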

Running Embulk

Install Embulk.

$ curl --create-dirs -o ~/.embulk/bin/embulk -L http://dl.embulk.org/embulk-latest.jar
$ chmod +x ~/.embulk/bin/embulk
$ echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc

Install the S3 input, JSON parser, and Elasticsearch output plugins.

$ embulk gem install embulk-input-s3
$ embulk gem install embulk-output-elasticsearch
$ embulk gem install embulk-parser-json

Write the configuration file.

in:
  type: s3
  bucket: <bucket name>
  path_prefix: <path to cloudtrail log>
  endpoint: <s3 region endpoint>
  access_key_id: <access key>
  secret_access_key: <secret key>
  decoders:
  - {type: gzip}
  parser: 
    type: json
    root: $.Records
    schema:
      - {name: eventName, type: string}
      - {name: eventSource, type: string}
      - {name: awsRegion, type: string}
      - {name: sourceIPAddress, type: string}
      - {name: eventTime, type: string}
      - {name: requestID, type: string}
      - {name: eventID, type: string}
      - {name: userType, path: userIdentity.type, type: string}
      - {name: userArn, path: userIdentity.arn, type: string}
      - {name: userName, path: userIdentity.userName, type: string}
      - {name: accesskeyid, path: userIdentity.accessKeyId, type: string}
      - {name: userAgent, type: string}
      - {name: errorCode, type: string}
      - {name: errorMessage, type: string}
out: 
  type: elasticsearch
  cluster_name: <clustername>
  nodes:
    - {host: "localhost", port: 9300}
  index: cloudtrail
  index_type: log
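
To get a feel for what the parser section picks out of each log file, the rough sketch below flattens one gzipped CloudTrail file with jq. It is independent of Embulk and only approximates my reading of the config: `root: $.Records` iterates over the Records array, and each `path` reaches into the nested userIdentity object. It assumes jq and gzip are installed; flatten_records is a hypothetical helper name, and only a subset of the schema columns is shown.

```shell
# Flatten one gzipped CloudTrail log file into one JSON object per record,
# roughly mirroring the columns declared in the parser schema.
# $1 is the path to a CloudTrail log file (*.json.gz).
flatten_records() {
  gzip -dc "$1" | jq -c '
    .Records[]
    | {
        eventName,
        eventSource,
        eventTime,
        userType:    .userIdentity.type,
        userArn:     .userIdentity.arn,
        userName:    .userIdentity.userName,
        accesskeyid: .userIdentity.accessKeyId
      }'
}
```

Running `flatten_records <downloaded log file>` on a file copied from the bucket prints one line per event. Fields missing from a record come out as null.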

The preview command lets you do a dry run of the input side.