
Notes on loading Common Log Format date logs into Elasticsearch as a date type with Embulk

Posted at 2017-02-07

Note: this article has been moved to my personal blog.

This gave me some trouble, so here are my notes.

#Log to load

log.json
{"time_local":"05/Feb/2017:03:38:34 +0900","hoge":"piyo"}

The time_local value uses the timestamp format you often see in Nginx and Apache logs.
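For testing the pipeline, a line in this format can be generated with a few lines of Python. This is only a sketch: the helper name `make_log_line` and the fixed +09:00 offset are assumptions made for illustration, not part of the original setup.

```python
import json
from datetime import datetime, timedelta, timezone

# Fixed +09:00 offset, matching the sample log line (assumed for illustration).
JST = timezone(timedelta(hours=9))


def make_log_line(moment: datetime, hoge: str) -> str:
    """Build one JSON Lines record with a Common Log Format timestamp."""
    record = {
        # Day/abbreviated month/year:time, followed by the numeric UTC offset.
        "time_local": moment.strftime("%d/%b/%Y:%H:%M:%S %z"),
        "hoge": hoge,
    }
    # Compact separators keep the output on one line without extra spaces.
    return json.dumps(record, separators=(",", ":"))


print(make_log_line(datetime(2017, 2, 5, 3, 38, 34, tzinfo=JST), "piyo"))
# {"time_local":"05/Feb/2017:03:38:34 +0900","hoge":"piyo"}
```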

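#Embulk configuration

Timestamps are easier to work with in Elasticsearch when they are stored in UTC, so read the field as a timestamp type and let Embulk convert it back to UTC.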

config.yml
in:
  type: file
  path_prefix: /path/to/json/
  parser:
    type: jsonl
    charset: UTF-8
    newline: LF
    columns:
      - { name: "time_local", type: "timestamp", format: "%d/%b/%Y:%T %z" }
      - { name: "hoge", type: "string" }
out:
  type: elasticsearch_ruby
  cluster_name: elasticsearch
  mode: normal
  nodes:
    - {host: "es_host", port: 9200}
  index: jsonlog
  index_type: log

When the log is read with this Embulk definition, the time_local value is output like this:

2017-02-04 18:38:34 UTC
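The same conversion can be reproduced outside Embulk to sanity-check the expected value. The following is a minimal Python sketch, not Embulk's own implementation; the function name `clf_to_utc_string` is hypothetical, and `%T` is written out as `%H:%M:%S` because Python's `strptime` does not accept `%T`.

```python
from datetime import datetime, timezone

# Python spelling of the Embulk column format "%d/%b/%Y:%T %z".
CLF_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def clf_to_utc_string(value: str) -> str:
    """Parse a Common Log Format timestamp and render it in UTC."""
    local = datetime.strptime(value, CLF_FORMAT)  # offset-aware datetime
    utc = local.astimezone(timezone.utc)          # shift to UTC
    return utc.strftime("%Y-%m-%d %H:%M:%S UTC")


print(clf_to_utc_string("05/Feb/2017:03:38:34 +0900"))
# 2017-02-04 18:38:34 UTC
```

03:38:34 at +0900 is nine hours ahead of UTC, so the date rolls back to the previous day.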

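#Elasticsearch index template

Converting to UTC in Embulk is fine, but if you load the data as-is, Elasticsearch does not accept the value as a date type.
Unless you define the format in a mapping before loading, the documents are registered with the field as a string.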

template.json
{
  "template": "jsonlog",
  "mappings": {
    "log": {
      "properties": {
        "time_local": {
          "type": "date",
          "format": "YYYY-MM-dd HH:mm:ss' UTC'"
        },
        "hoge": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

$ curl -XPUT 'es_host:9200/_template/jsonlog?pretty' -d "$(cat template.json)"
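In the mapping format, the quoted `' UTC'` is literal text that must appear after the seconds. It is not a time zone field. As a rough Python analogue (an illustration only, not how Elasticsearch parses dates internally; `parse_embulk_timestamp` is a hypothetical name), the Embulk output can be parsed like this:

```python
from datetime import datetime, timezone

# Analogue of "YYYY-MM-dd HH:mm:ss' UTC'": the trailing " UTC" is matched
# as literal characters rather than interpreted as a zone.
ES_LIKE_FORMAT = "%Y-%m-%d %H:%M:%S UTC"


def parse_embulk_timestamp(value: str) -> datetime:
    """Parse a value such as '2017-02-04 18:38:34 UTC' into an aware datetime."""
    naive = datetime.strptime(value, ES_LIKE_FORMAT)
    # The literal suffix carries no offset, so attach UTC explicitly.
    return naive.replace(tzinfo=timezone.utc)


print(parse_embulk_timestamp("2017-02-04 18:38:34 UTC").isoformat())
# 2017-02-04T18:38:34+00:00
```

A value without the ` UTC` suffix fails to parse with this format, which mirrors why the mapping has to spell the suffix out.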

After that, bulk-import with Embulk using the configuration above, and the logs become meaningful time-series data.
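Once time_local is mapped as a date, range queries over it become possible. Below is a minimal Python sketch that only builds such a query body and does not send it. The helper name `build_range_query` is hypothetical, and the bounds are written in the mapping's own format on the assumption that Elasticsearch parses range values with the field's format when no other format is given.

```python
import json


def build_range_query(start: str, end: str) -> dict:
    """Build a search body selecting documents with start <= time_local < end."""
    return {
        "query": {
            "range": {
                "time_local": {"gte": start, "lt": end}
            }
        }
    }


# Everything logged on 2017-02-04 (UTC), which includes the sample record.
body = build_range_query("2017-02-04 00:00:00 UTC", "2017-02-05 00:00:00 UTC")
print(json.dumps(body, indent=2))
```

The printed JSON could then be posted to the `_search` endpoint of the jsonlog index.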
