More than 3 years have passed since last update.

5rep - Embulk: Reading CSV and TSV Files

A general-purpose way to read files

  • Describe the input format and the parse format in guess.yml
  • The example below reads a TSV file
  • Once the file is read, the records are written to stdout
guess.yml
in:
  type: file
  path_prefix: ./{File Name}
  delimiter: "\t"
  quote: '"'
  escape: '\'
  null_string: ''
out:
  type: stdout
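
For reference, Embulk's file input normally takes CSV parser options nested under a parser: key rather than directly under in:. The following is a minimal sketch of the same seed in that layout, assuming the goal is to pass these values to guess as hints. It has not been run against the original data, and how guess treats each hint may depend on the Embulk version.

```yaml
in:
  type: file
  path_prefix: ./{File Name}
  parser:
    type: csv           # assumption: the built-in CSV parser, which also handles TSV
    delimiter: "\t"     # tab-separated input
    quote: '"'
    escape: '\'
    null_string: ''     # treat empty fields as NULL
out:
  type: stdout
```

The generated config.yml shown in this article has escape: '"' rather than the '\' written in the seed, which may be a sign that the top-level options were not picked up as parser settings.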

Generating the load config file with the guess command

$ embulk guess ./guess.yml -o config.yml
  • The generated file is shown below
  • Reading a file requires specifying column names, column types, and so on, but the guess command generates sensible values for you
config.yml
in:
  type: file
  path_prefix: ./{File Name}
  file_ext: .tsv
  parser:
    charset: UTF-8
    newline: LF
    type: csv
    delimiter: "\t"
    quote: '"'
    escape: '"'
    trim_if_not_quoted: false
    skip_header_lines: 0
    allow_extra_columns: false
    allow_optional_columns: false
    columns:                                      # sample values
    - {name: c0, type: long}
    - {name: c1, type: long}
    - {name: c2, type: string}
    - {name: c3, type: timestamp, format: '%Y-%m-%d %H:%M:%S%z'}
    - {name: c4, type: string}
    - {name: c5, type: string}
    - {name: c6, type: string}
    - {name: c7, type: string}
    - {name: c8, type: long}
    - {name: c9, type: timestamp, format: '%Y-%m-%d %H:%M:%S.%N%z'}
out:
  type: stdout
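
The generated column names (c0, c1, ...) are placeholders, and skip_header_lines: 0 means no header row is skipped. If the real file does have a header row, the generated file can be edited by hand. The fragment below is only a sketch of such an edit: the column names are hypothetical, the types and formats are copied from the generated output above, and the list is shortened for illustration. With allow_extra_columns: false, the real list still has to cover every column in the file.

```yaml
in:
  type: file
  path_prefix: ./{File Name}
  parser:
    type: csv
    delimiter: "\t"
    skip_header_lines: 1        # assumption: the first line of the file is a header
    columns:
    # hypothetical names; types and formats kept as guess inferred them
    - {name: id, type: long}
    - {name: user_id, type: long}
    - {name: title, type: string}
    - {name: created_at, type: timestamp, format: '%Y-%m-%d %H:%M:%S%z'}
    # remaining columns stay as generated
out:
  type: stdout
```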