
Converting TSV to JSON, one record per line (revised)

Posted at 2023-05-18

I wrote this article a while back:
https://qiita.com/arc279/items/2515272a7050c0a13cbf

I've since found a somewhat nicer way to do it, so here it is.

Code

tsv2jsonl.jq
#!/usr/bin/env jq -n -R -f

input|split("\t") as $header
| inputs
| [$header, split("\t")]
| transpose
| map({(.[0]): .[1]})
| add
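
To see what the last three stages of the filter do, it may help to run them one at a time on a single header/row pair. The following is only a sketch: the two-column header and row are made-up sample values, and `--argjson` is used here just to feed them in without needing any input.

```shell
# Hypothetical two-column header and data row, for illustration only.
pair='[["c1","c2"],["0001","0002"]]'

# transpose pairs each header name with the value in the same column.
step1=$(jq -n -c --argjson pair "$pair" '$pair | transpose')
echo "$step1"

# map turns each [key, value] pair into a single-key object.
step2=$(jq -n -c --argjson pair "$pair" '$pair | transpose | map({(.[0]): .[1]})')
echo "$step2"

# add merges those single-key objects into one object per row.
step3=$(jq -n -c --argjson pair "$pair" '$pair | transpose | map({(.[0]): .[1]}) | add')
echo "$step3"
```

The three lines printed should be `[["c1","0001"],["c2","0002"]]`, then `[{"c1":"0001"},{"c2":"0002"}]`, then `{"c1":"0001","c2":"0002"}`.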

Then make it executable:

bash
$ chmod +x tsv2jsonl.jq

That's it.
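
One caveat: on some systems, Linux among them, everything after the interpreter path in a shebang is passed as a single argument, so a shebang with several flags may not work as written (`env -S` is one possible workaround where it is available). A sketch of a more portable alternative is to pass the flags to jq explicitly. The temporary directory and the sample TSV below are assumptions made for illustration.

```shell
# Hypothetical working directory and sample data, for illustration only.
workdir=$(mktemp -d)

# Same filter as tsv2jsonl.jq, minus the shebang line.
cat > "$workdir/tsv2jsonl.jq" <<'EOF'
input | split("\t") as $header
| inputs
| [$header, split("\t")]
| transpose
| map({(.[0]): .[1]})
| add
EOF

# A tiny TSV file: one header line and two data rows.
printf 'id\tname\n1\talice\n2\tbob\n' > "$workdir/sample.tsv"

# -n: do not read input automatically, -R: treat each line as a raw string,
# -c: compact output, -f: read the filter from a file.
out=$(jq -n -R -c -f "$workdir/tsv2jsonl.jq" "$workdir/sample.tsv")
echo "$out"

rm -r "$workdir"
```

This should print `{"id":"1","name":"alice"}` and `{"id":"2","name":"bob"}`, one per line.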

Usage example

bash
$ { seq -f 'c%g' 5; seq -f '%04g' inf; } | paste - - - - - | ./tsv2jsonl.jq -c | head -n 10
{"c1":"0001","c2":"0002","c3":"0003","c4":"0004","c5":"0005"}
{"c1":"0006","c2":"0007","c3":"0008","c4":"0009","c5":"0010"}
{"c1":"0011","c2":"0012","c3":"0013","c4":"0014","c5":"0015"}
{"c1":"0016","c2":"0017","c3":"0018","c4":"0019","c5":"0020"}
{"c1":"0021","c2":"0022","c3":"0023","c4":"0024","c5":"0025"}
{"c1":"0026","c2":"0027","c3":"0028","c4":"0029","c5":"0030"}
{"c1":"0031","c2":"0032","c3":"0033","c4":"0034","c5":"0035"}
{"c1":"0036","c2":"0037","c3":"0038","c4":"0039","c5":"0040"}
{"c1":"0041","c2":"0042","c3":"0043","c4":"0044","c5":"0045"}
{"c1":"0046","c2":"0047","c3":"0048","c4":"0049","c5":"0050"}

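As the `seq -f '%04g' inf;` part shows, the input can even be endless: rows are processed one line at a time rather than loaded all at once, so it is easy on memory.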

Key points

With the -n option, jq stops reading from standard input on its own (the filter runs once, as if it had been given null).

bash
$ jq -n .
null

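From there,

  • input: reads one line from standard input yourself (a bit like bash's read)
  • inputs: reads everything that remains on standard input, emitting it one line at a time (a bit like cat)

are used to process the input line by line. That's the whole idea.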
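
The difference between the two can be sketched with three made-up lines on standard input: `input` consumes only the first line, and `inputs` yields whatever is left.

```shell
# Three hypothetical raw lines on stdin, for illustration only.
# input takes the first line; [inputs] collects the remaining lines into an array.
out=$(printf 'header\nrow1\nrow2\n' \
  | jq -n -R -c 'input as $first | [inputs] as $rest | {first: $first, rest: $rest}')
echo "$out"
```

This should print `{"first":"header","rest":["row1","row2"]}`. Collecting `inputs` into an array is done here only to make the split visible; the script above deliberately avoids it so that rows can be emitted as they arrive.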

cf. https://stedolan.github.io/jq/manual/#IO
