
Computing histogram values (frequency counts) from CSV data with the q command

AWK
CSV
Histogram
q command

Last updated at 2015-11-21, posted at 2015-06-15

Hello.
I tried computing histogram data (frequency counts) from CSV data using the q command.
Here is an example that bins the values in the first column in steps of 0.5:

$ cat data.csv | q "select round(c1/0.5-0.5)*0.5, count(round(c1/0.5-0.5)) from - group by round(c1/0.5-0.5)" | awk '{r="";i=$2;while(i-->0)r=r"#";printf "%s %2d %s\n",$1,$2,r;}'
1.0  4 #### 
1.5  8 ######## 
2.0 12 ############ 
2.5  5 ##### 
3.0  1 #

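For example, 4 values fall in the range from 1.0 (inclusive) to 1.5 (exclusive).
The awk command appends the bar (a run of # characters) to each line.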
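
As a cross-check of the binning, here is a minimal sketch that produces the same kind of output with awk alone, without q. The function name `histogram`, its two arguments, and the comma field separator are assumptions made for illustration and are not part of the original one-liner. It takes the bin index as floor(x / w), which for positive values should select the same bin as `round(c1/0.5-0.5)` in the query; it assumes positive numeric values in the first column.

```shell
# Hypothetical helper (not from the original post):
#   histogram FILE [WIDTH]
# Counts the first column of a comma-separated FILE in bins of WIDTH
# (default 0.5) and prints "lower_edge count bar", one bin per line.
# Assumes positive numeric values, where int(x / w) equals floor(x / w).
histogram() {
  awk -F, -v w="${2:-0.5}" '
    NF { count[int($1 / w)]++ }          # skip empty lines; key = bin index
    END {
      for (b in count) {
        bar = ""
        for (i = 0; i < count[b]; i++) bar = bar "#"
        printf "%.1f %2d %s\n", b * w, count[b], bar
      }
    }
  ' "$1" | sort -n                       # awk array order is unspecified
}
```

Calling `histogram data.csv 0.5` would then print one line per non-empty bin, sorted by the lower bin edge. The bin width is a parameter here only for convenience; widths that are not exactly representable in binary floating point (such as 0.1) may put values that sit exactly on a bin edge into the neighbouring bin.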
