Ranking key-value data with shell commands

Posted at 2014-05-09

I'm not particularly fond of rankings, but it looked doable with sort, uniq, and awk, so I gave it a try.

Source file

scores.txt
user_1 2
user_2 4
user_3 3
user_4 5
user_5 5
user_6 10
user_7 10
user_8 2
user_9 5
user_10 9
user_11 1
user_12 10
user_13 5
user_14 3
user_15 10
user_16 3
user_17 8
user_18 8
user_19 6
user_20 8

Sort by value in descending order

bash
cat scores.txt | sort -n -k2 -r
output
user_7 10
user_6 10
user_15 10
user_12 10
user_10 9
user_20 8
user_18 8
user_17 8
user_19 6
user_9 5
user_5 5
user_4 5
user_13 5
user_2 4
user_3 3
user_16 3
user_14 3
user_8 2
user_1 2
user_11 1
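
The order of users with the same score in the output above is simply whatever sort's fallback comparison produces. As a minimal sketch, assuming you want a deterministic tie order, the key can be limited to field 2 with `-k2,2nr` and a second key added for the user name. The sample rows below are made up for illustration and are not taken from scores.txt.

```shell
# Made-up sample in the same "user score" format as scores.txt.
sample=$(printf '%s\n' 'user_a 3' 'user_b 10' 'user_c 3' 'user_d 7')

# -k2,2nr : compare field 2 only, numerically, in descending order
# -k1,1   : break ties by user name, ascending
# LC_ALL=C keeps the name comparison independent of the locale.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -k2,2nr -k1,1)
printf '%s\n' "$sorted"
```

With this sample, user_b comes first, then user_d, then user_a and user_c in name order.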

Assigning ranks

bash
echo -e "rank\tscore\tentries with the same score"
echo "---------------------------------"
cat scores.txt | sort -n -k2 -r | uniq -c --skip-fields=1 | awk 'BEGIN {rank=1} {print rank "\t" $3 "\t" $1; rank += $1}'
output
rank	score	entries with the same score
---------------------------------
1	10	4
5	9	1
6	8	3
9	6	1
10	5	4
14	4	1
15	3	3
18	2	2
20	1	1
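
The pipeline above prints one line per distinct score. If you want one line per user with the same kind of rank (tied users share a rank, and the next rank skips ahead), a rough sketch with awk alone could look like this. The function name `rank_users` and the sample rows are assumptions made for illustration.

```shell
# Hypothetical helper: read "user score" lines on stdin and
# print "rank<TAB>user<TAB>score" for every user.
rank_users() {
  LC_ALL=C sort -k2,2nr -k1,1 |
    awk 'NR == 1 || $2 != prev { rank = NR }   # new score: rank is the line number
         { prev = $2; print rank "\t" $1 "\t" $2 }'
}

# Made-up sample data.
ranked=$(printf '%s\n' 'user_a 3' 'user_b 10' 'user_c 3' 'user_d 7' 'user_e 10' | rank_users)
printf '%s\n' "$ranked"
```

With this sample, user_b and user_e share rank 1, user_d gets rank 3, and user_a and user_c share rank 4.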

Use cases

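For example, you can pipe in data dumped from a database.
With around 10,000 rows it should probably finish in an instant.

On my MacBook Air, about 500,000 rows took roughly 5 seconds.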

bash
$ echo "select user_id, some_value from some_scores" | mysql -N > output
$ wc output
  560064 1120128 7120489 output
$ time cat output | sort -n -k2 -r | uniq -c --skip-fields=1 | awk 'BEGIN {rank=1} {print rank "\t" $3 "\t" $1; rank += $1 }'
1       15931976        1
2       12639996        1
3       12243584        1
4       10977612        1
5       10884615        1
6       10373565        1
7       10055521        1
8       10042108        1
9       9980887 1
10      9505452 1

----

real	0m5.271s
user	0m3.688s
sys	0m0.094s
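
If you have no database at hand, a similar measurement can be approximated with synthetic data. The sketch below is only an illustration: the helper `gen_scores`, the score range, and the fixed random seed are assumptions, and `-f 1` is the short spelling of `--skip-fields=1`.

```shell
# Hypothetical helper: emit N lines of "user_<i> <random score>".
gen_scores() {
  awk -v n="$1" 'BEGIN {
    srand(1)                                  # fixed seed so runs are repeatable
    for (i = 1; i <= n; i++) print "user_" i, int(rand() * 1000000)
  }'
}

# Small N for a quick check; raise it (for example to 500000) and
# prefix the pipeline with `time` in bash to measure the run time.
gen_scores 1000 | sort -n -k2 -r | uniq -c -f 1 |
  awk 'BEGIN { rank = 1 } { print rank "\t" $3 "\t" $1; rank += $1 }' | tail -n 3
```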

By the way, I actually ran this with gsort, guniq, and gawk, since I'm on a Mac as usual.
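
As an untested-on-macOS sketch, the same ranking can also be written using only options specified by POSIX, on the assumption that the long option `--skip-fields=1` is the only GNU-specific part of the pipeline. POSIX uniq spells that option `-f 1`. The function name `rank_scores` and the sample rows are made up for illustration.

```shell
# Hypothetical helper: read "user score" lines on stdin and print
# "rank<TAB>score<TAB>count" using only POSIX options.
rank_scores() {
  sort -k2,2nr |
    uniq -c -f 1 |    # -f 1: skip the user field when comparing lines
    awk 'BEGIN { rank = 1 } { print rank "\t" $3 "\t" $1; rank += $1 }'
}

# Made-up sample data.
result=$(printf '%s\n' 'user_a 3' 'user_b 10' 'user_c 3' 'user_d 7' 'user_e 10' | rank_scores)
printf '%s\n' "$result"
```

With this sample, score 10 gets rank 1 with two entries, score 7 gets rank 3 with one entry, and score 3 gets rank 4 with two entries.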
