
Confirming that bash's read issues one system call per byte of input data

This is a slightly older article, but

https://qiita.com/8x9/items/69c4968d3cd1a2eec13f

a comment on it caught my attention, so I checked.

% wc data.txt

10000 10000 210000 data.txt

% head data.txt
uRkbV6,i2GQ9N,uYd8dP
WqxpAE,QR20EU,ldwZKK
cpEEO0,Ac2RgT,EhIU2M
ph7r8l,udlcEM,kVCkHP
njvAFy,fHWIsj,eA5lOn
i5Dd9E,U2aeHD,7y7x6T
rZZGSI,BB8Kvy,UQSOR1
2TMU0i,ENjgK5,k1893E
i5I6pv,k1ootn,B0UxWk
we3zXI,fk8p35,GdZt3d
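The post does not show how data.txt was created. For anyone who wants to reproduce the measurement, here is a minimal sketch that generates lines of the same shape (three comma-separated fields of six alphanumeric characters each, 21 bytes per line including the newline). The function name `gen_data` and the use of awk's `rand()` are my own assumptions, not the original author's method.

```shell
#!/bin/bash
# Hypothetical generator; the original data.txt was produced some other, unknown way.
# Prints N lines, each with three comma-separated fields of 6 random alphanumerics.
gen_data() {
  awk -v n="${1:-10000}" 'BEGIN {
    srand()
    chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    for (i = 0; i < n; i++) {
      line = ""
      for (f = 0; f < 3; f++) {
        # pick 6 characters at random from the 62-character alphabet
        for (k = 0; k < 6; k++) line = line substr(chars, int(rand() * 62) + 1, 1)
        if (f < 2) line = line ","
      }
      print line
    }
  }'
}

gen_data 3   # print three sample lines
```

Running `gen_data 10000 > data.txt` should give a file of 10000 lines and 210000 bytes, matching the `wc` output above.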

% cat bash.sh

#!/bin/bash

total=0
while IFS=, read -r x target x; do
  target="${target//[!aeiou]/}"
  let "total += ${#target}"
done
echo "$total"
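To make the loop body easier to follow, here is a small sketch of the counting trick in isolation: delete every character that is not a lowercase vowel, then take the length of what is left. `count_vowels` is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/bash
# count_vowels is a hypothetical helper wrapping the two steps of the loop body in bash.sh.
count_vowels() {
  local s=$1
  s="${s//[!aeiou]/}"   # remove every character that is not one of a, e, i, o, u
  echo "${#s}"          # the remaining length is the number of lowercase vowels
}

count_vowels "U2aeHD"   # prints 2
count_vowels "QR20EU"   # prints 0 (uppercase E and U are not in the set)
```

Everything here is a shell builtin, so the loop itself spawns no processes; the cost measured below comes from how `read` fetches its input.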

% cat by_cut_tr_wc.sh

#!/bin/bash

cut -d, -f2 |tr -dc 'aeiou' |wc -c
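The pipeline can be tried on a single line to see what each stage contributes. This is only a sketch; `count_field2_vowels` is a hypothetical wrapper name around the same three commands.

```shell
#!/bin/bash
# Hypothetical wrapper around the same pipeline as by_cut_tr_wc.sh.
count_field2_vowels() {
  cut -d, -f2 |        # keep only the second comma-separated field
    tr -dc 'aeiou' |   # delete everything except lowercase vowels (newlines included)
    wc -c              # count the bytes that remain
}

# One sample line from data.txt; the second field U2aeHD contains two lowercase vowels.
printf '%s\n' 'i5Dd9E,U2aeHD,7y7x6T' | count_field2_vowels   # prints 2 (some wc implementations pad with spaces)
```

Because `tr -dc` also deletes the newlines, `wc -c` ends up counting vowels only, which is why the result agrees with bash.sh.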

% cat data.txt| strace -c -o strace1.txt ./bash.sh

4866

% cat data.txt| strace -c -o strace2.txt ./by_cut_tr_wc.sh
4866

% cat strace1.txt

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.82    0.059388           0    210007           read
  4.34    0.002808           0     10004     10003 ioctl
  3.83    0.002480           0     10004     10001 lseek
  0.00    0.000000           0         1           write
  0.00    0.000000           0         8           open
  0.00    0.000000           0         8           close
  0.00    0.000000           0         8           stat
  0.00    0.000000           0         7           fstat
  0.00    0.000000           0        12           mmap
  0.00    0.000000           0         8           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0        23           brk
  0.00    0.000000           0        14           rt_sigaction
  0.00    0.000000           0         5           rt_sigprocmask
  0.00    0.000000           0         5         5 access
  0.00    0.000000           0         1           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         3         1 fcntl
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         1           sysinfo
  0.00    0.000000           0         1           getuid
  0.00    0.000000           0         1           getgid
  0.00    0.000000           0         1           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.064676                230133     20010 total

% cat strace2.txt

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         6           read
  0.00    0.000000           0         8           open
  0.00    0.000000           0        16         4 close
  0.00    0.000000           0         8           stat
  0.00    0.000000           0         7           fstat
  0.00    0.000000           0         3           lseek
  0.00    0.000000           0        12           mmap
  0.00    0.000000           0         8           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0        20           brk
  0.00    0.000000           0        16           rt_sigaction
  0.00    0.000000           0        17           rt_sigprocmask
  0.00    0.000000           0         1           rt_sigreturn
  0.00    0.000000           0         3         2 ioctl
  0.00    0.000000           0         5         5 access
  0.00    0.000000           0         2           pipe
  0.00    0.000000           0         1           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0         3           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         4         1 wait4
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         3         1 fcntl
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         1           sysinfo
  0.00    0.000000           0         1           getuid
  0.00    0.000000           0         1           getgid
  0.00    0.000000           0         1           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   158        13 total

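Wow, the difference is overwhelming: bash.sh issued 210,007 read calls, about one per byte of the 210,000-byte input, while by_cut_tr_wc.sh shows only 6. (strace was run without -f, so the second table covers only the script's own shell process, not the cut, tr, and wc children it spawns.)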
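As far as I understand it, the per-byte reads follow from the input being a pipe: a pipe cannot be rewound (almost all of the lseek calls in the first table fail), so `read` must not consume anything past the newline, because whatever runs next on the same file descriptor still needs the rest. The sketch below shows that contract; `demo_shared_pipe` is a hypothetical name used only here.

```shell
#!/bin/bash
# Hypothetical demo: read and cat share one pipe as their standard input.
demo_shared_pipe() {
  printf 'line1\nline2\nline3\n' | {
    read -r first              # consumes exactly the first line, nothing more
    echo "read got: $first"
    cat                        # still receives the remaining two lines
  }
}

demo_shared_pipe
# read got: line1
# line2
# line3
```

If `read` fetched a large block from the pipe at once, `cat` would lose the lines that had been buffered. Redirecting from a regular file instead (`./bash.sh < data.txt`) gives bash a seekable descriptor and should let it read in larger chunks, though that was not measured here.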

By the way, take VERY_BAD.sh from

https://qiita.com/8x9/items/f1156503694d3683e78d

this article:

#!/bin/sh

# A very bad example

total=0
while read -r line; do
  count=$(printf '%s' "$line" |awk -F, '{ print gsub(/[aeiou]/, "", $2) }')
  total=$(( total + count ))
done
echo "$total"

Tracing it the same way gives:

% cat strace3.txt

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 77.42    0.202407           1    230004           read
 13.64    0.035662           4     10000           clone
  3.30    0.008627           0     20003           close
  2.53    0.006619           1     10000           wait4
  1.63    0.004262           0     10000           rt_sigreturn
  1.48    0.003859           0     10000           pipe
  0.00    0.000000           0         1           write
  0.00    0.000000           0         3           open
  0.00    0.000000           0         2           stat
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         6           mmap
  0.00    0.000000           0         4           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         7           rt_sigaction
  0.00    0.000000           0         3         3 access
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2           fcntl
  0.00    0.000000           0         1           geteuid
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.261436                290046         3 total

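That is the result. read dominates here too (230,004 calls, 77.42% of the measured syscall time). As the comment says, clone is surprisingly cheap: its 10,000 calls account for only 13.64%.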
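For comparison, both costs seen above (the per-byte read in the shell loop and one awk process per line) can be avoided by running awk once over the whole input. This is a sketch of my own, not something measured in this post; `count_with_awk` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical rewrite of VERY_BAD.sh: one awk process handles every line.
count_with_awk() {
  # gsub returns the number of replacements made, i.e. the vowels in field 2
  awk -F, '{ total += gsub(/[aeiou]/, "", $2) } END { print total + 0 }'
}

# Two sample lines from data.txt: i2GQ9N has 1 lowercase vowel, U2aeHD has 2.
printf '%s\n' 'uRkbV6,i2GQ9N,uYd8dP' 'i5Dd9E,U2aeHD,7y7x6T' | count_with_awk   # prints 3
```

The `total + 0` in the END block only makes empty input print 0 instead of an empty line.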