Introduction
This post uses Helgrind to debug a multithreaded program.
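What is Helgrind?
Helgrind is a Valgrind tool, a thread error detector, that detects the following kinds of problems:
- Misuses of the POSIX pthreads API
- Potential deadlocks arising from lock ordering problems
- Data races
Problems like these cause non-reproducible, timing-dependent crashes, deadlocks, and other misbehavior. Helgrind helps track down their cause.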
For details, see the Helgrind documentation.
Install
Environment: Ubuntu 20.04 LTS
> sudo apt-get install valgrind
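Trying it out
We try Helgrind on the following C program, which has a data race on the variable count.
Both the main thread and the created thread call inc_count. The increment should be protected by mutual exclusion, but it is not, so the two threads race and Helgrind should report an error.
#include <pthread.h>

int count;  /* shared by both threads, unprotected */

void *inc_count(void *arg) {
    count++;  /* unsynchronized read-modify-write */
    return NULL;
}

int main() {
    pthread_t tid;
    pthread_create(&tid, NULL, inc_count, NULL);
    inc_count(NULL);  /* the main thread increments too */
    pthread_join(tid, NULL);
    return 0;
}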
Results
Compile the program above and run it under Helgrind:
gcc -Wall -g sample.c -lpthread
valgrind --tool=helgrind ./a.out
Output
---Thread-Announcement------------------------------------------

Thread #1 is the program's root thread

---Thread-Announcement------------------------------------------

Thread #2 was created
   at 0x49990F2: clone (clone.S:71)
   by 0x485C2EB: create_thread (createthread.c:101)
   by 0x485DE0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
   by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
   by 0x1091E2: main (sample.c:12)

----------------------------------------------------------------

Possible data race during read of size 4 at 0x10C014 by thread #1
Locks held: none
   at 0x109195: inc_count (sample.c:6)
   by 0x1091EC: main (sample.c:13)

This conflicts with a previous write of size 4 by thread #2
Locks held: none
   at 0x10919E: inc_count (sample.c:6)
   by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
   by 0x485D608: start_thread (pthread_create.c:477)
   by 0x4999102: clone (clone.S:95)
 Address 0x10c014 is 0 bytes inside data symbol "count"

----------------------------------------------------------------

Possible data race during write of size 4 at 0x10C014 by thread #1
Locks held: none
   at 0x10919E: inc_count (sample.c:6)
   by 0x1091EC: main (sample.c:13)

This conflicts with a previous write of size 4 by thread #2
Locks held: none
   at 0x10919E: inc_count (sample.c:6)
   by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
   by 0x485D608: start_thread (pthread_create.c:477)
   by 0x4999102: clone (clone.S:95)
 Address 0x10c014 is 0 bytes inside data symbol "count"

Use --history-level=approx or =none to gain increased speed, at
the cost of reduced accuracy of conflicting-access information
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
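The report shows that thread #1 (the main thread, running main) and thread #2 conflict: both access count at sample.c:6 while holding no locks.
For completeness, let's also check the behavior of a program that avoids the race. Here we use a mutex.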
#include <pthread.h>

int count;
pthread_mutex_t mutex;

void *inc_count(void *arg) {
    pthread_mutex_lock(&mutex);    /* enter the critical section */
    count++;
    pthread_mutex_unlock(&mutex);  /* leave the critical section */
    return NULL;
}

int main() {
    pthread_t tid;
    pthread_mutex_init(&mutex, NULL);
    pthread_create(&tid, NULL, inc_count, NULL);
    inc_count(NULL);  /* the main thread increments too */
    pthread_join(tid, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}
Output
ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 7)
Conclusion
How did it go for you?