
Attaching a bio to the block layer's request_queue


I looked into the Linux block layer. Specifically, I read the code path that attaches a bio structure created by the filesystem to the block layer's queue.

Kernel version

v4.4.0

ext4 backtrace

blk_sq_make_request() appears to be a function on the write path, so I captured backtraces down to it. kjournald2 appears to be the kernel thread that writes the ext4 journal.

>>> bt
#0  blk_sq_make_request (q=0xffff880233a458c0, bio=0xffff8802330a8a00) at block/blk-mq.c:1340
#1  0xffffffff813f30b3 in generic_make_request (bio=0xffff8802330a8a00) at block/blk-core.c:2065
#2  0xffffffff813f3206 in submit_bio (rw=<optimized out>, bio=0xffff8802330a8a00) at block/blk-core.c:2128
#3  0xffffffff81228daf in submit_bh_wbc (rw=866408640, bh=0xffff8802330a8a00, bio_flags=1, wbc=0x0 <irq_stack_union>) at fs/buffer.c:3045
#4  0xffffffff81228e12 in submit_bh (rw=<optimized out>, bh=<optimized out>) at fs/buffer.c:3057
#5  0xffffffff812c2de9 in jbd2_journal_commit_transaction (journal=0xffff8802362ee000) at fs/jbd2/commit.c:740
#6  0xffffffff812c79eb in kjournald2 (arg=0xffff8802362ee000) at fs/jbd2/journal.c:223
#7  0xffffffff81095e39 in kthread (_create=0xffff880233111f40) at kernel/kthread.c:209
#8  0xffffffff818293cf in ret_from_fork () at arch/x86/entry/entry_64.S:468
#9  0x0000000000000000 in ?? ()

As another example, running dd with oflag=direct,sync produced the following backtrace. From submit_bio() down, the path is the same.

>>> bt
#0  blk_sq_make_request (q=0xffff880233a458c0, bio=0xffff880233053000) at block/blk-mq.c:1340
#1  0xffffffff813f30b3 in generic_make_request (bio=0xffff880233053000) at block/blk-core.c:2065
#2  0xffffffff813f3206 in submit_bio (rw=<optimized out>, bio=0xffff880233053000) at block/blk-core.c:2128
#3  0xffffffff81230a91 in dio_bio_submit (sdio=<optimized out>, dio=<optimized out>) at fs/direct-io.c:409
#4  do_blockdev_direct_IO (iocb=<optimized out>, inode=0xffff880233053000, bdev=<optimized out>, iter=<optimized out>, offset=<optimized out>, get_block=<optimized out>, end_io=0x0 <irq_stack_union>, submit_io=<optimized out>, flags=3) at fs/direct-io.c:1282
#5  0xffffffff81231473 in __blockdev_direct_IO (iocb=<optimized out>, inode=<optimized out>, bdev=<optimized out>, iter=<optimized out>, offset=<optimized out>, get_block=<optimized out>, end_io=0x0 <irq_stack_union>, submit_io=<optimized out>, flags=<optimized out>) at fs/direct-io.c:1341
#6  0xffffffff812b5494 in blockdev_direct_IO (get_block=<optimized out>, offset=<optimized out>, iter=<optimized out>, inode=<optimized out>, iocb=<optimized out>) at include/linux/fs.h:2697
#7  ext4_ind_direct_IO (iocb=0xffff880233a458c0, iter=0xffff880233053000, offset=0) at fs/ext4/indirect.c:709
#8  0xffffffff81277130 in ext4_ext_direct_IO (offset=<optimized out>, iter=<optimized out>, iocb=<optimized out>) at fs/ext4/inode.c:3131
#9  ext4_direct_IO (iocb=0xffff880234e33e68, iter=<optimized out>, offset=0) at fs/ext4/inode.c:3281
#10 0xffffffff8117b4fa in generic_file_direct_write (iocb=0xffff880234e33e68, from=0xffff880234e33e90, pos=0) at mm/filemap.c:2437
#11 0xffffffff8117b670 in __generic_file_write_iter (iocb=0xffff880234e33e68, from=0xffff880234e33e90) at mm/filemap.c:2617
#12 0xffffffff812717cd in ext4_file_write_iter (iocb=0xffff880234e33e68, from=0xffff880234e33e90) at fs/ext4/file.c:171
#13 0xffffffff811f2c4a in new_sync_write (ppos=<optimized out>, len=<optimized out>, buf=<optimized out>, filp=<optimized out>) at fs/read_write.c:478
#14 __vfs_write (file=0xffff880235687100, p=<optimized out>, count=<optimized out>, pos=0xffff880234e33f20) at fs/read_write.c:491
#15 0xffffffff811f3509 in vfs_write (file=0xffff880235687100, buf=0x2225000 "", count=<optimized out>, pos=0xffff880234e33f20) at fs/read_write.c:538
#16 0xffffffff811f4106 in SYSC_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:585
#17 SyS_write (fd=<optimized out>, buf=35803136, count=1024) at fs/read_write.c:577
#18 0xffffffff81829036 in entry_SYSCALL_64_fastpath () at arch/x86/entry/entry_64.S:185
#19 0x00007ffc929c8a88 in ?? ()
#20 0x0000000000000000 in ?? ()

How a bio reaches the device queue

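A bio is a structure that describes the location on the device to write to (the device, the sector number, and so on) and the location of the data in memory (bio_vec).
submit_bio() is essentially a wrapper around generic_make_request(). The following part is the main processing: it hands the request to the block device through q->make_request_fn().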

	bio_list_init(&bio_list_on_stack);
	current->bio_list = &bio_list_on_stack;
	do {
		struct request_queue *q = bdev_get_queue(bio->bi_bdev);

		if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {

			ret = q->make_request_fn(q, bio);

			blk_queue_exit(q);

			bio = bio_list_pop(current->bio_list);
		} else {
			struct bio *bio_next = bio_list_pop(current->bio_list);

			bio_io_error(bio);
			bio = bio_next;
		}
	} while (bio);
	current->bio_list = NULL; /* deactivate */
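
The loop above can be modeled in userspace to see why the bio list lives on the stack. This is a rough sketch, not kernel code: every type and every `model_` function is a hypothetical stand-in, and the "already active" check at the top of `model_generic_make_request()` mirrors a check that sits earlier in the real function and is not part of the excerpt above. The point is that a make_request_fn which submits further bios (as a stacking driver would) only queues them, and the outer loop processes them iteratively instead of recursing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct bio {
    int id;               /* label used only by this model */
    int children;         /* how many bios the driver resubmits for this one */
    struct bio *bi_next;  /* link used while the bio waits on the list */
};

struct bio_list {
    struct bio *head, *tail;
};

static struct bio_list *current_bio_list; /* models current->bio_list */
static struct bio pool[16];               /* storage for resubmitted bios */
static int pool_used;
static int order[16];                     /* ids in the order they were handled */
static int handled;
static int depth, max_depth;              /* nesting of make_request_fn calls */

static void bio_list_add(struct bio_list *bl, struct bio *bio)
{
    bio->bi_next = NULL;
    if (bl->tail)
        bl->tail->bi_next = bio;
    else
        bl->head = bio;
    bl->tail = bio;
}

static struct bio *bio_list_pop(struct bio_list *bl)
{
    struct bio *bio = bl->head;

    if (bio) {
        bl->head = bio->bi_next;
        if (!bl->head)
            bl->tail = NULL;
        bio->bi_next = NULL;
    }
    return bio;
}

static void model_generic_make_request(struct bio *bio);

/* Models q->make_request_fn() of a driver that resubmits child bios. */
static void model_make_request_fn(struct bio *bio)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    assert(handled < 16);
    order[handled++] = bio->id;
    for (int i = 0; i < bio->children; i++) {
        assert(pool_used < 16);
        struct bio *child = &pool[pool_used++];
        child->id = bio->id * 10 + i + 1;
        child->children = 0;
        child->bi_next = NULL;
        model_generic_make_request(child); /* nested submission */
    }
    depth--;
}

static void model_generic_make_request(struct bio *bio)
{
    struct bio_list bio_list_on_stack;

    if (current_bio_list) {
        /* A submission is already active on this task: queue and return. */
        bio_list_add(current_bio_list, bio);
        return;
    }
    bio_list_on_stack.head = bio_list_on_stack.tail = NULL;
    current_bio_list = &bio_list_on_stack;
    do {
        model_make_request_fn(bio);
        bio = bio_list_pop(current_bio_list);
    } while (bio);
    current_bio_list = NULL; /* deactivate */
}
```

Submitting a bio with `children = 2` through `model_generic_make_request()` handles the parent first and then the two children from the list, while `max_depth` stays at 1, so the nested submissions never deepen the call stack.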

In the backtraces the function being called is blk_sq_make_request(); it is registered from blk_mq_init_queue(), via blk_mq_init_allocated_queue().
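
The registration itself is just a function pointer stored in the queue. The sketch below is a userspace model under assumptions: the `model_` names are hypothetical, the return values 1 and 2 exist only so the two handlers can be told apart, and the rule "one hardware queue selects the sq variant, more than one selects the mq variant" is my reading of how v4.4 picks the handler inside blk_mq_init_allocated_queue(), not something taken from the backtraces.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int blk_qc_t;
struct request_queue;
struct bio;
typedef blk_qc_t (make_request_fn)(struct request_queue *q, struct bio *bio);

/* Hypothetical, minimal stand-in for the kernel's struct request_queue. */
struct request_queue {
    unsigned int nr_hw_queues;
    make_request_fn *make_request_fn;
};

/* Stand-ins for the two blk-mq entry points; they only identify themselves. */
static blk_qc_t model_blk_sq_make_request(struct request_queue *q, struct bio *bio)
{
    (void)q;
    (void)bio;
    return 1;
}

static blk_qc_t model_blk_mq_make_request(struct request_queue *q, struct bio *bio)
{
    (void)q;
    (void)bio;
    return 2;
}

/* Models blk_queue_make_request(): remember which handler the queue uses. */
static void model_blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
{
    assert(q != NULL && mfn != NULL);
    q->make_request_fn = mfn;
}

/* Models the handler selection done while the blk-mq queue is set up. */
static void model_blk_mq_init_allocated_queue(struct request_queue *q,
                                              unsigned int nr_hw_queues)
{
    q->nr_hw_queues = nr_hw_queues;
    if (q->nr_hw_queues > 1)
        model_blk_queue_make_request(q, model_blk_mq_make_request);
    else
        model_blk_queue_make_request(q, model_blk_sq_make_request);
}
```

After initialization, `q->make_request_fn(q, bio)` in the loop shown earlier dispatches to whichever handler was stored, which would explain why a queue with a single hardware queue ends up in blk_sq_make_request().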

blk-mq does not use an I/O scheduler

I was wondering why the I/O scheduler never seemed to be called, and then found this:

## cat /sys/block/vda/queue/scheduler
none

Apparently, with the blk-mq mechanism (at least in v4.4) the I/O scheduler is bypassed.
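
The same check can be done from a program. The following is a minimal sketch with hypothetical helper names (`active_scheduler`, `read_active_scheduler`). It assumes the usual sysfs convention that a legacy queue lists all schedulers and brackets the active one (for example `noop deadline [cfq]`), whereas the blk-mq queue above prints just `none`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the active scheduler name from one sysfs "scheduler" line into out.
 * With brackets, the bracketed entry is taken; without brackets, the whole
 * line up to the newline is taken (the "none" case shown above).
 * Returns 0 on success, -1 on malformed input or a too-small buffer.
 */
static int active_scheduler(const char *line, char *out, size_t outlen)
{
    const char *start;
    size_t len;

    assert(line != NULL && out != NULL);
    start = strchr(line, '[');
    if (start) {
        const char *end = strchr(start, ']');

        if (!end)
            return -1;
        start++;
        len = (size_t)(end - start);
    } else {
        start = line;
        len = strcspn(line, "\n");
    }
    if (len == 0 || len >= outlen)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}

/* Read the first line of a sysfs scheduler file and extract the active entry. */
static int read_active_scheduler(const char *path, char *out, size_t outlen)
{
    char line[256];
    FILE *fp = fopen(path, "r");

    if (!fp)
        return -1;
    if (!fgets(line, sizeof(line), fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return active_scheduler(line, out, outlen);
}
```

Calling `read_active_scheduler("/sys/block/vda/queue/scheduler", buf, sizeof(buf))` on the machine above should leave `none` in `buf`; the device name vda is simply the one used in this article.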

SCSI disks (sdX)

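According to a book I read, __make_request() is what is generally called, but current kernels seem to use a different function.
For SCSI devices, blk_queue_bio() is registered from blk_init_queue().
On the SCSI side, the call originates around scsi_alloc_sdev(). In other words, blk_queue_bio() is presumably the equivalent of __make_request(). I would like to look at this area, together with blk-mq, next time.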
