x86-64 / Linux: Understanding the inter-processor interrupt (IPI)

Posted at 2017-05-21

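Introduction

This article explains how the inter-processor interrupt (IPI) works.
IPIs are used for communication between the cores of a multi-core processor (including hyper-threading logical processors).

We run the Linux kernel on QEMU (x86-64) and observe its behavior with gdb.
The setup is the one described in "QEMU / gdb で Linux kernel の動きを確認する" (observing the Linux kernel with QEMU / gdb).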

Where the IPI is issued

IPIs are issued through __default_send_IPI_dest_field.

arch/x86/include/asm/ipi.h
static inline void
__default_send_IPI_dest_field(unsigned int mask, int vector, unsigned int dest)

I set a gdb breakpoint and captured a backtrace at the moment I pressed Enter on the console.
__default_send_IPI_dest_field is reached through a call chain that starts at smp_irq_work_interrupt.

__default_send_IPI_dest_field issues the following IPI.

vector=253

This is RESCHEDULE_VECTOR, the IPI used for rescheduling.

backtrace
(gdb) b __default_send_IPI_dest_field

Breakpoint 14 at 0xffffffff81020f9f: __default_send_IPI_dest_field. (5 locations)
(gdb) commands
Type commands for breakpoint(s) 14, one per line.
End with a line saying just "end".
>bt
>c
>end
(gdb) 
(gdb) c
Continuing.

Breakpoint 14, _flat_send_IPI_mask (vector=253, mask=1) at arch/x86/kernel/apic/apic_flat_64.c:64
64		__default_send_IPI_dest_field(mask, vector, apic->dest_logical);
#0  _flat_send_IPI_mask (vector=253, mask=1) at arch/x86/kernel/apic/apic_flat_64.c:64
#1  flat_send_IPI_mask (cpumask=0xffffffff81206e28 <cpu_bit_bitmap+8>, vector=253) at arch/x86/kernel/apic/apic_flat_64.c:72
#2  0xffffffff8104d7ae in smp_send_reschedule (cpu=0) at /home/user/linux-3.13.0/arch/x86/include/asm/smp.h:140
#3  ttwu_queue_remote (cpu=0, p=0xffff88000744a160) at kernel/sched/core.c:1540
#4  ttwu_queue (cpu=0, p=0xffff88000744a160) at kernel/sched/core.c:1556
#5  try_to_wake_up (p=0xffff88000744a160, state=<optimized out>, wake_flags=<optimized out>) at kernel/sched/core.c:1629
#6  0xffffffff8104d7fe in default_wake_function (curr=curr@entry=0xffff880007467e80, mode=<optimized out>, wake_flags=<optimized out>, key=<optimized out>) at kernel/sched/core.c:2722
#7  0xffffffff81055f79 in autoremove_wake_function (wait=0xffff880007467e80, mode=<optimized out>, sync=<optimized out>, key=<optimized out>) at kernel/sched/wait.c:292
#8  0xffffffff81055f3f in __wake_up_common (q=q@entry=0xffffffff8141f060 <rcu_sched_state+288>, mode=mode@entry=3, nr_exclusive=nr_exclusive@entry=1, wake_flags=wake_flags@entry=0, key=key@entry=0x0 <irq_stack_union>) at kernel/sched/wait.c:72
#9  0xffffffff8105611f in __wake_up (q=0xffffffff8141f060 <rcu_sched_state+288>, mode=3, nr_exclusive=1, key=0x0 <irq_stack_union>) at kernel/sched/wait.c:94
#10 0xffffffff8106b99f in __irq_work_run () at kernel/irq_work.c:140
#11 irq_work_run () at kernel/irq_work.c:156
#12 0xffffffff81005d16 in __smp_irq_work_interrupt () at arch/x86/kernel/irq_work.c:22
#13 smp_irq_work_interrupt (regs=<optimized out>) at arch/x86/kernel/irq_work.c:28
#14 <signal handler called>
#15 0xffffffffffffff09 in ?? ()

Next, let's look at where the interrupt handled by smp_irq_work_interrupt is raised.
arch_irq_work_raise is called through a chain that starts at smp_apic_timer_interrupt.

Breakpoint 27, arch_irq_work_raise () at arch/x86/kernel/irq_work.c:44
44		if (!cpu_has_apic)
#0  arch_irq_work_raise () at arch/x86/kernel/irq_work.c:44
#1  0xffffffff8106b8c7 in irq_work_queue (work=work@entry=0xffffffff8141f1e8 <rcu_sched_state+680>) at kernel/irq_work.c:82
#2  0xffffffff81061dd0 in rcu_start_gp_advanced (rsp=rsp@entry=0xffffffff8141ef40 <rcu_sched_state>, rdp=<optimized out>, rnp=<optimized out>) at kernel/rcu/tree.c:1664
#3  0xffffffff8106272d in rcu_start_gp (rsp=rsp@entry=0xffffffff8141ef40 <rcu_sched_state>) at kernel/rcu/tree.c:1689
#4  0xffffffff81062a28 in __rcu_process_callbacks (rsp=0xffffffff8141ef40 <rcu_sched_state>) at kernel/rcu/tree.c:2297
#5  rcu_process_callbacks (unused=<optimized out>) at kernel/rcu/tree.c:2319
#6  0xffffffff8103060d in __do_softirq () at kernel/softirq.c:253
#7  0xffffffff810308f5 in invoke_softirq () at kernel/softirq.c:339
#8  irq_exit () at kernel/softirq.c:381
#9  0xffffffff8102000b in exiting_irq () at /home/user/linux-3.13.0/arch/x86/include/asm/apic.h:708
#10 smp_apic_timer_interrupt (regs=<optimized out>) at arch/x86/kernel/apic/apic.c:931
#11 <signal handler called>
#12 0xffffffffffffff10 in ?? ()

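Next, let's look at smp_apic_timer_interrupt itself.
smp_apic_timer_interrupt was called very frequently, so the backtraces below were taken at raise_softirq, which it reaches on each tick.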

Breakpoint 79, raise_softirq (nr=nr@entry=1) at kernel/softirq.c:408
408	{
#0  raise_softirq (nr=nr@entry=1) at kernel/softirq.c:408
#1  0xffffffff81036001 in run_local_timers () at kernel/timer.c:1385
#2  update_process_times (user_tick=0) at kernel/timer.c:1356
#3  0xffffffff810699a3 in tick_periodic (cpu=cpu@entry=1) at kernel/time/tick-common.c:90
#4  0xffffffff81069ad8 in tick_handle_periodic (dev=0xffff880007b0c900) at kernel/time/tick-common.c:102
#5  0xffffffff81020006 in smp_apic_timer_interrupt (regs=<optimized out>) at arch/x86/kernel/apic/apic.c:930
#6  <signal handler called>
#7  0xffffffffffffff10 in ?? ()
Cannot access memory at address 0x202
(gdb) 
Continuing.

Breakpoint 8, raise_softirq (nr=nr@entry=7) at kernel/softirq.c:408
408	{
#0  raise_softirq (nr=nr@entry=7) at kernel/softirq.c:408
#1  0xffffffff81054258 in trigger_load_balance (rq=<optimized out>, cpu=<optimized out>) at kernel/sched/fair.c:6896
#2  0xffffffff8104cc00 in scheduler_tick () at kernel/sched/core.c:2321
#3  0xffffffff81036024 in update_process_times (user_tick=0) at kernel/timer.c:1362
#4  0xffffffff810699a3 in tick_periodic (cpu=cpu@entry=1) at kernel/time/tick-common.c:90
#5  0xffffffff81069ad8 in tick_handle_periodic (dev=0xffff880007b0c900) at kernel/time/tick-common.c:102
#6  0xffffffff81020006 in smp_apic_timer_interrupt (regs=<optimized out>) at arch/x86/kernel/apic/apic.c:930
#7  <signal handler called>
#8  0xffffffffffffff10 in ?? ()

Breakpoint 8, raise_softirq (nr=nr@entry=9) at kernel/softirq.c:408
408	{
#0  raise_softirq (nr=nr@entry=9) at kernel/softirq.c:408
#1  0xffffffff8106188a in invoke_rcu_core () at kernel/rcu/tree.c:2344
#2  0xffffffff81063475 in rcu_check_callbacks (cpu=cpu@entry=1, user=user@entry=0) at kernel/rcu/tree.c:2179
#3  0xffffffff8103600b in update_process_times (user_tick=0) at kernel/timer.c:1357
#4  0xffffffff810699a3 in tick_periodic (cpu=cpu@entry=1) at kernel/time/tick-common.c:90
#5  0xffffffff81069ad8 in tick_handle_periodic (dev=0xffff880007b0c900) at kernel/time/tick-common.c:102
#6  0xffffffff81020006 in smp_apic_timer_interrupt (regs=<optimized out>) at arch/x86/kernel/apic/apic.c:930
#7  <signal handler called>
#8  0xffffffffffffff10 in ?? ()
Cannot access memory at address 0x246
(gdb) c
Continuing.

smp_apic_timer_interrupt leads to raise_softirq being called with nr = 1, 7 and 9.
These values are defined below: TIMER_SOFTIRQ (1), SCHED_SOFTIRQ (7) and RCU_SOFTIRQ (9).

include/linux/interrupt.h
enum
{
	HI_SOFTIRQ=0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	BLOCK_IOPOLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ,
	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */

	NR_SOFTIRQS
};

The corresponding softirq handlers are registered as follows.

open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
open_softirq(SCHED_SOFTIRQ, run_rebalance_domains);
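
The relationship between open_softirq, raise_softirq and the handlers can be sketched in user space. This is a simplified model under stated assumptions, not the kernel implementation: every model_* name, the trace array and the single global pending mask are made up for illustration (the kernel keeps the pending bits per CPU and runs the handlers from __do_softirq).

```c
#include <assert.h>
#include <stdint.h>

/* Same order as the enum in include/linux/interrupt.h quoted above. */
enum {
    HI_SOFTIRQ = 0, TIMER_SOFTIRQ, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ, BLOCK_IOPOLL_SOFTIRQ, TASKLET_SOFTIRQ, SCHED_SOFTIRQ,
    HRTIMER_SOFTIRQ, RCU_SOFTIRQ, NR_SOFTIRQS
};

typedef void (*softirq_fn)(void);

static softirq_fn model_vec[NR_SOFTIRQS]; /* handler table (hypothetical) */
static uint32_t model_pending;            /* one pending bit per softirq */

/* Hypothetical trace of which softirqs ran during the last pass, in order. */
int model_trace[NR_SOFTIRQS];
int model_trace_len;

/* Register a handler for softirq number nr. */
void model_open_softirq(int nr, softirq_fn fn)
{
    assert(nr >= 0 && nr < NR_SOFTIRQS);
    model_vec[nr] = fn;
}

/* Mark softirq nr as pending; nothing runs yet. */
void model_raise_softirq(int nr)
{
    assert(nr >= 0 && nr < NR_SOFTIRQS);
    model_pending |= 1u << nr;
}

/* Run every pending handler in ascending number order; return how many ran. */
int model_do_softirq(void)
{
    uint32_t pending = model_pending;
    int ran = 0;

    model_pending = 0;
    model_trace_len = 0;
    for (int nr = 0; nr < NR_SOFTIRQS; nr++) {
        if ((pending & (1u << nr)) && model_vec[nr]) {
            model_vec[nr]();
            ran++;
        }
    }
    return ran;
}

/* Stand-ins named after the real handlers; they only record their number. */
void model_run_timer_softirq(void)     { model_trace[model_trace_len++] = TIMER_SOFTIRQ; }
void model_run_rebalance_domains(void) { model_trace[model_trace_len++] = SCHED_SOFTIRQ; }
void model_rcu_process_callbacks(void) { model_trace[model_trace_len++] = RCU_SOFTIRQ; }
```

For example, after registering the three stand-ins, raising 9, 1 and 7 in that order and then calling model_do_softirq() runs the timer, scheduler and RCU stand-ins in ascending softirq order.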

__default_send_IPI_dest_field

Let's look at what __default_send_IPI_dest_field does.

arch/x86/include/asm/ipi.h
 92 static inline void
 93 __default_send_IPI_dest_field(unsigned int mask, int vector, unsigned int dest)
 94 {
 95 	unsigned long cfg;
 96 
 97 	/*
 98 	 * Wait for idle.
 99 	 */
100 	if (unlikely(vector == NMI_VECTOR))
101 		safe_apic_wait_icr_idle();
102 	else
103 		__xapic_wait_icr_idle();
104 
105 	/*
106 	 * prepare target chip field
107 	 */
108 	cfg = __prepare_ICR2(mask);
109 	native_apic_mem_write(APIC_ICR2, cfg);
110 
111 	/*
112 	 * program the ICR
113 	 */
114 	cfg = __prepare_ICR(0, vector, dest);
115 
116 	/*
117 	 * Send the IPI. The write to APIC_ICR fires this off.
118 	 */
119 	native_apic_mem_write(APIC_ICR, cfg);
120 }

The IPI is issued by native_apic_mem_write.
Let's look at native_apic_mem_write.

arch/x86/include/asm/apic.h
static inline void native_apic_mem_write(u32 reg, u32 v)
{
	volatile u32 *addr = (volatile u32 *)(APIC_BASE + reg);

	alternative_io("movl %0, %1", "xchgl %0, %1", X86_FEATURE_11AP,
		       ASM_OUTPUT2("=r" (v), "=m" (*addr)),
		       ASM_OUTPUT2("0" (v), "m" (*addr)));
}

native_apic_mem_write issues the IPI by writing data to the address APIC_BASE + reg.

109 	native_apic_mem_write(APIC_ICR2, cfg);
119 	native_apic_mem_write(APIC_ICR, cfg);

APIC_ICR and APIC_ICR2 are defined as follows.

arch/x86/include/asm/apicdef.h
#define	APIC_ICR	0x300
#define	APIC_ICR2	0x310
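
The two-step write sequence (APIC_ICR2 first, then APIC_ICR) can be modeled in user space. The sketch below is only an illustration under assumptions: fake_apic_page, fake_apic_mem_write and struct fake_ipi are hypothetical names, and an ordinary array stands in for the memory-mapped register page. It shows why the destination has to be in place before the ICR write, since that write is the one that sends the IPI.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_ICR   0x300 /* offsets from apicdef.h, as quoted above */
#define APIC_ICR2  0x310

/* Hypothetical stand-in for the 4 KiB local APIC register page. */
static volatile uint32_t fake_apic_page[0x1000 / sizeof(uint32_t)];

/* Hypothetical record of the last IPI this model "sent". */
struct fake_ipi {
    int sent;          /* set to 1 by a write to APIC_ICR */
    uint32_t icr_low;  /* value written to APIC_ICR */
    uint32_t icr_high; /* APIC_ICR2 contents at that moment */
};
struct fake_ipi fake_last_ipi;

/* Model of native_apic_mem_write(): a 32-bit store at base + reg. */
void fake_apic_mem_write(uint32_t reg, uint32_t v)
{
    /* This model only accepts 16-byte aligned offsets inside the page. */
    assert(reg < sizeof(fake_apic_page) && (reg & 0xf) == 0);

    fake_apic_page[reg / 4] = v;

    /* In this model, the store to APIC_ICR triggers delivery. */
    if (reg == APIC_ICR) {
        fake_last_ipi.sent = 1;
        fake_last_ipi.icr_low = v;
        fake_last_ipi.icr_high = fake_apic_page[APIC_ICR2 / 4];
    }
}
```

Writing APIC_ICR2 alone leaves fake_last_ipi.sent at 0. The following write to APIC_ICR latches both halves.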

Let's disassemble flat_send_IPI_mask.

(gdb) disas flat_send_IPI_mask
Dump of assembler code for function flat_send_IPI_mask:
(snip)
   0xffffffff8102395e <+46>:	mov    %r12d,%eax
   0xffffffff81023961 <+49>:	shl    $0x18,%eax
   0xffffffff81023964 <+52>:	mov    %eax,0xffffffffff5fb310
   0xffffffff8102396b <+59>:	mov    %esi,%eax
   0xffffffff8102396d <+61>:	or     %ebx,%eax
   0xffffffff8102396f <+63>:	or     $0x4,%bh
   0xffffffff81023972 <+66>:	cmp    $0x2,%esi
   0xffffffff81023975 <+69>:	cmove  %ebx,%eax
   0xffffffff81023978 <+72>:	mov    %eax,0xffffffffff5fb300

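The following two stores issue the IPI: the first writes the destination to APIC_ICR2 (offset 0x310), and the second writes APIC_ICR (offset 0x300), which actually sends it.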

0xffffffff81023964 <+52>: mov %eax,0xffffffffff5fb310
0xffffffff81023978 <+72>: mov %eax,0xffffffffff5fb300
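
The store addresses in the disassembly can be cross-checked against the register offsets. The sketch below assumes that the local APIC registers sit in a single 4 KiB page, so the low 12 bits of an address are the register offset and the rest is the mapped base. apic_page_base and apic_reg_offset are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_ICR   0x300 /* offsets from apicdef.h */
#define APIC_ICR2  0x310

/* Base of the 4 KiB page that contains addr. */
uint64_t apic_page_base(uint64_t addr)
{
    return addr & ~(uint64_t)0xfff;
}

/* Offset of addr inside its 4 KiB page. */
uint32_t apic_reg_offset(uint64_t addr)
{
    uint32_t off = (uint32_t)(addr & 0xfff);

    /* The registers used here are 16-byte aligned. */
    assert((off & 0xf) == 0);
    return off;
}
```

Applied to the two addresses above, 0xffffffffff5fb310 splits into base 0xffffffffff5fb000 and offset APIC_ICR2, and 0xffffffffff5fb300 into the same base and offset APIC_ICR.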

What does %eax contain at these points?
Let's stop execution at 0xffffffff81023964 and 0xffffffff81023978 and inspect the value.

(gdb) b *0xffffffff81023964
Breakpoint 2 at 0xffffffff81023964: file /home/user/linux-3.13.0/arch/x86/include/asm/apic.h, line 105.
(gdb) c
Continuing.

Breakpoint 2, __default_send_IPI_dest_field (dest=2048, vector=<optimized out>, mask=2) at /home/user/linux-3.13.0/arch/x86/include/asm/ipi.h:109
109		native_apic_mem_write(APIC_ICR2, cfg);
(gdb) p/x $eax
$1 = 0x2000000
(gdb) until *0xffffffff81023978
__default_send_IPI_dest_field (dest=<optimized out>, vector=<optimized out>, mask=2) at /home/user/linux-3.13.0/arch/x86/include/asm/ipi.h:119
119		native_apic_mem_write(APIC_ICR, cfg);
(gdb) p/x $eax
$2 = 0x8fd

%eax holds the following values.

$1 = 0x2000000
$2 = 0x8fd
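
These two values can be reproduced from the arguments seen in gdb (mask=2, vector=253, dest=2048). The sketch below is my own reconstruction of what __prepare_ICR2 and __prepare_ICR compute, inferred from the disassembly above (shl $0x18 for the mask, or $0x4,%bh and cmp $0x2 for the NMI case). The sketch_* names are hypothetical, and the constants are assumed to match the kernel's apicdef.h.

```c
#include <assert.h>
#include <stdint.h>

#define NMI_VECTOR         0x02    /* matches cmp $0x2,%esi */
#define APIC_DM_FIXED      0x00000
#define APIC_DM_NMI        0x00400 /* matches or $0x4,%bh */
#define APIC_DEST_LOGICAL  0x00800 /* dest=2048 in the gdb output */

/* High half (ICR2): the destination mask goes into bits 24-31. */
uint32_t sketch_prepare_icr2(uint32_t mask)
{
    assert(mask <= 0xff);
    return mask << 24;
}

/* Low half (ICR): shorthand, destination mode, delivery mode, vector. */
uint32_t sketch_prepare_icr(uint32_t shortcut, int vector, uint32_t dest)
{
    uint32_t icr = shortcut | dest;

    assert(vector >= 0 && vector <= 0xff);
    if (vector == NMI_VECTOR)
        icr |= APIC_DM_NMI;                      /* NMI ignores the vector */
    else
        icr |= APIC_DM_FIXED | (uint32_t)vector; /* fixed delivery */
    return icr;
}
```

With these inputs, sketch_prepare_icr2(2) gives 0x2000000 and sketch_prepare_icr(0, 253, 2048) gives 0x8fd, the same values gdb printed.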

Interrupt Command Register (ICR)

x86-64 provides a mechanism for issuing IPIs: the Interrupt Command Register (ICR).
The following is an excerpt from the Intel manual.

[Figure: Interrupt Command Register (ICR), from the Intel manual]

Applied to the %eax values above, the fields are as follows.

Vector = 0xfd (RESCHEDULE_VECTOR: 253)
Delivery Mode = 000 (Fixed)
Destination Mode = 1 (Logical)
Delivery Status (read only) = 0 (0: Idle, 1: Send Pending)
Destination Shorthand = 00 (No Shorthand)
Destination Field = 2

Note: Level and Trigger Mode are omitted because they only matter for the INIT level de-assert delivery mode.
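
The field breakdown can be checked mechanically. The sketch below decodes the two 32-bit halves using the bit positions from the ICR layout in the Intel manual (vector in bits 0-7, delivery mode in bits 8-10, destination mode in bit 11, delivery status in bit 12, shorthand in bits 18-19, destination in bits 56-63). struct icr_fields, decode_icr and delivery_mode_name are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

struct icr_fields {
    unsigned vector;          /* bits 0-7 */
    unsigned delivery_mode;   /* bits 8-10 */
    unsigned dest_mode;       /* bit 11: 0 = physical, 1 = logical */
    unsigned delivery_status; /* bit 12: 0 = idle, 1 = send pending */
    unsigned shorthand;       /* bits 18-19 */
    unsigned destination;     /* bits 56-63, i.e. ICR2 bits 24-31 */
};

/* Split the high (ICR2) and low (ICR) words into their fields. */
struct icr_fields decode_icr(uint32_t icr_high, uint32_t icr_low)
{
    struct icr_fields f;

    f.vector          = icr_low & 0xff;
    f.delivery_mode   = (icr_low >> 8) & 0x7;
    f.dest_mode       = (icr_low >> 11) & 0x1;
    f.delivery_status = (icr_low >> 12) & 0x1;
    f.shorthand       = (icr_low >> 18) & 0x3;
    f.destination     = (icr_high >> 24) & 0xff;
    return f;
}

/* Human-readable name of a 3-bit delivery mode value. */
const char *delivery_mode_name(unsigned mode)
{
    static const char *const names[8] = {
        "Fixed", "Lowest Priority", "SMI", "Reserved",
        "NMI", "INIT", "Start Up", "Reserved"
    };

    assert(mode < 8);
    return names[mode];
}
```

Calling decode_icr(0x2000000, 0x8fd) yields vector 0xfd, delivery mode 0 (Fixed), destination mode 1 (logical), no shorthand and destination 2, matching the list above.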

The destination processors are selected by the Destination Field.
In logical destination mode this value is called the message destination address (MDA).

Logical destination mode also uses the following registers:

  • Logical destination register (LDR)
  • Destination format register (DFR)

[Figures: LDR and DFR register layouts]

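The DFR was configured for the flat model, so the rest of this article covers the flat model only.

It is used as follows.

Setup

  • Assign each processor its own bit of the logical APIC ID.
  • In each processor's LDR, set the Logical APIC ID field to a value with only that processor's bit set.

Sending an IPI

  • Build a Destination Field with the bits of the target processors set.
  • Write it to the ICR.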
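
The flat-model matching rule can be sketched as a bitwise AND between a processor's logical APIC ID and the MDA. This is a minimal model and assumes that bit n is assigned to CPU n, which agrees with the gdb output above (mask=1 for cpu=0, mask=2 for the other CPU). The flat_* names and MODEL_MAX_CPUS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 8 /* flat model: one bit per CPU in an 8-bit ID */

/* Setup: the logical APIC ID of CPU n has only bit n set. */
uint8_t flat_logical_id(unsigned cpu)
{
    assert(cpu < MODEL_MAX_CPUS);
    return (uint8_t)(1u << cpu);
}

/* Receive side: a CPU accepts the IPI if its ID shares a bit with the MDA. */
int flat_accepts(uint8_t logical_id, uint8_t mda)
{
    return (logical_id & mda) != 0;
}

/* Send side: bitmask of CPU indices (0..ncpus-1) that accept the MDA. */
unsigned flat_receivers(uint8_t mda, unsigned ncpus)
{
    unsigned receivers = 0;

    assert(ncpus <= MODEL_MAX_CPUS);
    for (unsigned cpu = 0; cpu < ncpus; cpu++) {
        if (flat_accepts(flat_logical_id(cpu), mda))
            receivers |= 1u << cpu;
    }
    return receivers;
}
```

With two CPUs, an MDA of 2 is accepted only by CPU 1, while an MDA of 3 reaches both, which is how one ICR write can target several processors.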

Receiving the IPI

smp_reschedule_interrupt handles RESCHEDULE_VECTOR (253).

arch/x86/kernel/smp.c
__visible void smp_reschedule_interrupt(struct pt_regs *regs)
{
	ack_APIC_irq();
	__smp_reschedule_interrupt();
(snip)
}

ack_APIC_irq completes acceptance of the IPI.
Through apic_eoi, it writes 0 (APIC_EOI_ACK) to the end-of-interrupt (EOI) register.

arch/x86/include/asm/apic.h
static inline void apic_eoi(void)
{
	apic->eoi_write(APIC_EOI, APIC_EOI_ACK);
}

[Figure: EOI register]
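
What the EOI write accomplishes can be illustrated with a small model of the local APIC's in-service register (ISR). This sketch assumes the usual local APIC behavior, where a write to the EOI register clears the highest-priority in-service bit. model_isr, model_accept, model_in_service and model_eoi_write are hypothetical names, and the register offset and value are assumed to match apicdef.h.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_EOI      0xB0
#define APIC_EOI_ACK  0x0

/* Hypothetical 256-bit in-service register, one bit per vector. */
static uint32_t model_isr[8];

/* The CPU starts servicing a vector: set its in-service bit. */
void model_accept(unsigned vector)
{
    assert(vector < 256);
    model_isr[vector / 32] |= 1u << (vector % 32);
}

/* Is this vector currently marked in service? */
int model_in_service(unsigned vector)
{
    assert(vector < 256);
    return (model_isr[vector / 32] >> (vector % 32)) & 1u;
}

/*
 * Model of a write to the EOI register: clear the highest in-service
 * vector and return it, or return -1 if nothing was in service.
 */
int model_eoi_write(uint32_t reg, uint32_t v)
{
    assert(reg == APIC_EOI);
    (void)reg;
    (void)v; /* the written value is ignored by this model */

    for (int word = 7; word >= 0; word--) {
        for (int bit = 31; bit >= 0; bit--) {
            if (model_isr[word] & (1u << bit)) {
                model_isr[word] &= ~(1u << bit);
                return word * 32 + bit;
            }
        }
    }
    return -1;
}
```

If vector 0xfd is in service together with a lower vector, the first model_eoi_write(APIC_EOI, APIC_EOI_ACK) retires 0xfd and leaves the lower one pending its own EOI.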
