Help us understand the problem. What is going on with this article?

スタックフレームのアライメント要件

More than 3 years have passed since last update.

x86(i386)アーキテクチャ向けにGCCが出力するコードで、スタックフレームが16byteアライメントされる理由を求めて。1

※ 本記事の正確性は保証できません。誤り指摘や追加情報があればコメントまでお願いします。

経緯

  1. SysV ABI当初ではワード(4byte)アライメントのみが要請されていた。
  2. MMX, SSEなどのSIMD命令登場により、スタック変数でも16byteアライメントが要請されるケースが発生。またアライメント必須でない命令であってもパフォーマンス的には16byteアライメントが好ましい。
  3. Mac OS X i386 ABIでは16byteアライメントを明確に要請。
  4. GCCでサポートするLinux/BSD/OSX間の一貫性の観点から16byteアライメントを採用。
  5. 新しいSysV psABIでは16byteアライメントに変更。

GCC Bugzilla

ABI仕様

x86-64

System V Application Binary Interface, AMD64 Architecture Processor Supplement, Draft Version 0.99.6

3.2.2 The Stack Frame
In addition to registers, each function has a frame on the run-time stack. This stack grows downwards from high addresses. Figure 3.3 shows the stack organization. The end of the input argument area shall be aligned on a 16 (32, if __m256 is passed on stack) byte boundary. In other words, the value (%rsp + 8) is always a multiple of 16 (32) when control is transferred to the function entry point. The stack pointer, %rsp, always points to the end of the latest allocated stack frame.

i386

SYSTEM V APPLICATION BINARY INTERFACE, Intel386 Architecture Processor Supplement

Registers and the Stack Frame
Some registers have assigned roles in the standard calling sequence:
%esp: The stack pointer holds the limit of the current stack frame, which is the address of the stack’s bottom-most, valid word. At all times, the stack pointer should point to a word-aligned area.

OS X ABI Function Call Guide, IA-32 Function Calling Conventions

The function calling conventions used in the IA-32 environment are the same as those used in the System V IA-32 ABI, with the following exceptions:

  • The stack is 16-byte aligned at the point of function calls

System V Application Binary Interface, Intel386 Architecture Processor Supplement, Version 1.1

2.2.2 The Stack Frame
In addition to registers, each function has a frame on the run-time stack. This stack grows downwards from high addresses. Table 2.2 shows the stack organization.
The end of the input argument area shall be aligned on a 16 (32 or 64, if __m256 or __m512 is passed on stack) byte boundary. In other words, the value (%esp + 4) is always a multiple of 16 (32 or 64) when control is transferred to the function entry point. The stack pointer, %esp, always points to the end of the latest allocated stack frame.


  1. teratail ELFの関数プロローグについて への回答がきっかけ 

Why not register and get more from Qiita?
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away
Comments
Sign up for free and join this conversation.
If you already have a Qiita account
Why do not you register as a user and use Qiita more conveniently?
You need to log in to use this function. Qiita can be used more conveniently after logging in.
You seem to be reading articles frequently this month. Qiita can be used more conveniently after logging in.
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away