
Kernel Self-Protection (1/2)

Last updated at 2020-05-07, posted at 2020-05-06

The original text is part of the Linux kernel source tree, so as far as I understand it is covered by GPLv2.

https://www.kernel.org/doc/html/latest/index.html

Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.

https://www.kernel.org/doc/html/latest/process/license-rules.html#kernel-licensing

Kernel Self-Protection

Kernel self-protection is the design and implementation of systems and structures within the Linux kernel to protect against security flaws in the kernel itself.

Kernel self-protection means the systems and structures designed and implemented inside the Linux kernel to guard against security flaws in the kernel itself.

This covers a wide range of issues, including removing entire classes of bugs, blocking security flaw exploitation methods, and actively detecting attack attempts.

The scope is broad: eliminating whole classes of bugs, blocking methods of exploiting security flaws, and actively detecting attack attempts.

Not all topics are explored in this document, but it should serve as a reasonable starting point and answer any frequently asked questions. (Patches welcome, of course!)

This document does not cover every topic, but it should be a reasonable starting point and should answer the frequently asked questions. (Patches are welcome, of course!)

In the worst-case scenario, we assume an unprivileged local attacker has arbitrary read and write access to the kernel’s memory.

As the worst case, we assume an unprivileged local attacker who can read and write kernel memory arbitrarily.

In many cases, bugs being exploited will not provide this level of access, but with systems in place that defend against the worst case we’ll cover the more limited cases as well.

In many cases the bug being exploited does not give this level of access. Still, a system that defends against the worst case also covers the more limited cases.

A higher bar, and one that should still be kept in mind, is protecting the kernel against a privileged local attacker, since the root user has access to a vastly increased attack surface.

A higher bar, which should also be kept in mind, is protecting the kernel from a privileged local attacker, because the root user can reach a vastly larger attack surface.

(Especially when they have the ability to load arbitrary kernel modules.)

(This is especially true when they can load arbitrary kernel modules.)

The goals for successful self-protection systems would be that they are effective, on by default, require no opt-in by developers, have no performance impact, do not impede kernel debugging, and have tests.

A successful self-protection system is effective, enabled by default, requires no opt-in from developers, has no performance impact, does not impede kernel debugging, and has tests.

It is uncommon that all these goals can be met, but it is worth explicitly mentioning them, since these aspects need to be explored, dealt with, and/or accepted.

Meeting all of these goals at once is rare, but stating them explicitly is worthwhile, because each aspect has to be explored, dealt with, and/or accepted.

Attack Surface Reduction

The most fundamental defense against security exploits is to reduce the areas of the kernel that can be used to redirect execution.

The most basic defense against exploits is to shrink the parts of the kernel that can be used to redirect execution.

This ranges from limiting the exposed APIs available to userspace, making in-kernel APIs hard to use incorrectly, minimizing the areas of writable kernel memory, etc.

This includes limiting the APIs exposed to userspace, making in-kernel APIs hard to misuse, minimizing the areas of writable kernel memory, and so on.

Strict kernel memory permissions

When all of kernel memory is writable, it becomes trivial for attacks to redirect execution flow.

If all kernel memory is writable, redirecting the execution flow becomes trivial for an attacker.

To reduce the availability of these targets the kernel needs to protect its memory with a tight set of permissions.

To reduce the number of such targets, the kernel has to protect its own memory with a tight set of permissions.

Executable code and read-only data must not be writable

Any areas of the kernel with executable memory must not be writable.

No kernel region that contains executable memory may be writable.

While this obviously includes the kernel text itself, we must consider all additional places too: kernel modules, JIT memory, etc.

This obviously includes the kernel text itself, but every additional location must be considered as well: kernel modules, JIT memory, and so on.

(There are temporary exceptions to this rule to support things like instruction alternatives, breakpoints, kprobes, etc.

(There are temporary exceptions to this rule in order to support instruction alternatives, breakpoints, kprobes, and the like.

If these must exist in a kernel, they are implemented in a way where the memory is temporarily made writable during the update, and then returned to the original permissions.)

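When the kernel needs these, they are implemented so that the memory is made writable only for the duration of the update and is then returned to its original permissions.)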

In support of this are CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX, which seek to make sure that code is not writable, data is not executable, and read-only data is neither writable nor executable.

CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX support this. They aim to ensure that code is not writable, data is not executable, and read-only data is neither writable nor executable.

Most architectures have these options on by default and not user selectable.

On most architectures these options are on by default and cannot be changed by the user.

For some architectures like arm that wish to have these be selectable, the architecture Kconfig can select ARCH_OPTIONAL_KERNEL_RWX to enable a Kconfig prompt.

Some architectures, such as arm, want these options to be selectable. In that case the architecture's Kconfig can select ARCH_OPTIONAL_KERNEL_RWX, which enables a Kconfig prompt.

CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT determines the default setting when ARCH_OPTIONAL_KERNEL_RWX is enabled.

When ARCH_OPTIONAL_KERNEL_RWX is enabled, CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT decides the default setting.

Function pointers and sensitive variables must not be writable

Vast areas of kernel memory contain function pointers that are looked up by the kernel and used to continue execution

Large areas of kernel memory contain function pointers that the kernel looks up and uses to continue execution

(e.g. descriptor/vector tables, file/network/etc operation structures, etc).

(for example, descriptor/vector tables, file/network/etc. operation structures, and so on).

The number of these variables must be reduced to an absolute minimum.

The number of such variables must be cut to the absolute minimum.

Many such variables can be made read-only by setting them “const” so that they live in the .rodata section instead of the .data section of the kernel, gaining the protection of the kernel’s strict memory permissions as described above.

Many of these variables can be made read-only by declaring them const. They then live in the kernel's .rodata section instead of .data and gain the protection of the strict memory permissions described above.
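As a rough illustration of the const approach, here is a minimal userspace sketch of an operations table. The names (demo_ops, demo_open, demo_close, demo_get_ops) are hypothetical and only loosely modeled on kernel structures such as file operation tables. In a typical toolchain a const object like this is placed in a read-only section, though the exact section depends on the compiler and linker.

```c
/* Hypothetical operations table; every name here is illustrative only. */
struct demo_ops {
    int (*open)(int id);
    int (*close)(int id);
};

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/* Declaring the table const lets the toolchain place it in read-only data
 * instead of writable data, so the function pointers cannot be overwritten
 * once read-only permissions are enforced on that section. */
static const struct demo_ops demo_default_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* Callers only ever receive a pointer-to-const. */
const struct demo_ops *demo_get_ops(void)
{
    return &demo_default_ops;
}
```

A caller would use it as demo_get_ops()->open(id). Any attempt to assign to a member through that pointer is rejected at compile time.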

For variables that are initialized once at __init time, these can be marked with the (new and under development) __ro_after_init attribute.

Variables that are initialized once at __init time can be marked with the (new and under development) __ro_after_init attribute.

What remains are variables that are updated rarely (e.g. GDT).

What remains are the variables that are updated only rarely (for example, the GDT).

These will need another infrastructure (similar to the temporary exceptions made to kernel code mentioned above) that allow them to spend the rest of their lifetime read-only.

These need a separate mechanism, similar to the temporary exceptions for kernel code mentioned above, that lets them stay read-only for the rest of their lifetime.

(For example, when being updated, only the CPU thread performing the update would be given uninterruptible write access to the memory.)

(For example, during an update, only the CPU thread performing the update would get uninterruptible write access to the memory.)
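The idea of a mostly read-only variable with a short write window can be sketched in userspace with mmap and mprotect. This is only an analogy under stated assumptions: the demo_setting_* names are hypothetical, and mprotect changes permissions for every thread in the process, whereas the text above envisions write access for the updating CPU thread only.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical rarely-updated setting kept in its own page. */
static long *demo_setting;
static size_t demo_page_size;

/* Allocate one page, store the initial value, then drop write permission. */
int demo_setting_init(long initial)
{
    demo_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, demo_page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    demo_setting = page;
    *demo_setting = initial;
    return mprotect(demo_setting, demo_page_size, PROT_READ);
}

/* Open a short write window, update the value, and restore read-only. */
int demo_setting_update(long value)
{
    if (mprotect(demo_setting, demo_page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *demo_setting = value;
    return mprotect(demo_setting, demo_page_size, PROT_READ);
}

/* Reads never need the write window. */
long demo_setting_read(void)
{
    return *demo_setting;
}
```

Outside demo_setting_update, a stray write to the page faults instead of silently changing the value.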

Segregation of kernel memory from userspace memory

The kernel must never execute userspace memory.

Code located in userspace memory must never be executed by the kernel.

The kernel must also never access userspace memory without explicit expectation to do so.

Nor may the kernel access userspace memory unless that access is explicitly intended.

These rules can be enforced either by support of hardware-based restrictions (x86’s SMEP/SMAP, ARM’s PXN/PAN) or via emulation (ARM’s Memory Domains).

These rules can be enforced with hardware-based restrictions (SMEP/SMAP on x86, PXN/PAN on ARM) or through emulation (Memory Domains on ARM).

By blocking userspace memory in this way, execution and data parsing cannot be passed to trivially-controlled userspace memory, forcing attacks to operate entirely in kernel memory.

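With userspace memory blocked in this way, neither execution nor data parsing can be handed to easily controlled userspace memory, so an attack is forced to operate entirely within kernel memory.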

Reduced access to syscalls

One trivial way to eliminate many syscalls for 64-bit systems is building without CONFIG_COMPAT. However, this is rarely a feasible scenario.

On 64-bit systems, one simple way to remove many syscalls is to build without CONFIG_COMPAT, although this is rarely feasible in practice.

The “seccomp” system provides an opt-in feature made available to userspace, which provides a way to reduce the number of kernel entry points available to a running process.

The "seccomp" system is an opt-in feature offered to userspace. It gives a running process a way to reduce the number of kernel entry points available to it.

This limits the breadth of kernel code that can be reached, possibly reducing the availability of a given bug to an attack.

This limits how much kernel code can be reached, which may make a given bug less available to an attack.
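To make the opt-in nature of seccomp concrete, the following is a minimal sketch of a process restricting itself. It assumes a Linux system with seccomp filter support and the kernel UAPI headers installed. The function name demo_seccomp_blocks_getppid and the choice of getppid as the denied syscall are purely illustrative, and a production filter would also validate the architecture field of seccomp_data, which this sketch omits.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that installs a filter denying getppid(2) with EPERM.
 * Returns 0 if the child observed the denial, non-zero otherwise. */
int demo_seccomp_blocks_getppid(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sock_filter filter[] = {
            /* Load the syscall number from the seccomp data. */
            BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                     offsetof(struct seccomp_data, nr)),
            /* getppid: fall through to the deny rule; otherwise skip it. */
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
            BPF_STMT(BPF_RET | BPF_K,
                     SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
            .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
            .filter = filter,
        };

        /* Needed so that an unprivileged process may install a filter. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
            _exit(2);
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
            _exit(3);

        /* The filtered syscall should now fail with EPERM. */
        errno = 0;
        long ret = syscall(SYS_getppid);
        _exit((ret == -1 && errno == EPERM) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The filter is installed in a forked child so that the restriction, which cannot be removed once applied, does not affect the calling process. Real deployments usually invert the logic and allow only a short list of syscalls.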

An area of improvement would be creating viable ways to keep access to things like compat, user namespaces, BPF creation, and perf limited only to trusted processes.

One area for improvement is finding workable ways to restrict access to things like compat, user namespaces, BPF creation, and perf to trusted processes only.

This would keep the scope of kernel entry points restricted to the more regular set normally available to unprivileged userspace.

That would keep the set of kernel entry points limited to the ordinary ones that unprivileged userspace can normally reach.

Restricting access to kernel modules

The kernel should never allow an unprivileged user the ability to load specific kernel modules, since that would provide a facility to unexpectedly extend the available attack surface.

The kernel should never let an unprivileged user load specific kernel modules, because that would give them a way to extend the available attack surface unexpectedly.

(The on-demand loading of modules via their predefined subsystems, e.g. MODULE_ALIAS_*, is considered “expected” here, though additional consideration should be given even to these.)

(Modules loaded on demand through their predefined subsystems, such as those declared with MODULE_ALIAS_*, count as expected here, although even these deserve additional scrutiny.)

For example, loading a filesystem module via an unprivileged socket API is nonsense: only the root or physically local user should trigger filesystem module loading.

For example, loading a filesystem module through an unprivileged socket API makes no sense: only root or a physically local user should be able to trigger the loading of a filesystem module.

(And even this can be up for debate in some scenarios.)

(Even this is open to debate in some scenarios.)

To protect against even privileged users, systems may need to either disable module loading entirely (e.g. monolithic kernel builds or modules_disabled sysctl), or provide signed modules (e.g. CONFIG_MODULE_SIG_FORCE, or dm-crypt with LoadPin), to keep from having root load arbitrary kernel code via the module loader interface.

To protect even against privileged users, a system may need to disable module loading entirely (for example, a monolithic kernel build or the modules_disabled sysctl) or to require signed modules (for example, CONFIG_MODULE_SIG_FORCE, or dm-crypt with LoadPin). Either approach keeps root from loading arbitrary kernel code through the module loader interface.

Memory integrity

There are many memory structures in the kernel that are regularly abused to gain execution control during an attack.

The kernel contains many memory structures that are regularly abused during attacks to gain control of execution.

By far the most commonly understood is that of the stack buffer overflow in which the return address stored on the stack is overwritten.

By far the best understood of these is the stack buffer overflow, in which the return address stored on the stack is overwritten.

Many other examples of this kind of attack exist, and protections exist to defend against them.

Many other attacks of this kind exist, and so do protections against them.

Stack buffer overflow

The classic stack buffer overflow involves writing past the expected end of a variable stored on the stack, ultimately writing a controlled value to the stack frame’s stored return address.

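In the classic stack buffer overflow, the attacker writes past the expected end of a variable stored on the stack and ultimately places a controlled value in the return address saved in the stack frame.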

The most widely used defense is the presence of a stack canary between the stack variables and the return address (CONFIG_STACKPROTECTOR), which is verified just before the function returns.

The most widely used defense is to place a stack canary between the stack variables and the return address (CONFIG_STACKPROTECTOR) and verify it just before the function returns.
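The canary check can be imitated by hand to show the principle. The sketch below is not how CONFIG_STACKPROTECTOR is implemented, since the compiler inserts the real canary and its check automatically. It only models a frame as a hypothetical struct (demo_frame) with a buffer followed by a guard value. The fixed DEMO_CANARY constant is an assumption for illustration. Real canaries are random.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CANARY 0x5AFEC0DEUL   /* fixed for the demo; real ones are random */

/* Hypothetical stand-in for a stack frame: a buffer, then the canary. */
struct demo_frame {
    char buf[16];
    unsigned long canary;
};

/* Copies len bytes into the frame without checking len against buf (the
 * bug), then verifies the canary the way a function epilogue would.
 * Returns true if the canary is still intact. */
bool demo_copy_and_check(const char *src, size_t len)
{
    struct demo_frame frame;

    frame.canary = DEMO_CANARY;
    memset(frame.buf, 0, sizeof(frame.buf));

    /* Clamp to the whole frame so the demo itself stays inside one object. */
    if (len > sizeof(frame))
        len = sizeof(frame);
    memcpy(&frame, src, len);   /* may run past buf and into the canary */

    return frame.canary == DEMO_CANARY;
}
```

A copy that fits in buf leaves the canary untouched, while a longer copy overwrites it and is detected before the function would return.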

Other defenses include things like shadow stacks.

Other defenses include techniques such as shadow stacks.

Stack depth overflow

A less well understood attack is using a bug that triggers the kernel to consume stack memory with deep function calls or large stack allocations.

A less well understood attack uses a bug that makes the kernel consume stack memory through deep function calls or large stack allocations.

With this attack it is possible to write beyond the end of the kernel’s preallocated stack space and into sensitive structures.

Such an attack can write past the end of the kernel's preallocated stack space and into sensitive structures.

Two important changes need to be made for better protections: moving the sensitive thread_info structure elsewhere, and adding a faulting memory hole at the bottom of the stack to catch these overflows.

Better protection needs two changes: moving the sensitive thread_info structure elsewhere, and adding a faulting memory hole at the bottom of the stack to catch these overflows.

Heap memory integrity

The structures used to track heap free lists can be sanity-checked during allocation and freeing to make sure they aren’t being used to manipulate other memory areas.

The structures that track heap free lists can be sanity-checked at allocation and free time to verify that they are not being used to manipulate other memory areas.

Counter integrity

Many places in the kernel use atomic counters to track object references or perform similar lifetime management.

Many places in the kernel use atomic counters to track object references or to do similar lifetime management.

When these counters can be made to wrap (over or under) this traditionally exposes a use-after-free flaw.

If such a counter can be made to wrap, either by overflow or by underflow, the usual result is a use-after-free flaw.

By trapping atomic wrapping, this class of bug vanishes.

Trapping atomic wrapping makes this class of bug disappear.
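One way to picture trapping a wrap is a reference counter that refuses any operation that would overflow or underflow. The sketch below uses C11 atomics in userspace. The demo_ref type and its functions are hypothetical and are not the kernel's own refcount_t API, which takes a related but different approach.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter that refuses to wrap. */
struct demo_ref {
    atomic_uint count;
};

/* Take a reference. Fails if the object is already dead (count == 0) or if
 * the increment would wrap past UINT_MAX. */
bool demo_ref_get(struct demo_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0 || old == UINT_MAX)
            return false;
    } while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));
    return true;
}

/* Drop a reference. Returns true when the caller released the last one and
 * should free the object. Refuses to go below zero. */
bool demo_ref_put(struct demo_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0)
            return false;   /* underflow attempt: reject instead of wrapping */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));
    return old == 1;
}
```

Because the counter can never pass through zero by wrapping, an attacker cannot use repeated get or put calls to make a live object look unreferenced.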

Size calculation overflow detection

Similar to counter overflow, integer overflows (usually size calculations) need to be detected at runtime to kill this class of bug, which traditionally leads to being able to write past the end of kernel buffers.

As with counter overflow, integer overflows (usually in size calculations) must be detected at runtime to eliminate this class of bug, which traditionally allows writes past the end of kernel buffers.
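A checked size calculation might look like the following userspace sketch. It relies on the __builtin_mul_overflow builtin provided by GCC and Clang. The wrapper names demo_array_size and demo_alloc_array are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Computes n * size into *bytes. Returns false if the product overflows. */
bool demo_array_size(size_t n, size_t size, size_t *bytes)
{
    return !__builtin_mul_overflow(n, size, bytes);
}

/* Allocation wrapper that fails cleanly instead of under-allocating when
 * the requested element count and element size overflow size_t. */
void *demo_alloc_array(size_t n, size_t size)
{
    size_t bytes;

    if (!demo_array_size(n, size, &bytes))
        return NULL;
    return malloc(bytes);
}
```

Without the check, a wrapped product would produce a buffer far smaller than the caller expects, and later writes would run past its end.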

Continued in part 2…

This is getting far too long, so the rest is in part 2.
