[golang] Understanding sharing up and sharing down, and learning when to use pointers

Last updated at 2024-12-21 / Posted at 2024-12-21

Conclusion

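When you do not need a pointer
- A map, slice, or string is internally a small header. A slice holds a pointer to its data plus a length and a capacity, a string holds a pointer to its data plus a length, and a map value is a reference to its underlying table. When one of these is used as a return value, argument, or method receiver, only that header is copied; the data itself is not copied and the same address keeps being used. There is no need to go out of your way to use a pointer.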
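- When a struct or fixed-length array (value types) is used as a return value, return it by value rather than by pointer (address), unless it is very large (details below).
  - Returning a pointer (address) causes sharing up: data that would otherwise live on the stack has to escape to the heap. Keep in mind that placing data on the heap (allocation, and later garbage collection) costs far more time than writing to the stack.
    - Explained in Verification (1).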
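  - For a struct or fixed-length array (value type) that is large enough to be placed on the heap even without sharing up, returning a pointer (address) will probably perform better. Returning the value itself triggers a full copy on the heap, which takes far longer than writing an address (8 bytes) to the stack.
    - In my tests, data of 64*1024 + 1 bytes or more is placed on the heap even in the normal case. That experiment used byte slices, not value types such as structs or fixed-length arrays, so I want to verify those later.
    - At or above that size, returning an address is probably better. For example, with a 100,000-byte struct, returning by value writes 100,000 bytes to the heap twice, whereas returning an address writes 100,000 bytes to the heap once and an address (8 bytes) to the stack. The latter is clearly the faster of the two.
    - Below that size, returning an address is probably best avoided. For example, with a 1,000-byte struct, returning by value writes 1,000 bytes to the stack twice, whereas returning an address writes 1,000 bytes to the heap and an address (8 bytes) to the stack. I cannot say for sure without measuring, but since placing data on the stack is far cheaper than placing it on the heap, returning by value is probably better here. I want to verify this later as well.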
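The first point in the list above (slices and maps share their data even when passed by value) can be illustrated with a minimal sketch. The helper names setFirst, appendOne, and setKey are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// setFirst receives a copy of the slice header (pointer, length, capacity).
// The copy still points at the caller's backing array, so the write is visible to the caller.
func setFirst(s []int, v int) {
	s[0] = v
}

// appendOne appends through its own copy of the slice header.
// The caller's header, and therefore the caller's length, is left unchanged.
func appendOne(s []int) {
	_ = append(s, 99)
}

// setKey receives a copy of the map value, which refers to the same underlying table.
func setKey(m map[string]int, k string, v int) {
	m[k] = v
}

func main() {
	nums := []int{1, 2, 3}
	setFirst(nums, 100)
	appendOne(nums)
	fmt.Println(nums, len(nums)) // [100 2 3] 3

	ages := map[string]int{}
	setKey(ages, "gopher", 15)
	fmt.Println(ages["gopher"]) // 15
}
```

Element writes reach the caller without any pointer. Only a change to the header itself (for example growing the slice with append) would need a pointer or a returned slice.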
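When you need a pointer
- To destructively update, inside a function or method, a value such as a struct that holds no internal pointers (passing it directly as a function argument or method receiver makes a full copy), you must pass a pointer as the argument or receiver.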
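As a minimal sketch of this case, the following program uses a hypothetical Point type (not part of the benchmark code in this article) to show that a by-value parameter changes only its copy, while a pointer parameter or pointer receiver changes the caller's value.

```go
package main

import "fmt"

// Point has no internal pointers, so passing it by value copies every field.
type Point struct {
	X, Y int
}

// moveByValue changes only its private copy of the Point.
func moveByValue(p Point, dx int) {
	p.X += dx
}

// moveByPointer changes the caller's Point through the pointer.
func moveByPointer(p *Point, dx int) {
	p.X += dx
}

// Reset has a pointer receiver, so it can mutate the value it is called on.
func (p *Point) Reset() {
	p.X, p.Y = 0, 0
}

func main() {
	pt := Point{X: 1, Y: 2}

	moveByValue(pt, 10)
	fmt.Println(pt) // {1 2}

	moveByPointer(&pt, 10)
	fmt.Println(pt) // {11 2}

	pt.Reset()
	fmt.Println(pt) // {0 0}
}
```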
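When a pointer is not strictly required but should still be used (several cases)
- Use a pointer to state explicitly that the method destructively modifies its receiver.
  - For example, a struct that internally holds a pointer to its data can update that data even when the receiver is passed by value rather than by pointer. Even so, using a pointer receiver to make it explicit that the method mutates the data is a good convention.
  - Conversely, when the receiver is not mutated, pass it by value rather than by pointer.
    - However, if that would copy a large amount of data, a pointer may be the better choice. This is your call: prioritize guaranteeing and signaling that no mutation happens, or prioritize performance. For example, you will face this dilemma when the receiver is a large struct or fixed-length array (value type) that the method does not mutate.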
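- When passing a large struct or fixed-length array (both are value types) as a function argument or method receiver, passing a pointer often improves performance.
  - Passing a pointer as a function argument or method receiver causes sharing down, not sharing up, so no unintended escape to the heap occurs and performance does not degrade.
    - Explained in Verification (2).
  - Although passing an address does not hurt performance, for readability there may be no need to go out of your way to use a pointer as an argument or receiver unless the data is large.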
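- When using a large struct or fixed-length array (both are value types) as a return value, returning a pointer may improve performance.
  - Returning a pointer for small data, as noted above, causes an unintended escape to the heap through sharing up and may actually reduce performance, so be careful.
  - If the data is large enough to be placed on the heap in the normal case, no unintended escape (sharing up) occurs because it was on the heap to begin with, so returning a pointer should improve performance.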
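The trade-off for return values can be probed with a small sketch that counts heap allocations instead of measuring time. The Record type (1,000 bytes) and the function names are assumptions made for this illustration, and testing.AllocsPerRun from the standard library does the counting. With the standard gc compiler the value-returning version should report no allocations and the pointer-returning version one per call, but that is worth confirming on your own toolchain.

```go
package main

import (
	"fmt"
	"testing"
)

// Record is a hypothetical 1,000-byte value type with no internal pointers.
type Record struct {
	Data [1000]byte
}

// newRecordValue returns the struct by value; the result is copied to the caller.
//
//go:noinline
func newRecordValue() Record {
	var r Record
	r.Data[0] = 1
	return r
}

// newRecordPointer returns the address of a local, which forces r to escape (sharing up).
//
//go:noinline
func newRecordPointer() *Record {
	var r Record
	r.Data[0] = 1
	return &r
}

// sink keeps the results observable so the calls are not optimized away.
var sink byte

// allocsReturnValue reports the average number of heap allocations per call.
func allocsReturnValue() float64 {
	return testing.AllocsPerRun(1000, func() {
		r := newRecordValue()
		sink += r.Data[0]
	})
}

// allocsReturnPointer reports the average number of heap allocations per call.
func allocsReturnPointer() float64 {
	return testing.AllocsPerRun(1000, func() {
		r := newRecordPointer()
		sink += r.Data[0]
	})
}

func main() {
	fmt.Println("return by value, allocs per call:  ", allocsReturnValue())
	fmt.Println("return by pointer, allocs per call:", allocsReturnPointer())
}
```

Allocation counts do not settle the question of speed for very large types. For that, a realistic benchmark of both versions is still needed.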

Verification

(1) Sharing up

Comparing the benchmarks for (a) and (b), (a) is clearly slower.

The environment is 64-bit, so int has the same size as int64 (8 bytes).

(a)
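- MyFunc
  - Writes mynum (8 bytes) to the heap (it is placed on the heap instead of the stack so that it stays reachable from outside the scope).
  - Copies the address of mynum (8 bytes) to the stack. PraEscape1 uses this.
- PraEscape1
  - Because of the copy above, myPointer (8 bytes) is already on the stack.
  - Copies *myPointer (8 bytes) to the stack. BenchmarkPraEscape1 uses this.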
(b)
- MyFunc
  - Places mynum (8 bytes) on the stack.
  - Copies mynum (8 bytes) to the stack. PraEscape1 uses this.
- PraEscape1
  - Because of the copy above, numberMy (8 bytes) is already on the stack.
  - Copies numberMy (8 bytes) to the stack. BenchmarkPraEscape1 uses this.

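Both (a) and (b) write a total of 8×3=24 bytes to memory, exactly the same amount.
Yet the benchmark shows (a) taking clearly longer (17.37 ns/op versus 1.850 ns/op). The cause is the heap: (a) performs one heap allocation per call (8 B/op, 1 allocs/op), and placing data on the heap is far slower than writing to the stack.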

Note: MyFunc in the code below is so short that, without go:noinline, the optimizer inlines it and no escape to the heap occurs. For this experiment, go:noinline is used to force the escape to the heap.
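The effect of go:noinline described in this note can be observed with a small standalone sketch, separate from the article's files. The function names are hypothetical. testing.AllocsPerRun counts heap allocations per call. The version marked go:noinline should allocate once per call. The unmarked version is usually inlined by the gc compiler, so its result depends on the compiler's inlining decisions and no exact figure is claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

// newIntInlinable has no directive, so the compiler is free to inline it.
// Once inlined, the pointer may never leave the caller's frame.
func newIntInlinable() *int {
	n := 1
	return &n
}

// newIntNoInline is the same function, but inlining is forbidden,
// so n has to outlive the call and escapes to the heap.
//
//go:noinline
func newIntNoInline() *int {
	n := 1
	return &n
}

// sink keeps the results observable so the calls are not optimized away.
var sink int

// allocsInlinable reports heap allocations per call of the inlinable version.
func allocsInlinable() float64 {
	return testing.AllocsPerRun(1000, func() {
		p := newIntInlinable()
		sink += *p
	})
}

// allocsNoInline reports heap allocations per call of the go:noinline version.
func allocsNoInline() float64 {
	return testing.AllocsPerRun(1000, func() {
		p := newIntNoInline()
		sink += *p
	})
}

func main() {
	fmt.Println("inlinable, allocs per call:  ", allocsInlinable())
	fmt.Println("go:noinline, allocs per call:", allocsNoInline())
}
```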

main.go

package main

func main() {
}

pra_escape_test.go

package main

import "testing"

var globalInt int

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
func BenchmarkPraEscape1(b *testing.B) {
	var localInt int
	for i := 0; i < b.N; i++ {
		localInt = PraEscape1()
	}
	globalInt = localInt
}

(a)

pra_escape.go

package main

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
//
//go:noinline
func PraEscape1() int {
	myPointer := MyFunc()
	return *myPointer
}

//go:noinline
func MyFunc() *int {
	mynum := 1
	return &mynum
}

~/praprago$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/XXX/praprago
cpu: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
BenchmarkPraEscape1-2           68502208                17.37 ns/op            8 B/op          1 allocs/op
PASS
ok      github.com/XXX/praprago 2.123s

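Because an address is returned, mynum is moved to the heap so that it can still be accessed after the scope ends (sharing up).
As the line ./pra_escape.go:14:2: mynum escapes to heap: shows, the compiler knows in advance that the address will be returned, so mynum is placed on the heap right at mynum := 1.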

~/praprago$ go build -gcflags "-m=2" .
# github.com/XXX/praprago
./main.go:3:6: can inline main with cost 0 as: func() {  }
./pra_escape.go:13:6: cannot inline MyFunc: marked go:noinline
./pra_escape.go:7:6: cannot inline PraEscape1: marked go:noinline
./pra_escape.go:14:2: mynum escapes to heap:
./pra_escape.go:14:2:   flow: ~r0 = &mynum:
./pra_escape.go:14:2:     from &mynum (address-of) at ./pra_escape.go:15:9
./pra_escape.go:14:2:     from return &mynum (return) at ./pra_escape.go:15:2
./pra_escape.go:14:2: moved to heap: mynum
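One way to read "moved to heap: mynum" is that the function behaves roughly as if the variable had been created with new(int). The sketch below uses the hypothetical names addressOfLocal and explicitNew. It illustrates the observable behavior only and makes no claim about the exact code the compiler generates.

```go
package main

import "fmt"

// addressOfLocal mirrors MyFunc from version (a): it returns the address of a local.
//
//go:noinline
func addressOfLocal() *int {
	mynum := 1
	return &mynum
}

// explicitNew spells out the same idea with an explicit allocation.
//
//go:noinline
func explicitNew() *int {
	p := new(int)
	*p = 1
	return p
}

func main() {
	a, b := addressOfLocal(), addressOfLocal()
	c := explicitNew()

	fmt.Println(*a, *b, *c) // 1 1 1
	fmt.Println(a != b)     // true: every call gets its own variable

	// The variable outlives the call that created it, so it can still be updated here.
	*a = 42
	fmt.Println(*a, *b) // 42 1
}
```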

(b)

pra_escape.go

package main

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
//
//go:noinline
func PraEscape1() int {
	numberMy := MyFunc()
	return numberMy
}

//go:noinline
func MyFunc() int {
	mynum := 1
	return mynum
}

~/praprago$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/XXX/praprago
cpu: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
BenchmarkPraEscape1-2           645292551                1.850 ns/op           0 B/op          0 allocs/op
PASS
ok      github.com/XXX/praprago 1.387s

~/praprago$ go build -gcflags "-m=2" .
# github.com/XXX/praprago
./main.go:3:6: can inline main with cost 0 as: func() {  }
./pra_escape.go:13:6: cannot inline MyFunc: marked go:noinline
./pra_escape.go:7:6: cannot inline PraEscape1: marked go:noinline

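(2) Sharing down

As described above, using a pointer as a return value can cause an unintended escape to the heap (sharing up) and may actually reduce performance.

On the other hand, using a pointer as a function argument or method receiver basically never makes performance worse.
When the caller passes a pointer to data on its own stack as an argument, the callee can reach that stack data through the pointer without the data having to escape to the heap.
While the callee is running, the caller's scope is still alive, so the caller's stack region remains accessible (sharing down).

For completeness, let's trace the memory writes.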
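The reasoning above can be sketched by counting heap allocations with testing.AllocsPerRun. The functions addOne and keep are hypothetical. addOne only uses the pointer while it runs, so the caller's variable can stay on the stack. keep stores the pointer in a package-level variable, which is the kind of exception the word "basically" leaves room for: the pointed-to data must then outlive the caller and escapes even though the pointer was passed down.

```go
package main

import (
	"fmt"
	"testing"
)

// addOne uses the pointer only during the call (plain sharing down).
//
//go:noinline
func addOne(p *int) {
	*p++
}

// leaked outlives every call, so anything stored in it must live on the heap.
var leaked *int

// keep stores the pointer somewhere that outlives the caller's frame.
//
//go:noinline
func keep(p *int) {
	leaked = p
}

// allocsSharingDown reports heap allocations per call when the callee does not retain the pointer.
func allocsSharingDown() float64 {
	return testing.AllocsPerRun(1000, func() {
		n := 10
		addOne(&n)
	})
}

// allocsLeakingCallee reports heap allocations per call when the callee retains the pointer.
func allocsLeakingCallee() float64 {
	return testing.AllocsPerRun(1000, func() {
		n := 10
		keep(&n)
	})
}

func main() {
	fmt.Println("sharing down, allocs per call:         ", allocsSharingDown())
	fmt.Println("callee keeps pointer, allocs per call: ", allocsLeakingCallee())
}
```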

(a)
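- PraEscape1
  - Writes numMyA (8 bytes) to the stack.
  - Writes numMyB (8 bytes) to the stack.
  - Copies the address of numMyA (8 bytes) and the address of numMyB (8 bytes) to the stack. MyFunc uses these.
- MyFunc
  - Because of the copy above, a (8 bytes) and b (8 bytes) are already on the stack.
  - Writes mynum (8 bytes) to the stack.
  - Copies mynum (8 bytes) to the stack. PraEscape1 uses this.
- PraEscape1
  - Because of the copy above, numberMy (8 bytes) is already on the stack.
  - Copies numberMy (8 bytes) to the stack. BenchmarkPraEscape1 uses this.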
(b)
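- PraEscape1
  - Writes numMyA (8 bytes) to the stack.
  - Writes numMyB (8 bytes) to the stack.
  - Copies numMyA (8 bytes) and numMyB (8 bytes) to the stack. MyFunc uses these.
- MyFunc
  - Because of the copy above, a (8 bytes) and b (8 bytes) are already on the stack.
  - Writes mynum (8 bytes) to the stack.
  - Copies mynum (8 bytes) to the stack. PraEscape1 uses this.
- PraEscape1
  - Because of the copy above, numberMy (8 bytes) is already on the stack.
  - Copies numberMy (8 bytes) to the stack. BenchmarkPraEscape1 uses this.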

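(a) writes a total of 8×7=56 bytes to the stack, and (b) also writes a total of 8×7=56 bytes to the stack.
(a) and (b) write exactly the same amount of data to the same memory region (the stack), so there is no large difference in execution time (2.957 ns/op versus 2.278 ns/op).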

main.go

package main

func main() {
}

pra_escape_test.go

package main

import "testing"

var globalInt int

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
func BenchmarkPraEscape1(b *testing.B) {
	var localInt int
	for i := 0; i < b.N; i++ {
		localInt = PraEscape1()
	}
	globalInt = localInt
}

(a)

pra_escape.go

package main

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
//
//go:noinline
func PraEscape1() int {
	numMyA := 10
	numMyB := 20
	numberMy := MyFunc(&numMyA, &numMyB)
	return numberMy
}

//go:noinline
func MyFunc(a, b *int) int {
	mynum := *a + *b
	return mynum
}

~/praprago$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/XXX/praprago
cpu: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
BenchmarkPraEscape1-2           406213046                2.957 ns/op           0 B/op          0 allocs/op
PASS
ok      github.com/XXX/praprago 1.506s

~/praprago$ go build -gcflags "-m=2" .
# github.com/XXX/praprago
./main.go:3:6: can inline main with cost 0 as: func() {  }
./pra_escape.go:15:6: cannot inline MyFunc: marked go:noinline
./pra_escape.go:7:6: cannot inline PraEscape1: marked go:noinline
./pra_escape.go:15:13: a does not escape
./pra_escape.go:15:16: b does not escape

(b)

pra_escape.go

package main

// //////////////////////////////////////////////////
// ////////////////////////////////////////////////// (1)
//
//go:noinline
func PraEscape1() int {
	numMyA := 10
	numMyB := 20
	numberMy := MyFunc(numMyA, numMyB)
	return numberMy
}

//go:noinline
func MyFunc(a, b int) int {
	mynum := a + b
	return mynum
}

~/praprago$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/XXX/praprago
cpu: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
BenchmarkPraEscape1-2           529656027                2.278 ns/op           0 B/op          0 allocs/op
PASS
ok      github.com/XXX/praprago 1.442s

~/praprago$ go build -gcflags "-m=2" .
# github.com/XXX/praprago
./main.go:3:6: can inline main with cost 0 as: func() {  }
./pra_escape.go:15:6: cannot inline MyFunc: marked go:noinline
./pra_escape.go:7:6: cannot inline PraEscape1: marked go:noinline

References

Receiver type
A method receiver can be passed either as a value or a pointer, just as if it were a regular function parameter. The choice between the two is based on which method set(s) the method should be a part of.

Correctness wins over speed or simplicity. There are cases where you must use a pointer value. In other cases, pick pointers for large types or as future-proofing if you don’t have a good sense of how the code will grow, and use values for simple plain old data.

The list below spells out each case in further detail:

If the receiver is a slice and the method doesn’t reslice or reallocate the slice, use a value rather than a pointer.
// Good:
type Buffer []byte

func (b Buffer) Len() int { return len(b) }
If the method needs to mutate the receiver, the receiver must be a pointer.
// Good:
type Counter int

func (c *Counter) Inc() { *c++ }

// See https://pkg.go.dev/container/heap.
type Queue []Item

func (q *Queue) Push(x Item) { *q = append([]Item{x}, *q...) }
If the receiver is a struct containing fields that cannot safely be copied, use a pointer receiver. Common examples are sync.Mutex and other synchronization types.
// Good:
type Counter struct {
    mu    sync.Mutex
    total int
}

func (c *Counter) Inc() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.total++
}
Tip: Check the type’s Godoc for information about whether it is safe or unsafe to copy.

If the receiver is a “large” struct or array, a pointer receiver may be more efficient. Passing a struct is equivalent to passing all of its fields or elements as arguments to the method. If that seems too large to pass by value, a pointer is a good choice.

For methods that will call or run concurrently with other functions that modify the receiver, use a value if those modifications should not be visible to your method; otherwise use a pointer.

If the receiver is a struct or array, any of whose elements is a pointer to something that may be mutated, prefer a pointer receiver to make the intention of mutability clear to the reader.
// Good:
type Counter struct {
    m *Metric
}

func (c *Counter) Inc() {
    c.m.Add(1)
}
If the receiver is a built-in type, such as an integer or a string, that does not need to be modified, use a value.
// Good:
type User string

func (u User) String() string { return string(u) }
If the receiver is a map, function, or channel, use a value rather than a pointer.
// Good:
// See https://pkg.go.dev/net/http#Header.
type Header map[string][]string

func (h Header) Add(key, value string) { /* omitted */ }
If the receiver is a “small” array or struct that is naturally a value type with no mutable fields and no pointers, a value receiver is usually the right choice.
// Good:
// See https://pkg.go.dev/time#Time.
type Time struct { /* omitted */ }

func (t Time) Add(d Duration) Time { /* omitted */ }
When in doubt, use a pointer receiver.

As a general guideline, prefer to make the methods for a type either all pointer methods or all value methods.

Note: There is a lot of misinformation about whether passing a value or a pointer to a function can affect performance. The compiler can choose to pass pointers to values on the stack as well as copying values on the stack, but these considerations should not outweigh the readability and correctness of the code in most circumstances. When the performance does matter, it is important to profile both approaches with a realistic benchmark before deciding that one approach outperforms the other.

This is where Go Compiler kicks in 🚩
Sharing down of the variables (passing references) typically stays on the stack. This is because, the GO Compiler takes the decision whether a referenced variable needs to stay on the stack or on the heap.

sharing up (returning pointers) typically escapes to the Heap.

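This article covers sharing up and sharing down in detail and is very easy to follow.
It even explains carefully that when a function's scope ends, the stack region it used becomes invalid but is not cleaned up right away. That said, nothing goes wrong if you assume the memory is cleaned up as soon as the scope ends, and thinking of it that way may make things easier to understand.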
