[golang] Using unsafe.StringData to convert a string to []byte without copying (casting)

Posted at 2024-12-15

Purpose

Learn the caveats of unsafe.StringData.

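Verification

Converting (casting) a string to []byte, and the reverse, copies the data.
For performance, however, there are cases where you want to skip the copy and share the same memory region.
Previously there was no established best practice, and people apparently converted with s := *(*string)(unsafe.Pointer(&b)). Go 1.20 then added unsafe.String, unsafe.StringData, and unsafe.SliceData to the unsafe package, which settled the best practice.
Using this article as a reference, I checked how unsafe behaves.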
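
As a rough illustration of the difference described above, the following sketch compares the ordinary conversion with the unsafe one by checking where the bytes live. The helper names (viaCopy, viaUnsafe, sameBacking) are made up for this example and are not part of any library, and the shared view is only read, never written.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// viaCopy is the ordinary conversion: the result has its own backing array.
func viaCopy(s string) []byte {
	return []byte(s)
}

// viaUnsafe returns a []byte that shares the bytes of s (needs Go 1.20+).
// The result must be treated as read-only.
func viaUnsafe(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// sameBacking reports whether b starts at the same address as the bytes of s.
func sameBacking(s string, b []byte) bool {
	return len(s) > 0 && len(b) > 0 && unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	s := strings.Repeat("A", 10)

	copied := viaCopy(s)
	copied[0] = 'b' // fine: this only changes the copy

	shared := viaUnsafe(s) // no copy; do not write to it

	fmt.Println(s, string(copied))      // AAAAAAAAAA bAAAAAAAAA
	fmt.Println(sameBacking(s, copied)) // false
	fmt.Println(sameBacking(s, shared)) // true
}
```

The copy is written to on purpose: a conversion whose result is never modified may be optimized by the compiler, so the write makes sure a real copy is being compared.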

main.go
package main

import "fmt"

// go run .
func main() {
	fmt.Println(" ============================== (1)")
	pra_unsafe()
	fmt.Println(" ============================== (2)")
	// pra_unsafe2() // commented out because it crashes
	fmt.Println(" ============================== (3)")
	pra_unsafe3()
	fmt.Println(" ============================== (4)")
	pra_unsafe4()
	fmt.Println(" ============================== (5)")
	// pra_unsafe5() // commented out because it crashes
}

pra_unsafe.go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// //////////////////////////////////////////////////////////
// ////////////////////////////////////////////////////////// (1)
//
//go:noinline
func pra_unsafe() {
	myString := "Hello,漢World"
	myString2 := "Hello,漢World"
	myString3 := "Hello,World!!"

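	// myString and myString2 are fixed at compile time and are exactly the same string, so they share the same memory region.
	// In general (for example in C), constant initialized data such as string literals and const globals is placed in the read-only data segment, while writable initialized globals go to the data segment.
	//   Go has no concept of static variables, but strings fixed at compile time like the ones above are placed in the read-only data segment.
	// Strings are immutable in Go, so sharing the memory region causes no problem at all.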

	fmt.Printf("myString address: %p\n", &myString)                        // address of the variable itself
	fmt.Printf("myString data address: %p\n", unsafe.StringData(myString)) // address of the data; same address as myString2

	fmt.Printf("myString2 address: %p\n", &myString2)                        // address of the variable itself
	fmt.Printf("myString2 data address: %p\n", unsafe.StringData(myString2)) // address of the data; same address as myString

	fmt.Printf("myString3 address: %p\n", &myString3)                        // address of the variable itself
	fmt.Printf("myString3 data address: %p\n", unsafe.StringData(myString3)) // address of the data
}

// //////////////////////////////////////////////////////////
// ////////////////////////////////////////////////////////// (2)
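// s is a string literal, so its bytes live in the read-only data segment, and unsafe.StringData(s) returns a pointer into that segment; taking the pointer does not move the bytes to the heap.
// Writing through that pointer (b[0] = 'h') is therefore a write to read-only memory, and the program dies with a fatal fault (a segmentation violation on Linux).
// That is why this code crashes. The unsafe.StringData documentation also states that the returned bytes must not be modified.
//
// In the escape analysis output, nothing is reported as escaping at the unsafe.Slice line, while s is reported as escaping to the heap at fmt.Println.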
//
//go:noinline
func pra_unsafe2() {
	s := "hello"
	b := unsafe.Slice(unsafe.StringData(s), len(s))
	b[0] = 'h'
	fmt.Println(s)
}

// //////////////////////////////////////////////////////////
// ////////////////////////////////////////////////////////// (3)
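// As the escape analysis also confirms, written this way the string is apparently placed on the heap.
// Personally, I think code like this should not be written.
// The Go language specification does not define whether a value is placed on the stack or on the heap, so this may change in a future version.
// Even with the code below, if a version upgrade moved the string's bytes from the heap to somewhere like the read-only data segment, this code would crash.
//
// To me, it feels very wrong that (2) crashes while (3) does not.
// From the developer's point of view, the string s in (2) and in (3) is exactly the same data.
// Behavior that changes depending on where the data sits in memory is a nuisance for developers.
// That would be understandable in a language like C++, where the developer consciously decides where data is placed, but Go's official documentation says outright that you do not need to care where values are stored.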
//
//go:noinline
func pra_unsafe3() {
	s := string([]byte("hello"))
	b := unsafe.Slice(unsafe.StringData(s), len(s))
	b[0] = 'w'
	fmt.Println(s)
}

// //////////////////////////////////////////////////////////
// ////////////////////////////////////////////////////////// (4)
// strings.Repeat allocates memory dynamically, so s and s2 do not share a memory region.
// Therefore, modifying one string does not affect the other.
//
//go:noinline
func pra_unsafe4() {
	s := strings.Repeat("A", 10)
	s2 := strings.Repeat("A", 10)
	b := unsafe.Slice(unsafe.StringData(s), len(s))
	b[0] = 'b'
	fmt.Println(s)
	fmt.Println(s2)
}

// //////////////////////////////////////////////////////////
// ////////////////////////////////////////////////////////// (5)
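// The code below crashes.
// Escape analysis shows that s escapes to the heap at the first fmt.Println(s).
// Since s escapes to the heap at the first fmt.Println(s), you might think unsafe.Slice can then work on string data that lives on the heap, but that is not the case.
// What escapes when s is passed to the function is the string value (its pointer and length) boxed into an interface; the bytes themselves are not copied, and s in the caller still refers to the original string data.
// In other words, in the caller the string's bytes are still in the .rodata section (probably), so writing to them crashes.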

//
//go:noinline
func pra_unsafe5() {
	s := "CCCCCCCCCC"
	fmt.Println(s)
	b := unsafe.Slice(unsafe.StringData(s), len(s))
	b[0] = 'b'
	fmt.Println(s)
}

~/praprago$ go run .
 ============================== (1)
myString address: 0xc0000140b0
myString data address: 0x4b2487
myString2 address: 0xc0000140c0
myString2 data address: 0x4b2487
myString3 address: 0xc0000140d0
myString3 data address: 0x4b21bc
 ============================== (2)
 ============================== (3)
wello
 ============================== (4)
bAAAAAAAAA
AAAAAAAAAA
 ============================== (5)
~/praprago$ go build -gcflags "-m=2" .
# github.com/XXX/praprago
./pra_unsafe.go:13:6: cannot inline pra_unsafe: marked go:noinline
./pra_unsafe.go:62:6: cannot inline pra_unsafe3: marked go:noinline
./pra_unsafe.go:75:6: cannot inline pra_unsafe4: marked go:noinline
./main.go:6:6: cannot inline main: function too complex: cost 567 exceeds budget 80
./pra_unsafe.go:42:6: cannot inline pra_unsafe2: marked go:noinline
./pra_unsafe.go:92:6: cannot inline pra_unsafe5: marked go:noinline
./main.go:7:13: inlining call to fmt.Println
./main.go:9:13: inlining call to fmt.Println
./main.go:11:13: inlining call to fmt.Println
./main.go:13:13: inlining call to fmt.Println
./main.go:15:13: inlining call to fmt.Println
./pra_unsafe.go:23:12: inlining call to fmt.Printf
./pra_unsafe.go:24:12: inlining call to fmt.Printf
./pra_unsafe.go:26:12: inlining call to fmt.Printf
./pra_unsafe.go:27:12: inlining call to fmt.Printf
./pra_unsafe.go:29:12: inlining call to fmt.Printf
./pra_unsafe.go:30:12: inlining call to fmt.Printf
./pra_unsafe.go:46:13: inlining call to fmt.Println
./pra_unsafe.go:66:13: inlining call to fmt.Println
./pra_unsafe.go:80:13: inlining call to fmt.Println
./pra_unsafe.go:81:13: inlining call to fmt.Println
./pra_unsafe.go:94:13: inlining call to fmt.Println
./pra_unsafe.go:97:13: inlining call to fmt.Println
./pra_unsafe.go:16:2: myString3 escapes to heap:
./pra_unsafe.go:16:2:   flow: {storage for ... argument} = &myString3:
./pra_unsafe.go:16:2:     from &myString3 (address-of) at ./pra_unsafe.go:29:40
./pra_unsafe.go:16:2:     from &myString3 (interface-converted) at ./pra_unsafe.go:29:40
./pra_unsafe.go:16:2:     from ... argument (slice-literal-element) at ./pra_unsafe.go:29:12
./pra_unsafe.go:16:2:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:16:2:     from ... argument (spill) at ./pra_unsafe.go:29:12
./pra_unsafe.go:16:2:     from fmt.format, fmt.a := "myString3 address: %p\n", ... argument (assign-pair) at ./pra_unsafe.go:29:12
./pra_unsafe.go:16:2:   flow: {heap} = *fmt.a:
./pra_unsafe.go:16:2:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./pra_unsafe.go:29:12
./pra_unsafe.go:15:2: myString2 escapes to heap:
./pra_unsafe.go:15:2:   flow: {storage for ... argument} = &myString2:
./pra_unsafe.go:15:2:     from &myString2 (address-of) at ./pra_unsafe.go:26:40
./pra_unsafe.go:15:2:     from &myString2 (interface-converted) at ./pra_unsafe.go:26:40
./pra_unsafe.go:15:2:     from ... argument (slice-literal-element) at ./pra_unsafe.go:26:12
./pra_unsafe.go:15:2:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:15:2:     from ... argument (spill) at ./pra_unsafe.go:26:12
./pra_unsafe.go:15:2:     from fmt.format, fmt.a := "myString2 address: %p\n", ... argument (assign-pair) at ./pra_unsafe.go:26:12
./pra_unsafe.go:15:2:   flow: {heap} = *fmt.a:
./pra_unsafe.go:15:2:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./pra_unsafe.go:26:12
./pra_unsafe.go:14:2: myString escapes to heap:
./pra_unsafe.go:14:2:   flow: {storage for ... argument} = &myString:
./pra_unsafe.go:14:2:     from &myString (address-of) at ./pra_unsafe.go:23:39
./pra_unsafe.go:14:2:     from &myString (interface-converted) at ./pra_unsafe.go:23:39
./pra_unsafe.go:14:2:     from ... argument (slice-literal-element) at ./pra_unsafe.go:23:12
./pra_unsafe.go:14:2:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:14:2:     from ... argument (spill) at ./pra_unsafe.go:23:12
./pra_unsafe.go:14:2:     from fmt.format, fmt.a := "myString address: %p\n", ... argument (assign-pair) at ./pra_unsafe.go:23:12
./pra_unsafe.go:14:2:   flow: {heap} = *fmt.a:
./pra_unsafe.go:14:2:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./pra_unsafe.go:23:12
./pra_unsafe.go:14:2: moved to heap: myString
./pra_unsafe.go:15:2: moved to heap: myString2
./pra_unsafe.go:16:2: moved to heap: myString3
./pra_unsafe.go:23:12: ... argument does not escape
./pra_unsafe.go:24:12: ... argument does not escape
./pra_unsafe.go:26:12: ... argument does not escape
./pra_unsafe.go:27:12: ... argument does not escape
./pra_unsafe.go:29:12: ... argument does not escape
./pra_unsafe.go:30:12: ... argument does not escape
./pra_unsafe.go:66:14: s escapes to heap:
./pra_unsafe.go:66:14:   flow: {storage for ... argument} = &{storage for s}:
./pra_unsafe.go:66:14:     from s (spill) at ./pra_unsafe.go:66:14
./pra_unsafe.go:66:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:66:13
./pra_unsafe.go:66:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:66:14:     from ... argument (spill) at ./pra_unsafe.go:66:13
./pra_unsafe.go:66:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:66:13
./pra_unsafe.go:66:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:66:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:66:13
./pra_unsafe.go:63:20: string(([]byte)("hello")) escapes to heap:
./pra_unsafe.go:63:20:   flow: s = &{storage for string(([]byte)("hello"))}:
./pra_unsafe.go:63:20:     from string(([]byte)("hello")) (spill) at ./pra_unsafe.go:63:20
./pra_unsafe.go:63:20:     from s := string(([]byte)("hello")) (assign) at ./pra_unsafe.go:63:4
./pra_unsafe.go:63:20:   flow: {storage for s} = s:
./pra_unsafe.go:63:20:     from s (interface-converted) at ./pra_unsafe.go:66:14
./pra_unsafe.go:63:20: string(([]byte)("hello")) escapes to heap
./pra_unsafe.go:63:21: ([]byte)("hello") does not escape
./pra_unsafe.go:63:21: zero-copy string->[]byte conversion
./pra_unsafe.go:66:13: ... argument does not escape
./pra_unsafe.go:66:14: s escapes to heap
./pra_unsafe.go:81:14: s2 escapes to heap:
./pra_unsafe.go:81:14:   flow: {storage for ... argument} = &{storage for s2}:
./pra_unsafe.go:81:14:     from s2 (spill) at ./pra_unsafe.go:81:14
./pra_unsafe.go:81:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:81:13
./pra_unsafe.go:81:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:81:14:     from ... argument (spill) at ./pra_unsafe.go:81:13
./pra_unsafe.go:81:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:81:13
./pra_unsafe.go:81:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:81:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:81:13
./pra_unsafe.go:80:14: s escapes to heap:
./pra_unsafe.go:80:14:   flow: {storage for ... argument} = &{storage for s}:
./pra_unsafe.go:80:14:     from s (spill) at ./pra_unsafe.go:80:14
./pra_unsafe.go:80:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:80:13
./pra_unsafe.go:80:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:80:14:     from ... argument (spill) at ./pra_unsafe.go:80:13
./pra_unsafe.go:80:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:80:13
./pra_unsafe.go:80:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:80:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:80:13
./pra_unsafe.go:80:13: ... argument does not escape
./pra_unsafe.go:80:14: s escapes to heap
./pra_unsafe.go:81:13: ... argument does not escape
./pra_unsafe.go:81:14: s2 escapes to heap
./main.go:15:14: " ============================== (5)" escapes to heap:
./main.go:15:14:   flow: {storage for ... argument} = &{storage for " ============================== (5)"}:
./main.go:15:14:     from " ============================== (5)" (spill) at ./main.go:15:14
./main.go:15:14:     from ... argument (slice-literal-element) at ./main.go:15:13
./main.go:15:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:15:14:     from ... argument (spill) at ./main.go:15:13
./main.go:15:14:     from fmt.a := ... argument (assign-pair) at ./main.go:15:13
./main.go:15:14:   flow: {heap} = *fmt.a:
./main.go:15:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:15:13
./main.go:13:14: " ============================== (4)" escapes to heap:
./main.go:13:14:   flow: {storage for ... argument} = &{storage for " ============================== (4)"}:
./main.go:13:14:     from " ============================== (4)" (spill) at ./main.go:13:14
./main.go:13:14:     from ... argument (slice-literal-element) at ./main.go:13:13
./main.go:13:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:13:14:     from ... argument (spill) at ./main.go:13:13
./main.go:13:14:     from fmt.a := ... argument (assign-pair) at ./main.go:13:13
./main.go:13:14:   flow: {heap} = *fmt.a:
./main.go:13:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:13:13
./main.go:11:14: " ============================== (3)" escapes to heap:
./main.go:11:14:   flow: {storage for ... argument} = &{storage for " ============================== (3)"}:
./main.go:11:14:     from " ============================== (3)" (spill) at ./main.go:11:14
./main.go:11:14:     from ... argument (slice-literal-element) at ./main.go:11:13
./main.go:11:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:11:14:     from ... argument (spill) at ./main.go:11:13
./main.go:11:14:     from fmt.a := ... argument (assign-pair) at ./main.go:11:13
./main.go:11:14:   flow: {heap} = *fmt.a:
./main.go:11:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:11:13
./main.go:9:14: " ============================== (2)" escapes to heap:
./main.go:9:14:   flow: {storage for ... argument} = &{storage for " ============================== (2)"}:
./main.go:9:14:     from " ============================== (2)" (spill) at ./main.go:9:14
./main.go:9:14:     from ... argument (slice-literal-element) at ./main.go:9:13
./main.go:9:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:9:14:     from ... argument (spill) at ./main.go:9:13
./main.go:9:14:     from fmt.a := ... argument (assign-pair) at ./main.go:9:13
./main.go:9:14:   flow: {heap} = *fmt.a:
./main.go:9:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:9:13
./main.go:7:14: " ============================== (1)" escapes to heap:
./main.go:7:14:   flow: {storage for ... argument} = &{storage for " ============================== (1)"}:
./main.go:7:14:     from " ============================== (1)" (spill) at ./main.go:7:14
./main.go:7:14:     from ... argument (slice-literal-element) at ./main.go:7:13
./main.go:7:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:7:14:     from ... argument (spill) at ./main.go:7:13
./main.go:7:14:     from fmt.a := ... argument (assign-pair) at ./main.go:7:13
./main.go:7:14:   flow: {heap} = *fmt.a:
./main.go:7:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:7:13
./main.go:7:13: ... argument does not escape
./main.go:7:14: " ============================== (1)" escapes to heap
./main.go:9:13: ... argument does not escape
./main.go:9:14: " ============================== (2)" escapes to heap
./main.go:11:13: ... argument does not escape
./main.go:11:14: " ============================== (3)" escapes to heap
./main.go:13:13: ... argument does not escape
./main.go:13:14: " ============================== (4)" escapes to heap
./main.go:15:13: ... argument does not escape
./main.go:15:14: " ============================== (5)" escapes to heap
./pra_unsafe.go:46:14: s escapes to heap:
./pra_unsafe.go:46:14:   flow: {storage for ... argument} = &{storage for s}:
./pra_unsafe.go:46:14:     from s (spill) at ./pra_unsafe.go:46:14
./pra_unsafe.go:46:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:46:13
./pra_unsafe.go:46:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:46:14:     from ... argument (spill) at ./pra_unsafe.go:46:13
./pra_unsafe.go:46:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:46:13
./pra_unsafe.go:46:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:46:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:46:13
./pra_unsafe.go:46:13: ... argument does not escape
./pra_unsafe.go:46:14: s escapes to heap
./pra_unsafe.go:97:14: s escapes to heap:
./pra_unsafe.go:97:14:   flow: {storage for ... argument} = &{storage for s}:
./pra_unsafe.go:97:14:     from s (spill) at ./pra_unsafe.go:97:14
./pra_unsafe.go:97:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:97:13
./pra_unsafe.go:97:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:97:14:     from ... argument (spill) at ./pra_unsafe.go:97:13
./pra_unsafe.go:97:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:97:13
./pra_unsafe.go:97:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:97:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:97:13
./pra_unsafe.go:94:14: s escapes to heap:
./pra_unsafe.go:94:14:   flow: {storage for ... argument} = &{storage for s}:
./pra_unsafe.go:94:14:     from s (spill) at ./pra_unsafe.go:94:14
./pra_unsafe.go:94:14:     from ... argument (slice-literal-element) at ./pra_unsafe.go:94:13
./pra_unsafe.go:94:14:   flow: fmt.a = &{storage for ... argument}:
./pra_unsafe.go:94:14:     from ... argument (spill) at ./pra_unsafe.go:94:13
./pra_unsafe.go:94:14:     from fmt.a := ... argument (assign-pair) at ./pra_unsafe.go:94:13
./pra_unsafe.go:94:14:   flow: {heap} = *fmt.a:
./pra_unsafe.go:94:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./pra_unsafe.go:94:13
./pra_unsafe.go:94:13: ... argument does not escape
./pra_unsafe.go:94:14: s escapes to heap
./pra_unsafe.go:97:13: ... argument does not escape
./pra_unsafe.go:97:14: s escapes to heap
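
The "s escapes to heap" lines above are about the string value being converted to an interface for fmt.Println, not about the bytes of the string. A minimal sketch of that point, assuming two illustrative helpers (box and dataPtr, both hypothetical names): boxing a string into an interface leaves the data pointer unchanged.

```go
package main

import (
	"fmt"
	"unsafe"
)

// box converts s to an interface value, as happens to the arguments of fmt.Println.
//
//go:noinline
func box(s string) any {
	return s
}

// dataPtr returns the address of the first byte of s.
func dataPtr(s string) *byte {
	return unsafe.StringData(s)
}

func main() {
	s := "CCCCCCCCCC"
	boxed := box(s)

	// The interface holds its own copy of the string header (pointer and length),
	// but that header still points at the same bytes as s.
	fmt.Println(dataPtr(boxed.(string)) == dataPtr(s)) // true
}
```

This only reads pointers, so it is safe to run even though the bytes of s sit in read-only memory.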

References

unsafe related

type _string struct {
  elements *byte // underlying bytes
  len      int   // number of bytes
}

As we saw, indexing a string yields its bytes, not its characters: a string is just a bunch of bytes.
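
To make the quoted sentence concrete, here is a small sketch using the same literal as myString in (1). The helper byteAndRuneLen is a name invented for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen returns the length of s in bytes and in Unicode code points.
func byteAndRuneLen(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Hello,漢World"

	nBytes, nRunes := byteAndRuneLen(s)
	fmt.Println(nBytes, nRunes) // 14 12

	// Indexing and slicing work on bytes: 漢 occupies three bytes in UTF-8.
	fmt.Printf("% x\n", s[6:9]) // e6 bc a2

	// Converting to []rune gives access by character instead.
	fmt.Printf("%q\n", []rune(s)[6]) // '漢'
}
```

This is also why unsafe.Slice(unsafe.StringData(s), len(s)) is the right length: len(s) counts bytes, which is what the pointer returned by unsafe.StringData refers to.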

type stringStruct struct {
  str unsafe.Pointer
  len int
}

A string type represents the set of string values.

Call      Argument type    Result

len(s)    string type      string length in bytes
          [n]T, *[n]T      array length (== n)
          []T              slice length
          map[K]T          map length (number of defined keys)
          chan T           number of elements queued in channel buffer
          type parameter   see below

cap(s)    [n]T, *[n]T      array length (== n)
          []T              slice capacity
          chan T           channel buffer capacity
          type parameter   see below

string is the set of all strings of 8-bit bytes, conventionally but not necessarily representing UTF-8-encoded text. A string may be empty, but not nil. Values of string type are immutable.

func Clone(s string) string

type StringHeader struct {
	Data uintptr
	Len  int
}

StringHeader is the runtime representation of a string. It cannot be used safely or portably and its representation may change in a later release. Moreover, the Data field is not sufficient to guarantee the data it references will not be garbage collected, so programs must keep a separate, correctly typed pointer to the underlying data.
Deprecated: Use unsafe.String or unsafe.StringData instead.

// StringHeader is the runtime representation of a string.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
//
// Deprecated: Use unsafe.String or unsafe.StringData instead.
type StringHeader struct {
	Data uintptr
	Len  int
}

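However, even while understanding that strings are immutable, there are cases where you do not want a needless allocation just to turn a byte slice into a string. Because Go had not spelled this out in its documentation until now, various idioms sprang up. The following is a representative one.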

s := *(*string)(unsafe.Pointer(&b))

Internally, a Go byte slice is actually managed by a struct called SliceHeader.

type SliceHeader struct {
    Data uintptr
    Len  int
    Cap  int
}

That is why unsafe.StringData, unsafe.String, and unsafe.SliceData have now been added.

func String(ptr *byte, len IntegerType) string
func StringData(str string) *byte
func SliceData(slice []ArbitraryType) *ArbitraryType
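
A minimal sketch of how these three functions can replace the older pointer-cast idiom, under the assumption that the caller owns the buffer and stops modifying it after the conversion. The helper names (bytesToString, stringToBytes, shares) are hypothetical and chosen only for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString returns a string that shares memory with b (no copy).
// The caller must not modify b afterwards, because strings are immutable.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes returns a []byte that shares memory with s (no copy).
// The result must never be written to.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// shares reports whether s and b start at the same address.
func shares(s string, b []byte) bool {
	return len(s) > 0 && len(b) > 0 && unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	buf := []byte("hello")  // a buffer this function owns
	s := bytesToString(buf) // no copy; buf must not be modified from here on

	fmt.Println(s)              // hello
	fmt.Println(shares(s, buf)) // true

	view := stringToBytes(s)        // read-only view of the same bytes
	fmt.Println(len(view), view[0]) // 5 104
}
```

Going from a buffer you own to a string is the less fragile direction, since nothing ever writes to memory that the runtime may have placed in a read-only section.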

As of go 1.22, for string to bytes conversion, we can replace the usage of unsafe.Slice(unsafe.StringData(s), len(s)) with type casting []bytes(str), without the worry of losing performance.
As of go 1.22, string to bytes conversion []bytes(str) is faster than using the unsafe package. Both methods have 0 memory allocation now.
I saw at least two places in the codebase still using the unsafe way:

This raises the question of whether casting (copying) with []byte(str) might actually be faster than using unsafe.Slice(unsafe.StringData(s), len(s)).
However, it turned out to be a measurement error.
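
One way to sanity-check the allocation side of that claim is to count allocations instead of timing. The sketch below is not the measurement from the quoted issue; it is an illustration that stores each result in a package-level variable (sink, an assumption of this example) so that the slice escapes. The "zero-copy string->[]byte conversion" line in the compiler output earlier applies to a slice that does not escape, which this setup deliberately avoids.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
	"unsafe"
)

// sink is package-level so the converted slice escapes and the compiler
// cannot drop or shortcut the conversion.
var sink []byte

// allocsCopy reports the average number of allocations per []byte(s) conversion.
func allocsCopy(s string) float64 {
	return testing.AllocsPerRun(100, func() {
		sink = []byte(s)
	})
}

// allocsUnsafe reports the same figure for the unsafe conversion.
func allocsUnsafe(s string) float64 {
	return testing.AllocsPerRun(100, func() {
		sink = unsafe.Slice(unsafe.StringData(s), len(s))
	})
}

func main() {
	s := strings.Repeat("x", 1024) // built at run time, 1 KiB

	fmt.Println("copy:  ", allocsCopy(s))
	fmt.Println("unsafe:", allocsUnsafe(s))
}
```

With an escaping result like this, I would expect the copying conversion to report at least one allocation per run and the unsafe one to report none, so a benchmark that shows both at zero is probably measuring a case where the compiler avoided the copy.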

*(*string)(unsafe.Pointer(&data))

seems to be the fastest. That settles it!

Now that unsafe.StringData, unsafe.String, and unsafe.SliceData have been implemented, this idiom is no longer used.

The Go language takes responsibility for arranging the storage of Go values; in most cases, a Go developer need not care about where these values are stored, or why, if at all. In practice, however, these values often need to be stored in computer physical memory and physical memory is a finite resource. Because it is finite, memory must be managed carefully and recycled in order to avoid running out of it while executing a Go program. It's the job of a Go implementation to allocate and recycle memory as needed.

From a correctness standpoint, you don’t need to know. Each variable in Go exists as long as there are references to it. The storage location chosen by the implementation is irrelevant to the semantics of the language.

Data Segment, Read-Only Data Segment, and BSS related

I would like to try this out sometime.

  • Initialized data, read only. Not only reserved for constant string, but for all types of const global data. It is not necessarily in the data section, it can be in the text section of the program (normally the .rodata segment), as it is normally not modifiable by a program.
  • Initialized data, read write. Normally in the data section of the program (.data segment).
  • Uninitialized data, read write. Normally in the data section of the program. The difference with the previous is that the executable doesn't include its contents, only the size, as they are initialized to a fixed known value (zero) and there is not a table of initializers in the executable file. Normally, the compiler/linker constructs a segment for this purpose in which it accumulates only the size required by the component modules (the .bss segment).