Swift's Hashable does not cache hash values unless you implement caching yourself

Swift

Posted at 2022-12-22, last updated at 2022-12-22

A quick note. I had a truly embarrassing misconception about Swift's Hashable.

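Since Swift 4.2, the requirement you implement when conforming to Hashable has changed from var hashValue: Int to func hash(into hasher: inout Hasher). The point of the change was, naturally, to make producing hash values easier and more efficient: when you write a custom Hashable conformance, you no longer need to think about the hashing algorithm itself.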
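
For reference, a hand-written conformance in the hash(into:) style might look like the sketch below. The type Point and its fields are hypothetical and chosen only for illustration; the idea is that you feed the relevant components to the Hasher and leave the choice of hash function to the standard library.

```swift
// Hypothetical type for illustration; it does not appear in the benchmark.
struct Point: Hashable {
    var x: Int
    var y: Int

    // Equality compares exactly the fields that are hashed below.
    static func == (lhs: Point, rhs: Point) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y
    }

    // Feed each essential component to the hasher. How the components are
    // mixed into the final value is up to the standard library.
    func hash(into hasher: inout Hasher) {
        hasher.combine(x)
        hasher.combine(y)
    }
}

let a = Point(x: 1, y: 2)
let b = Point(x: 1, y: 2)
let points: Set<Point> = [a, b]   // equal values collapse into one element
print("distinct points:", points.count)
```

Nothing in this conformance stores the result anywhere, so each request for a hash value runs hash(into:) again.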

For some reason, though, I had assumed that hashValue would also be cached internally through some lazy var-like mechanism. Perhaps that was just wishful thinking on my part.

To check, I wrote the following code:

import Foundation

struct Demo: Hashable {
    var p001: Int
    var p002: Int
    var p003: Int
    // ...
    var p100: Int
}

let demo1 = Demo(p001: 123, p002: 123, p003: 123, /* ... */, p100: 123)
let demo2 = Demo(p001: 123, p002: 123, p003: 123, /* ... */, p100: 123)

let loopTimesA = 0 ..< 1_000_000
let loopTimesB = 0 ..< 10_000_000

let date1 = Date()
for _ in loopTimesA {
    _ = demo1 == demo2
}
print("elapsed time of comparing raw in A times:", -date1.timeIntervalSinceNow)

let date2 = Date()
for _ in loopTimesB {
    _ = demo1 == demo2
}
print("elapsed time of comparing raw in B times:", -date2.timeIntervalSinceNow)

let date3 = Date()
for _ in loopTimesA {
    _ = demo1.hashValue == demo2.hashValue
}
print("elapsed time of comparing hash in A times:", -date3.timeIntervalSinceNow)

let date4 = Date()
for _ in loopTimesB {
    _ = demo1.hashValue == demo2.hashValue
}
print("elapsed time of comparing hash in B times:", -date4.timeIntervalSinceNow)

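The code above defines a type Demo with 100 properties, creates two instances demo1 and demo2 with identical values, and compares the values directly in two loops with different iteration counts. It then does the same for hashValue, again with two different iteration counts.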

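In my environment, without optimization, the elapsed time is proportional to the loop count for both the direct value comparison and the hashValue comparison. The direct comparison, however, is faster than the hashValue comparison (roughly five times faster in this run).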

% swift -Onone main.swift
elapsed time of comparing raw in A times: 0.2997390031814575
elapsed time of comparing raw in B times: 3.01657497882843
elapsed time of comparing hash in A times: 1.6205110549926758
elapsed time of comparing hash in B times: 16.24207305908203

What this shows is that, at least in this case, hashValue is not cached, and that computing it carries a noticeable cost of its own.
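
The same point can be checked without timing anything. The sketch below uses a hypothetical type Counted and a hypothetical helper class CallCounter (neither is part of the benchmark) to count how often hash(into:) runs when hashValue is read repeatedly on the same value.

```swift
// Hypothetical helper: a reference type so the count survives copies of the struct.
final class CallCounter {
    var count = 0
}

// Hypothetical type whose hash(into:) records every invocation.
struct Counted: Hashable {
    var id: Int
    let counter: CallCounter

    static func == (lhs: Counted, rhs: Counted) -> Bool {
        return lhs.id == rhs.id
    }

    func hash(into hasher: inout Hasher) {
        counter.count += 1        // record that hashing actually ran
        hasher.combine(id)
    }
}

let counter = CallCounter()
let item = Counted(id: 42, counter: counter)

// Read hashValue three times on the same, unchanged value.
_ = item.hashValue
_ = item.hashValue
_ = item.hashValue

print("hash(into:) calls:", counter.count)
```

If hashValue were cached, the counter would stop at one. Because every read runs hash(into:) again, it ends up equal to the number of reads.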

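Incidentally, with optimization enabled, the elapsed time for hashValue still grows in proportion to the loop count, so optimization does not add caching either. The direct value comparison, on the other hand, took essentially no time at all in the second loop. Since both values are declared with let, the optimizer probably ended up performing the comparison only once. Compared with the unoptimized run, the time spent on the hashValue comparison barely changed, so optimization appears to have very limited effect on hash computation.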

% swift -O main.swift
elapsed time of comparing raw in A times: 5.1021575927734375e-05
elapsed time of comparing raw in B times: -0.0
elapsed time of comparing hash in A times: 1.3289411067962646
elapsed time of comparing hash in B times: 13.25122594833374

Conclusion: for a type with many properties whose hash value is needed many times, it may be worth implementing the caching yourself if you actually need it.
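
One possible way to do that, as a minimal sketch rather than a recommended design: wrap the value in an immutable container that hashes it once at initialization. The name HashCached and its members are assumptions made for this example. The sketch relies on the wrapped value never changing and on hashValue being stable within a single run of the program.

```swift
// Hypothetical wrapper (not from the benchmark): hashes `value` once at
// initialization and reuses the stored result afterwards.
struct HashCached<Value: Hashable>: Hashable {
    let value: Value              // immutable, so the stored hash cannot go stale
    private let cachedHash: Int

    init(_ value: Value) {
        self.value = value
        self.cachedHash = value.hashValue   // the expensive step, done once
    }

    static func == (lhs: HashCached, rhs: HashCached) -> Bool {
        // Different hashes mean different values; equal hashes still
        // require a full comparison because collisions are possible.
        return lhs.cachedHash == rhs.cachedHash && lhs.value == rhs.value
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(cachedHash)  // feed one Int instead of every property
    }
}

let first = HashCached([1, 2, 3])
let second = HashCached([1, 2, 3])

var seen: Set<HashCached<[Int]>> = []
seen.insert(first)
print("second already seen:", seen.contains(second))
```

Reading hashValue on the wrapper still runs the hasher, but only over a single Int, so the cost no longer depends on how many properties the wrapped type has. The trade-off is one extra stored Int per value and the loss of in-place mutation.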
