More than 1 year has passed since last update.

Keccak256(SHA3-256) を実装したので備忘録

Last updated at 2022-03-28Posted at 2022-03-28

TL;DR

Keccak256というハッシュ関数をご存知でしょうか。Ethereumで使われているアレです。公開鍵からアドレスを導出するのに使われている奴です。SHA3-256とも呼ばれる奴です。
Pythonなどのいくつかの言語では標準のライブラリに入っていたりしますが、未実装の言語もまだまだ多いです。

Pythonだと標準で使える

>>> from hashlib import sha3_256
>>> sha3_256(b'Hello, World!!').hexdigest()
'a2590767a13b13c73ac7388ba21ea6403f9833e9436209da7baa67d9c6b259f5'

SHA3(Secure Hash Algorithm 3)と呼ばれていますが、SHA2のような以前のSHAとは根本的にアルゴリズムの設計が異なります。SHA-256とSHA3-256は似たような名前ですが、全くの別物です(対照的に、SHA1およびSHA2はSHA0の改良系です)。

本稿は、このSHA3-256を実装することで一定の理解を得ることを目的としています。

実装の前に

本稿では、Nim言語を実装に用います。また、Keccak256を実装するにあたって、あらかじめ以下の型を定義しておきます。

type
    HashWords {.union.} = object
        words: array[5, array[5, uint64]]
        bytes: array[200, uint8]

    HashState = ref object
        hw: HashWords
        reading: 0..135

アルゴリズム構造

Keccakの中核を成すのは、${\rm Keccak-}f_{[b]}$　という関数です。$b$はパラメータで、内部状態のサイズを表します。Keccak256においては$b = 1600_{\rm [bits]}$で、200バイトとなります。また、内部状態はあらかじめ0で初期化されます。

proc keccakF1600(state: var HashState) =
    ## keccak-f関数
    # 1回あたり24ラウンド処理する
    for i in 0..23:
        round(state, i)

${\rm Keccak-}f_{[b]}$ 関数は、内部状態に対して単純に$\rm round_{[{\it b}]}$関数を24回適用(計算)する関数です。
さらにこの${\rm round}_{[b]}$関数は、$\theta$過程、$\rho$過程、$\pi$過程、$\chi$過程および$\iota$過程の5段階で構成されています。

全体の流れを見てみます。Keccakは大きく分けて2つの処理過程を持つアルゴリズムです。
最初の段階は、absorbです。その名の通り、データを内部状態に吸収していきます。Keccak256においては、データは17個の64bit整数、つまり136バイトを1ブロックとして処理します。

absorb過程では、入力を1ブロックづつ内部状態とXORし、それを${\rm Keccak-}f_{[1600]}$にかけた結果で内部状態を上書きしていきます。

proc update(state: var HashState, bytes: openArray[uint8]) =
    ## absorb過程
    for c in bytes:
        # バイト単位で内部状態とxor
        state.hw.bytes[state.reading] = state.hw.bytes[state.reading] xor c
        # ブロックに書き込んだバイト数を更新する
        # 136バイトづつなので、135の次は0に戻る
        state.reading = if (state.reading == 135): 0 else: state.reading + 1

        # 1ブロック分読み込んだら keccak-f関数にかける
        if state.reading == 0:
            state.keccakF1600()

1ブロックに満たない場合パティングをおこないますが、丁度1ブロックとなる場合でもパディング自体は行われ、この場合は1ブロック分がまるまるパディングされます。
パディング処理は単純です。読み込んだバイト列の終端の1つ次(1ブロックちょうどであった場合は先頭)をある定数(Keccak256の場合は0x06)とし、ブロックの最後を0x80、それ以外の中間のバイト列を0x00とします。
例外的に、最後の入力データが135バイトあった場合、パディングとなる1バイトは定数と0x80をXORして求めたものとなります。

proc padding(state: var HashState) =
    state.hw.bytes[state.reading] = state.hw.bytes[state.reading] xor 0x06
    state.hw.bytes[135] = state.hw.bytes[135] xor 0x80

    state.keccakF1600()

さて、2つめの過程はsqeezeです。sqeeze過程では、実際のハッシュ値を求めます。とはいっても、Keccak256の場合は内部状態を先頭から32バイト分読み出すだけでハッシュ値がもとまります。136バイトを超える長さが必要な場合は、136バイトごとに${\rm Keccak-}f_{[b]}$関数を適用してまた先頭から読み出します。

proc digest(state: var HashState): string =
    ## ハッシュ値を計算する関数
    ## 一度計算すると内部状態が壊れ、再度初期化されます。
    # パディングする
    state.padding()

    # 戻り値を空文字列で初期化
    result = ""

    # 32バイト分出力文字列に追加
    for i in 0..31:
        result.add(cast[char](state.hw.bytes[i]))

    # 内部状態を再初期化
    state.reading = 0
    for i in 0..24:
        state.hw.words[i] = 0

これで処理の大まかな流れはおしまいです。

round関数

round関数の中身について説明します。この関数は、5つの過程から成ります。

Theta過程

この過程は、2つの内部変数を利用します。

var c: array[5, uint64]
var d: array[5, uint64]

$\theta$過程では、まず25個のuint64で表現される内部状態を5個のuint64に圧縮します。

for x in 0..4:
    c[x] = state.hw.words[0][x] xor state.hw.words[1][x] xor state.hw.words[2][x] xor state.hw.words[3][x] xor state.hw.words[4][x]

次に、それを攪拌します。

for i in 0..4:
    d[i] = c[(i + 4) mod 5] xor rot64(c[(i + 1) mod 5], 1)

ちょっとわかりにくいので、for文を展開してみます。

d[0] = c[4] xor rot64(c[1], 1)
d[1] = c[0] xor rot64(c[2], 1)
d[2] = c[1] xor rot64(c[3], 1)
d[3] = c[2] xor rot64(c[4], 1)
d[4] = c[3] xor rot64(c[0], 1)

rot64は64bit整数をnビット分左に「回す」処理です：

rot64

proc rot64(x: uint64, k: uint64): uint64 = (x shl k) or (x shr (64 - k))

たとえば、rot64(0x000000000000000F, 4)は0xF000000000000000です。

最後に、それを内部状態に戻せば終わりです。

for x in 0..4:
    for y in 0..4:
        state.hw.words[y][x] = state.hw.words[y][x] xor d[x]

Rho過程

$\rho$過程では、25個それぞれの内部状態をテーブルに従って「回し」ます。ここで利用するrot64関数は先に上げたものと同じです。

for x in 0..4:
    for y in 0..4:
        state.hw.words[y][x] = rot64(state.hw.words[y][x], rotOffsets[y][x])

オフセット(rotOffset[y][x])は以下のように定義されます。

const rotOffsets: array[25, uint64] = [
    [ 0'u64,  1'u64, 62'u64, 28'u64, 27'u64],
    [36'u64, 44'u64,  6'u64, 55'u64, 20'u64],
    [ 3'u64, 10'u64, 43'u64, 25'u64, 39'u64],
    [41'u64, 45'u64, 15'u64, 21'u64,  8'u64],
    [18'u64,  2'u64, 61'u64, 56'u64, 14'u64],
]

Pi過程

$\pi$過程では、内部状態をシャッフルします。

let copy: array[5, array[5, uint64]]

for x in 0..4:
    for y in 0..4:
        copy[y][x] = state.hw.words[y][x]

for x in 0..4:
    for y in 0..4:
        state.hw.words[(2*x + 3*y) mod 5][y] = copy[y][x]

どの要素がどこに移されるのかを有向グラフで表現してみました。$\pi$過程を1回行うごとに要素が1つずつ矢印の先へと移ります。

たとえば、state.hw.words[1][1]はstate.hw.words[1][0]に移ります。state.hw.words[0][0]やstate.hw.words[3][1]などは移動しません。

Chi過程

$\chi$過程では、それぞれの要素をもう一度攪拌します。

let copy: array[5, array[5, uint64]]

for x in 0..4:
    for y in 0..4:
        copy[y][x] = state.hw.words[y][x]

for x in 0..4:
    for y in 0..4:
        let val = (not copy[y][(x + 1) mod 5]) and b[y][(x + 2) mod 5]
        state.hw.words[y][x] = copy[y][x] xor val

andおよびnotはそれぞれbitwise and、bitwise notです。

Iota過程

最後に$\iota$過程です。この過程では、24ラウンドのそれぞれに固有の定数を利用します。
単純にこの定数と内部変数の先頭とxorして先頭に上書きすれば完了です。

state.hw.words[0][0] = state.hw.words[0][0] xor iotaConstants[roundN]

このときに利用する定数を以下に示します。

const iotaConstants: array[24, uint64] = [
        0x0000000000000001'u64, 0x0000000000008082'u64, 0x800000000000808a'u64,
        0x8000000080008000'u64, 0x000000000000808b'u64, 0x0000000080000001'u64,
        0x8000000080008081'u64, 0x8000000000008009'u64, 0x000000000000008a'u64,
        0x0000000000000088'u64, 0x0000000080008009'u64, 0x000000008000000a'u64,
        0x000000008000808b'u64, 0x800000000000008b'u64, 0x8000000000008089'u64,
        0x8000000000008003'u64, 0x8000000000008002'u64, 0x8000000000000080'u64,
        0x000000000000800a'u64, 0x800000008000000a'u64, 0x8000000080008081'u64,
        0x8000000000008080'u64, 0x0000000080000001'u64, 0x8000000080008008'u64,
]

round関数は以上です。

Keccak256の全体像

最後に、ここまで実装してきた内容をまとめます。
以下に示すコードでは、モジュールとして使えるようにするための微修正と、ちょっとした最適化が入っています。

type
    HashWords {.union.} = object
        words: array[25, uint64]
        bytes: array[200, uint8]

    HashState* = ref object
        hw: HashWords
        reading: 0..135

const iotaConstants: array[24, uint64] = [
        0x0000000000000001'u64, 0x0000000000008082'u64, 0x800000000000808a'u64,
        0x8000000080008000'u64, 0x000000000000808b'u64, 0x0000000080000001'u64,
        0x8000000080008081'u64, 0x8000000000008009'u64, 0x000000000000008a'u64,
        0x0000000000000088'u64, 0x0000000080008009'u64, 0x000000008000000a'u64,
        0x000000008000808b'u64, 0x800000000000008b'u64, 0x8000000000008089'u64,
        0x8000000000008003'u64, 0x8000000000008002'u64, 0x8000000000000080'u64,
        0x000000000000800a'u64, 0x800000008000000a'u64, 0x8000000080008081'u64,
        0x8000000000008080'u64, 0x0000000080000001'u64, 0x8000000080008008'u64,
]

const rotOffsets: array[25, uint64] = [
     0'u64,  1'u64, 62'u64, 28'u64, 27'u64,
    36'u64, 44'u64,  6'u64, 55'u64, 20'u64,
     3'u64, 10'u64, 43'u64, 25'u64, 39'u64,
    41'u64, 45'u64, 15'u64, 21'u64,  8'u64,
    18'u64,  2'u64, 61'u64, 56'u64, 14'u64,
]

const hexDigits = "0123456789abcdef"

template rot64(x: uint64, k: uint64): uint64 = (x shl k) or (x shr (64 - k))

template round(state: var HashState, iterations: int) =
    var b: array[25, uint64]
    var c: array[5, uint64]
    var d: array[5, uint64]

    for i in 0..4:
        c[i] = state.hw.words[i] xor state.hw.words[i + 5] xor state.hw.words[i + 10] xor state.hw.words[i + 15] xor state.hw.words[i + 20]

    for i in 0..4:
        d[i] = c[(i + 4) mod 5] xor rot64(c[(i + 1) mod 5], 1)

    for i in 0..24:
        state.hw.words[i] = state.hw.words[i] xor d[i mod 5]

    for x in 0..4:
        for y in 0..4:
            let i = x + 5*y
            b[((2*x + 3*y) mod 5) * 5 + y] = rot64(state.hw.words[i], rotOffsets[i])

    for x in 0..4:
        for y in 0..4:
            let i = x + 5*y
            state.hw.words[i] = b[i] xor ((not b[(x + 1) mod 5 + y*5]) and b[(x + 2) mod 5 + y*5])

    state.hw.words[0] = state.hw.words[0] xor iotaConstants[iterations]

template keccakF1600(state: var HashState) =
    for i in 0..23:
        round(state, i)

proc initState*(): HashState =
    ## initialize hash state
    runnableExamples:
        var state = initState()
        state.update("Hello, World!!")
        echo state.digest()

    new result
    for i in 0..24:
        result.hw.words[i] = 0
    result.reading = 0

proc update*(state: var HashState, bytes: openArray[uint8]) =
    ## feed data to hasher
    for c in bytes:
        state.hw.bytes[state.reading] = state.hw.bytes[state.reading] xor c
        state.reading = if (state.reading == 135): 0 else: state.reading + 1

        if state.reading == 0:
            state.keccakF1600()

proc update*[T: string or openArray[byte]](state: var HashState, bytes: T) =
    ## feed data to hasher
    for c in bytes:
        state.hw.bytes[state.reading] = state.hw.bytes[state.reading] xor cast[uint8](c)
        state.reading = if (state.reading == 135): 0 else: state.reading + 1

        if state.reading == 0:
            state.keccakF1600()

proc digest*(state: var HashState): string =
    ## calculate digest
    ## please note this procedure breaks the hasher internal state and re-initializes it.
    result = ""

    state.hw.bytes[state.reading] = state.hw.bytes[state.reading] xor 0x06
    state.hw.bytes[135] = state.hw.bytes[135] xor 0x80

    state.keccakF1600()

    for i in 0..31:
        result.add(cast[char](state.hw.bytes[i]))

    state.reading = 0
    for i in 0..24:
        state.hw.words[i] = 0

proc hex*(str: string): string =
    ## convert string to hexadecimal string
    result = ""

    for c in str:
        result.add(hexDigits[cast[uint8](c) shr 4])
        result.add(hexDigits[cast[uint8](c) and 0x0f])

when isMainModule:
    var state = initState()
    state.update("Hello, Nim-lang!!")
    let dig = state.digest().hex()
    assert dig == "0f3be9c96b48b5e4dd07da8e0141ba75ee3b6fcbc75cb1323225d09084344af1"

参考文献

Keccak specifications summary - Keccak Team Keccakの処理を疑似コードで解説しています
XKCP/Keccak-readable-and-compact.c at master · XKCP/XKCP KeccakのCで書かれたリファレンス実装

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up