More than 3 years have passed since last update.

JuliaAdvent Calendar 2021

@tohgarashi_hack(遠空址 (とおがらし))

駿台予備校広告の答え合わせをJuliaで複数やって速度比較する(文字列, 順列, 並べ替え)

Last updated at 2022-01-28Posted at 2022-01-26

はじめに

参考記事

これらの投稿を読んで件の駿台広告のことを知り、記事2のJuliaのコードをもっと速く出来るんじゃないか、どうせなら速度比較しようということでやってみました。

追記

2022/1/27 @antimon2 さんのコメントを受けて変更・追記。

環境

julia> versioninfo()
Julia Version 1.7.1
Commit ac5cc99908 (2021-12-22 19:35 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

Combinatorics v1.0.2 公式ドキュメント
BenchmarkTools v1.2.2 公式ドキュメント

ちなみに、VSCode上のJypyter Notebookで実行しました。

検証

準備

Combinatoricsをインストールしていない場合はインストールしておく必要があります。

pkg> add Combinatorics

あらかじめCombinatoricsを読み込んでおく。

using Combinatorics

また、速度計測について、BenchmarkToolsの@btimeマクロか@benchmarkマクロを使うと複数回実行して計測してくれるのでより確実です。
標準の@timeマクロは 1 回のみ実行なので、こちらの方が掛かる時間は短いです。

pkg> add BenchmarkTools
using BenchmarkTools

注：@timeと@btimeでは、計り方が違うのか複数回実行のせいなのか、@btimeの方が結果が小さい傾向があるようです。比較する場合は揃えましょう。

速度計測

記事2のコード(func1)

まずは記事2のコードを計測してみます。

6文字の順列を作り，重複を省いて，ソートして，100 番目の要素を出力する。

という手順になっています。

@time sort(unique(join(nthperm(split("GAKKOU", ""), i)) for i in 1:factorial(6)))[100]
    
  0.043128 seconds (50.76 k allocations: 3.024 MiB, 91.94% compilation time)
"GOKAKU"

Julia公式のPerformance Tipsには、

Any code that is performance critical should be inside a function. Code inside functions tends to run much faster than top level code, due to how Julia's compiler works.

（機械翻訳）

パフォーマンスが重要なコードは、関数の中に入れてください。関数の中のコードは、Juliaのコンパイラの働きにより、トップレベルのコードよりもはるかに高速に実行される傾向があります。

とあるので、関数として括って実行してみます。

func1(str::String, n::Int) = sort(unique(join(nthperm(split(str, ""), i)) for i in 1:factorial(length(str))))[n]
@time func1("GAKKOU", 100)
    
  0.000887 seconds (14.42 k allocations: 1.079 MiB)
"GOKAKU"

実際に40倍程度高速化しました。

@btimeだとこんな感じ。計測は10秒くらい掛かります。
623.977 μs = 0.000623 seconds

@btime func1("GAKKOU", 100)

  623.977 μs (14421 allocations: 1.08 MiB)
"GOKAKU"

以下では関数で括ったコードのみを示します。
また、@btimeで計測します。

func2

6文字の順列を作り，重複を省いて，ソートして，100 番目の要素を出力する。

という手順は変わっていませんが、nthperm()ではなくpermutations()を使用していることや、join()を目的の文字列のみに掛けていることが変更点です。

func2(str::String, n::Int) = join(sort(unique(collect(permutations(split(str, "")))))[n])
@btime func2("GAKKOU", 100)

  226.151 μs (2202 allocations: 279.47 KiB)
"GOKAKU"

func1()に対して2~3倍程度高速化しました。

func3

Combinatoricsには、「要素が重複している可能性のある選択肢:ベクトルA(長さn)」から全ての長さt(<=n)の順列(重複要素は区別しない)を生成するmultiset_permutations()という関数があります。
例えばこんな感じ。

collect(multiset_permutations([1,1,2],3))

3-element Vector{Vector{Int64}}:
 [1, 1, 2]
 [1, 2, 1]
 [2, 1, 1]

[1,1,2]が2つ生成されたりはしません。"GAKKOU"に'K'が2つ含まれている今の状況にぴったりですね。
multiset_permutations()を長さtを取らない形に多重ディスパッチして利用します。

Combinatorics.multiset_permutations(A) = multiset_permutations(A,length(A))

これはfunc4以降でも利用します。

func3(str::String, n::Int) = join(sort(collect(multiset_permutations(split(str, ""))))[n])
@btime func3("GAKKOU", 100)

  82.612 μs (1126 allocations: 134.80 KiB)
"GOKAKU"

func2()に対して2~3倍程度高速化しました。unique()しなくていいのが効いているんでしょうか。

@antimon2 さんのコメントより、func4~func6を追加します。

func4

split(str, "") の代わりに collect(str) にすると結構速くなります（下記 func4()）。
前者は1文字ずつの文字列からなる配列（split(str, "") isa Vector{SubString{String}}）、後者は文字型の配列（collect(str) isa Vector{Char}）となり、後者の要素の型（Char）がprimitiveだから（というかその大小比較がコスト低いから）です。

func4(str::String, n::Int) = join(sort(collect(multiset_permutations(collect(str))))[n])
@btime func4("GAKKOU", 100)

  46.297 μs (1111 allocations: 88.14 KiB)
"GOKAKU"

func5

列挙した全順列の結果を sort() する代わりに、↑の結果をsort()してもOK（下記 func5()）。なぜなら入力がソート済なら Combinatorics.multiset_permutations() は辞書式順序で列挙した結果を返す（すなわちソート済の結果を返す）から。360件（"GAKKOU"の6文字を並べ替えた全順列の個数）をソートするより6件（['G','A','K','K','O','U']）をソートする方が速いはず。

func5(str::String, n::Int) = join(collect(multiset_permutations(sort(collect(str))))[n])
@btime func5("GAKKOU", 100)

  34.491 μs (1110 allocations: 83.84 KiB)
"GOKAKU"

func6

全結果を collect() して 100 番目を取り出すより、先頭の99件を排除（Iterators.drop()）して最初の要素を取得（first()）する方が（今回の場合は）速いです（下記 func6()）。360件全て列挙せず最初の100件だけ列挙することになるので。

func6(str::String, n::Int) = join(first(Iterators.drop(multiset_permutations(sort(collect(str))),n-1)))
@btime func6("GAKKOU", 100)

  10.438 μs (326 allocations: 23.70 KiB)
"GOKAKU"

計測結果

	func1	func2	func3	func4	func5	func6
関数括り有り	623.977 μs	226.151 μs	82.612 μs	46.297 μs	34.491 μs	10.438 μs
関数括り無し	0.043128 s (`@time`)

注：それぞれ多少の上下有り。記載が無ければ@btime。

まとめ

Juliaはまだ初心者なのですが、今回実際に試してみることで実装によって速度が変化するのを体感することが出来ました。

おまけ

順列や組み合わせは対象の数が増えるごとに結果の数がどんどん増えるので高速化・効率化は非常に重要だと思います。

例

julia> @time func3("JULIAPYTHON", 100)
 78.013371 seconds (119.75 M allocations: 19.777 GiB, 32.45% gc time)
"AHIJLNYOTUP"

julia> @time func4("JULIAPYTHON", 100)
 25.645158 seconds (119.75 M allocations: 11.450 GiB, 37.89% gc time)
"AHIJLNYOTUP"

julia> @time func5("JULIAPYTHON", 100)
 14.047115 seconds (119.75 M allocations: 11.004 GiB, 69.54% gc time)
"AHIJLNYOTUP"

julia> @time func6("JULIAPYTHON", 100)
  0.000036 seconds (343 allocations: 31.641 KiB)
"AHIJLNYOTUP"

この場合、Iterators.drop()を用いて全結果の処理を避けているfunc6がめちゃくちゃ優秀ですね。実行時間もメモリ使用量も桁違いです。

謝辞

参考記事

これらの投稿を参考にしました。ありがとうございます。
また、@antimon2 さんから大変勉強になるコメントを頂きました。ありがとうございます。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up