Nim Optimization Flags and Their Effects

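Introduction

I recently read a related article and got curious, so I did a rough investigation of my own. There are differences such as the environment being Windows versus Linux, so please treat these results as a rough reference only.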

Environment

$ cat /proc/cpuinfo| grep 'model name'| sort -nr| uniq
model name  : Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
$ uname -mrv
4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64
$ nim --version
Nim Compiler Version 0.18.0 [Linux: amd64]
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: 855956bf617f68ac0be3717329e9e1181e5dc0c6
active boot switches: -d:release

Below are excerpts from the nim.cfg in use (it is simply the default one) that appear relevant to optimization.

(...)
@if release or quick:
  obj_checks:off
  field_checks:off
  range_checks:off
  bound_checks:off
  overflow_checks:off
  assertions:off
  stacktrace:off
  linetrace:off
  debugger:off
  line_dir:off
  dead_code_elim:on
  @if nimHasNilChecks:
    nilchecks:off
  @end
@end

@if release:
  opt:speed
@end
(...)
gcc.options.speed = "-O3 -fno-strict-aliasing"
gcc.options.size = "-Os"
@if windows:
  gcc.options.debug = "-g3 -O0 -gdwarf-3"
@else:
  gcc.options.debug = "-g3 -O0"
@end
gcc.cpp.options.speed = "-O3 -fno-strict-aliasing"
gcc.cpp.options.size = "-Os"
gcc.cpp.options.debug = "-g3 -O0"

It appears that when -d:release is given without an --opt flag, opt:speed is used.

Code

I have not made any particular changes to the code from that article.

import math, times

const max = 10_000_000

when isMainModule:
  var sins: float64 = 0.0
  var coses: float64 = 0.0
  var num: float64 = 0.0

  let start = epochTime()

  for i in 0..<max:
    num = float64(i)
    sins += sin(num)
    coses += cos(num)

  echo epochTime() - start
  echo sins
  echo coses

Normal build

First, let's measure a build without release.
The code measures elapsed time itself, but I also want to see the user/sys breakdown, so I use the shell's built-in time command.

$ nim c -f --hints:off --verbosity:3 app.nim
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/app.o /home/kubo39/dev/nim/optimizations/nimcache/app.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.c
$ bash -c "time ./app"
0.7602357864379883
1.535343615350497
1.338538979005332

real    0m0.764s
user    0m0.763s
sys 0m0.001s
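
As a complement to the external time command, wall-clock and CPU time can also be compared from inside the program. The following is only a minimal sketch: the `work` proc and its loop count are hypothetical stand-ins for the benchmark loop, and `cpuTime` from the `times` module reports the CPU time consumed by the process, which I assume is roughly comparable to user + sys on Linux.

```nim
import math, times

proc work(n: int): float64 =
  ## Hypothetical stand-in for the benchmark loop: sums sin(i) for i in 0..<n.
  for i in 0..<n:
    result += sin(float64(i))

when isMainModule:
  let wallStart = epochTime()   # wall-clock time in seconds
  let cpuStart = cpuTime()      # CPU time used by this process so far
  let total = work(1_000_000)
  let cpuElapsed = cpuTime() - cpuStart
  let wallElapsed = epochTime() - wallStart
  echo "sum:  ", total          # printing the result keeps the loop from being dropped
  echo "wall: ", wallElapsed, " s"
  echo "cpu:  ", cpuElapsed, " s"
```

If the two values are close, as with real and user in the output above, the program is spending almost all of its time computing rather than waiting.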

Release build (no opt specified)

$ nim c -f -d:release --hints:off --verbosity:3 app.nim
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/app.o /home/kubo39/dev/nim/optimizations/nimcache/app.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.c
$ bash -c "time ./app"
0.3733479976654053
1.535343615350497
1.338538979005332

real    0m0.375s
user    0m0.374s
sys 0m0.000s

As seen in nim.cfg, opt:speed is applied when release is given without an opt setting, so gcc.options.speed = "-O3 -fno-strict-aliasing" is passed to gcc.

Release build (--opt:none)

$ nim c -f -d:release --opt:none --hints:off --verbosity:3 app.nim
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/app.o /home/kubo39/dev/nim/optimizations/nimcache/app.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.c
gcc -c  -w  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.c
$ bash -c "time ./app"
0.5598080158233643
1.535343615350497
1.338538979005332

real    0m0.563s
user    0m0.559s
sys 0m0.004s

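With opt:none, no optimization flags are passed to gcc, and the result is slower than the previous release build. It is still somewhat faster than the build without release. (Taking only a single measurement is not good practice, but since the difference is on the order of 100 msec, I will not worry about it here.)

Looking at nim.cfg again: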

@if release or quick:
  obj_checks:off
  field_checks:off
  range_checks:off
  bound_checks:off
  overflow_checks:off
  assertions:off
  stacktrace:off
  linetrace:off
  debugger:off
  line_dir:off
  dead_code_elim:on
  @if nimHasNilChecks:
    nilchecks:off
  @end
@end

we can see that when the release flag is passed, code for checks such as array bounds checks and overflow checks is not generated, which would explain why even opt:none is faster than a build without release.
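
Rather than inferring the active settings from nim.cfg, they can also be queried from within a program. The sketch below uses `defined` and `compileOption` from the system module. The `buildSummary` helper and the constant names are hypothetical, and the defaults it reports may differ on Nim versions other than the 0.18.0 used here, so treat it as a way to check rather than as a statement of what any given build does.

```nim
# Compile-time facts about the current build. `defined` and `compileOption`
# are evaluated by the compiler, so these are plain constants at run time.
const
  isRelease = defined(release)
  boundChecksOn = compileOption("boundChecks")
  overflowChecksOn = compileOption("overflowChecks")
  assertionsOn = compileOption("assertions")
  optSpeed = compileOption("opt", "speed")
  optSize = compileOption("opt", "size")

proc buildSummary(): string =
  ## Hypothetical helper: formats the settings above, one per line.
  result = "release defined: " & $isRelease & "\n"
  result.add("boundChecks:     " & $boundChecksOn & "\n")
  result.add("overflowChecks:  " & $overflowChecksOn & "\n")
  result.add("assertions:      " & $assertionsOn & "\n")
  result.add("opt:speed:       " & $optSpeed & "\n")
  result.add("opt:size:        " & $optSize)

when isMainModule:
  echo buildSummary()
```

Compiling this once with and once without -d:release (and with the various --opt values) should show which switches each combination actually toggles.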

Release build (--opt:size)

$ nim c -f -d:release --opt:size --hints:off --verbosity:3 app.nim
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/app.o /home/kubo39/dev/nim/optimizations/nimcache/app.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.c
gcc -c  -w -Os  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.c
$ bash -c "time ./app"
0.3892240524291992
1.535343615350497
1.338538979005332

real    0m0.393s
user    0m0.393s
sys 0m0.000s

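The only difference with the opt:size option is that -Os is passed to gcc. -Os is essentially -O2 with the optimizations that tend to increase binary size removed, so a reasonable amount of optimization still applies. The benchmark result appears to reflect this. Compared with opt:speed, the difference is small enough to be treated as measurement noise.

Release build (--opt:speed)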

$ nim c -f -d:release --opt:speed --hints:off --verbosity:3 app.nim
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/app.o /home/kubo39/dev/nim/optimizations/nimcache/app.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_system.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_math.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_times.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_strutils.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_parseutils.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_algorithm.c
gcc -c  -w -O3 -fno-strict-aliasing  -I/home/kubo39/.choosenim/toolchains/nim-0.18.0/lib -o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.o /home/kubo39/dev/nim/optimizations/nimcache/stdlib_posix.c
$ bash -c "time ./app"
0.3850760459899902
1.535343615350497
1.338538979005332

real    0m0.389s
user    0m0.385s
sys 0m0.004s

We can confirm that the options passed to gcc are the same as for release (no opt specified). The benchmark result is also within the margin of error.
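
Since each configuration above was measured only once, differences of a few milliseconds are hard to interpret. One way to reduce the noise is to repeat the loop several times and report the fastest run. This is only a sketch: the `sumSinCos` proc and the `runs` constant are hypothetical names, and the loop body is the same as in the benchmark code.

```nim
import math, times

const
  iterations = 10_000_000  # same loop count as the benchmark above
  runs = 3                 # hypothetical number of repetitions

proc sumSinCos(n: int): tuple[sins, coses: float64] =
  ## Same loop as the benchmark: sums sin(i) and cos(i) for i in 0..<n.
  for i in 0..<n:
    let num = float64(i)
    result.sins += sin(num)
    result.coses += cos(num)

when isMainModule:
  var best = Inf
  for run in 1..runs:
    let start = epochTime()
    let sums = sumSinCos(iterations)
    let elapsed = epochTime() - start
    best = min(best, elapsed)
    # Printing the sums keeps the compiler from discarding the loop.
    echo "run ", run, ": ", elapsed, " s (sins=", sums.sins, ", coses=", sums.coses, ")"
  echo "best of ", runs, " runs: ", best, " s"
```

Taking the minimum rather than the mean is a deliberate choice here: interference from other processes can only make a run slower, so the fastest run is usually the least disturbed one.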

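Conclusion

This is only my personal opinion, but:

  • To optimize for speed: pass -d:release (there is no need to specify --opt:speed yourself)
  • To optimize for size: pass -d:release --opt:size (note that when multiple --opt flags are given, the last one wins)
  • During development: do not pass release-related flags, since overflow checks and array bounds checks are useful

That is my view.

Also, unrelated to optimization: passing --debuginfo at compile time adds debug information appropriate to the OS (DWARF on Linux), so it is worth adding if you want to debug with gdb, for example.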