I built a variant of std::mersenne_twister_engine that takes the internal processing method as an additional template argument.
Types corresponding to std::mt19937 and std::mt19937_64 are defined in namespace RngMTE.

| std::mt19937 | std::mt19937_64 |
| --- | --- |
| RngMT19937SeqB32 | RngMT19937SeqB64 |
| RngMT19937SeqC32 | RngMT19937SeqC64 |
| RngMT19937SeqBP32 | RngMT19937SeqBP64 |
| RngMT19937SeqCP32 | RngMT19937SeqCP64 |
| RngMT19937SeqSB32 | RngMT19937SeqSB64 |
| RngMT19937SeqSC32 | RngMT19937SeqSC64 |
| RngMT19937BlkB32 | RngMT19937BlkB64 |
| RngMT19937BlkC32 | RngMT19937BlkC64 |
| RngMT19937BlkBV32 | RngMT19937BlkBV64 |
| RngMT19937BlkCV32 | RngMT19937BlkCV64 |
| RngMT19937BlkBT32 | RngMT19937BlkBT64 |
| RngMT19937BlkCT32 | RngMT19937BlkCT64 |
| RngMT19937BlkSB32 | RngMT19937BlkSB64 |
| RngMT19937BlkSC32 | RngMT19937BlkSC64 |
| RngMT19937BlkSBV32 | RngMT19937BlkSBV64 |
| RngMT19937BlkSCV32 | RngMT19937BlkSCV64 |
| RngMT19937BlkSBT32 | RngMT19937BlkSBT64 |
| RngMT19937BlkSCT32 | RngMT19937BlkSCT64 |
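
For reference, here is a minimal sketch of what "corresponding to std::mt19937" could be checked against. It spells out std::mt19937 as a std::mersenne_twister_engine instantiation with the standard MT19937 parameters and compares output sequences. The alias `Mt32` and the helpers `nth_output` and `same_sequence` are hypothetical names introduced for illustration. Whether the RngMTE types expose the same constructor, `operator()` and `discard` interface is an assumption, so only standard-library engines are instantiated here.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// std::mt19937 written out as std::mersenne_twister_engine with the
// MT19937 parameters (w, n, m, r, a, u, d, s, b, t, c, l, f).
using Mt32 = std::mersenne_twister_engine<
    std::uint_fast32_t, 32, 624, 397, 31,
    0x9908b0df, 11, 0xffffffff, 7,
    0x9d2c5680, 15, 0xefc60000, 18, 1812433253>;

// Returns the n-th output (1-based) of a default-constructed engine.
template <class Engine>
typename Engine::result_type nth_output(unsigned long long n) {
    assert(n >= 1);
    Engine engine;          // default seed
    engine.discard(n - 1);  // skip the first n-1 outputs
    return engine();
}

// Hypothetical helper: true if Engine reproduces the first `count`
// outputs of Reference when both are seeded with `seed`.
template <class Engine, class Reference = std::mt19937>
bool same_sequence(typename Reference::result_type seed, int count) {
    Reference reference(seed);
    Engine engine(seed);
    for (int i = 0; i < count; ++i) {
        if (static_cast<std::uint64_t>(reference()) !=
            static_cast<std::uint64_t>(engine())) {
            return false;
        }
    }
    return true;
}
```

The C++ standard requires the 10000th output of a default-constructed std::mt19937 to be 4123659995, so `nth_output<std::mt19937>(10000)` gives a quick sanity check for any engine that claims to be a drop-in replacement.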
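
Execution speed measurement

Each trial generated 100,000 random numbers with the 32-bit version. The trial was repeated 1,000 times and the fastest result was taken. A trial was timed with RDTSC (cycle count) on Intel processors and with clock_gettime (nanoseconds) on other CPUs, and the result was divided by 100,000, so the figures are cycles or nanoseconds per generated number.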
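
The measurement procedure described above can be sketched roughly as follows. This is not the benchmark code used for the tables: it swaps in std::chrono::steady_clock for RDTSC and clock_gettime to stay portable, so its numbers are in nanoseconds and are not directly comparable with the cycle counts reported for Intel processors. The helper name `best_ns_per_number` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: runs `trials` timed trials of `count` generations
// each and returns the fastest trial as nanoseconds per generated number.
template <class Engine>
double best_ns_per_number(int trials, int count) {
    assert(trials > 0 && count > 0);
    double best = std::numeric_limits<double>::infinity();
    std::uint64_t sink = 0;  // accumulates outputs so the loop has an effect
    for (int t = 0; t < trials; ++t) {
        Engine rng;  // constructed outside the timed region
        const auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < count; ++i) {
            sink += rng();
        }
        const auto stop = std::chrono::steady_clock::now();
        const double ns =
            std::chrono::duration<double, std::nano>(stop - start).count();
        best = std::min(best, ns / count);  // keep the fastest trial
    }
    // Store the accumulated value through a volatile object so the
    // compiler cannot discard the generation loop.
    volatile std::uint64_t keep = sink;
    (void)keep;
    return best;
}
```

Calling `best_ns_per_number<std::mt19937>(1000, 100000)` mirrors the trial and iteration counts given above. Taking the minimum over many trials is a common way to filter out interference from interrupts and scheduling.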

The test environments were as follows.

| CPU | OS |
| --- | --- |
| Core i3-13100 | Ubuntu 22.04 |
| Core i7-8700B | Ubuntu 24.04, macOS 15.4, Windows 11 |
| Apple M4 | Ubuntu 24.04 |
| Arm Cortex A76 | Raspberry Pi OS (Debian 12) |

| Compiler | Optimization options |
| --- | --- |
| Clang | -O3 -march=native |
| GNU C/C++ | -O3 -march=native (plus -mtune=intel on Intel processors) |
| Microsoft C/C++ | /O2 /Ob2 /Oi /Ot /Oy- /arch:AVX2 |

In the tables, std::mt19937 is abbreviated as std, and RngMT19937{Type}32 as {Type}.

CPU: Core i3-13100

Compiler: Clang

| Version | 11.1 | 12.0 | 13.0 | 14.0 | 15.0 |
| --- | --- | --- | --- | --- | --- |
| std | 5.331 | 5.350 | 5.313 | 5.304 | 5.323 |
| SeqB | 5.523 | 5.508 | 5.475 | 5.470 | 5.434 |
| SeqC | 5.743 | 5.743 | 5.728 | 5.728 | 5.716 |
| SeqBP | 5.915 | 5.915 | 5.572 | 5.575 | 5.572 |
| SeqCP | 6.102 | 6.102 | 5.720 | 5.721 | 5.720 |
| SeqSB | 5.500 | 5.236 | 5.185 | 5.411 | 5.549 |
| SeqSC | 5.462 | 5.595 | 5.373 | 5.377 | 5.484 |
| BlkB | 5.308 | 5.256 | 5.008 | 5.008 | 4.926 |
| BlkC | 5.255 | 5.382 | 5.010 | 5.010 | 4.928 |
| BlkBV | 6.087 | 6.099 | 5.544 | 5.561 | 4.305 |
| BlkCV | 6.086 | 6.102 | 5.552 | 5.544 | 4.258 |
| BlkBT | 5.133 | 5.110 | 5.130 | 5.050 | 4.546 |
| BlkCT | 5.115 | 5.155 | 5.112 | 5.103 | 4.531 |
| BlkSB | 5.446 | 5.581 | 5.125 | 5.144 | 4.768 |
| BlkSC | 5.436 | 5.548 | 5.129 | 5.125 | 4.779 |
| BlkSBV | 5.482 | 5.547 | 5.335 | 5.351 | 4.950 |
| BlkSCV | 5.450 | 5.570 | 5.375 | 5.333 | 4.964 |
| BlkSBT | 5.088 | 5.002 | 5.062 | 4.996 | 5.024 |
| BlkSCT | 5.051 | 5.000 | 5.079 | 5.055 | 4.993 |

Compiler: GNU C/C++

| Version | 9.5 | 10.5 | 11.4 | 12.3 |
| --- | --- | --- | --- | --- |
| std | 5.077 | 5.203 | 5.191 | 5.053 |
| SeqB | 6.330 | 5.569 | 5.576 | 5.533 |
| SeqC | 6.125 | 5.744 | 6.122 | 6.118 |
| SeqBP | 6.915 | 6.361 | 6.592 | 6.565 |
| SeqCP | 6.880 | 6.522 | 6.883 | 6.879 |
| SeqSB | 7.863 | 6.033 | 6.026 | 7.451 |
| SeqSC | 8.010 | 5.626 | 5.495 | 7.564 |
| BlkB | 5.720 | 5.809 | 6.169 | 6.131 |
| BlkC | 5.792 | 5.792 | 6.165 | 6.140 |
| BlkBV | 3.262 | 3.260 | 3.504 | 3.536 |
| BlkCV | 3.258 | 3.314 | 3.553 | 3.543 |
| BlkBT | 2.906 | 2.941 | 3.200 | 3.416 |
| BlkCT | 2.943 | 2.893 | 3.200 | 3.311 |
| BlkSB | 5.187 | 5.411 | 5.421 | 5.854 |
| BlkSC | 5.167 | 5.400 | 5.425 | 5.852 |
| BlkSBV | 5.075 | 5.258 | 5.309 | 5.280 |
| BlkSCV | 5.059 | 5.295 | 5.201 | 5.356 |
| BlkSBT | 5.247 | 5.288 | 5.305 | 5.240 |
| BlkSCT | 5.282 | 5.303 | 5.291 | 5.307 |

CPU: Core i7-8700B

Compiler: Clang

| Version | 17.0 (*) | 14.0 | 15.0 | 16.0 | 17.0 | 18.1 | 19.1 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| std | 14.714 | 9.301 | 9.604 | 8.593 | 9.124 | 8.795 | 8.107 |
| SeqB | 9.925 | 7.738 | 7.714 | 9.386 | 8.710 | 7.713 | 10.000 |
| SeqC | 8.934 | 8.458 | 8.462 | 8.455 | 8.394 | 8.394 | 8.397 |
| SeqBP | 10.333 | 10.197 | 10.196 | 9.454 | 8.455 | 10.159 | 10.736 |
| SeqCP | 9.680 | 8.951 | 8.969 | 8.951 | 8.951 | 8.951 | 8.950 |
| SeqSB | 8.866 | 8.934 | 8.026 | 8.691 | 8.658 | 10.562 | 10.898 |
| SeqSC | 7.461 | 8.215 | 8.961 | 8.697 | 8.962 | 11.237 | 11.259 |
| BlkB | 5.803 | 7.181 | 7.169 | 7.168 | 7.199 | 6.500 | 6.484 |
| BlkC | 5.807 | 7.161 | 7.190 | 7.307 | 7.161 | 6.454 | 6.458 |
| BlkBV | 5.496 | 7.601 | 7.455 | 6.703 | 7.447 | 6.039 | 6.043 |
| BlkCV | 5.501 | 7.588 | 8.177 | 6.714 | 5.923 | 5.355 | 6.694 |
| BlkBT | 5.845 | 8.374 | 7.371 | 7.180 | 6.685 | 6.496 | 6.504 |
| BlkCT | 5.852 | 8.293 | 7.744 | 7.202 | 6.721 | 7.225 | 5.828 |
| BlkSB | 6.134 | 7.546 | 7.624 | 7.211 | 7.612 | 7.629 | 7.328 |
| BlkSC | 6.124 | 7.651 | 7.212 | 7.215 | 7.614 | 7.632 | 7.020 |
| BlkSBV | 6.116 | 7.509 | 7.179 | 7.937 | 7.437 | 8.131 | 7.817 |
| BlkSCV | 6.121 | 7.553 | 8.682 | 7.926 | 8.610 | 7.452 | 6.849 |
| BlkSBT | 4.928 | 8.159 | 8.144 | 8.138 | 7.638 | 7.393 | 7.391 |
| BlkSCT | 4.950 | 7.752 | 8.194 | 8.141 | 7.765 | 8.126 | 8.127 |

(*) Apple clang version 17.0.0 (clang-1700.0.13.3) from Xcode on macOS.
The others were installed with apt on Ubuntu 24.04.

Compiler: GNU C/C++

| Version | 14.2 (*) | 9.5 | 10.5 | 11.4 | 12.3 | 13.3 | 14.2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| std | 6.099 | 8.068 | 8.071 | 7.087 | 7.074 | 6.791 | 7.776 |
| SeqB | 9.662 | 9.100 | 10.468 | 10.199 | 8.210 | 10.195 | 10.202 |
| SeqC | 9.696 | 11.931 | 8.563 | 11.188 | 8.955 | 8.589 | 8.588 |
| SeqBP | 8.723 | 10.268 | 9.240 | 9.466 | 11.426 | 9.468 | 11.425 |
| SeqCP | 9.040 | 9.717 | 9.702 | 9.943 | 11.190 | 9.744 | 11.189 |
| SeqSB | 7.675 | 11.408 | 10.406 | 10.507 | 12.300 | 12.296 | 11.995 |
| SeqSC | 7.175 | 11.613 | 9.062 | 8.577 | 12.028 | 11.335 | 10.161 |
| BlkB | 8.530 | 8.198 | 8.596 | 9.298 | 8.842 | 8.894 | 8.538 |
| BlkC | 8.560 | 8.197 | 8.563 | 9.306 | 8.823 | 8.629 | 8.478 |
| BlkBV | 4.699 | 5.623 | 5.604 | 5.618 | 5.705 | 5.655 | 5.786 |
| BlkCV | 4.640 | 5.606 | 5.602 | 5.727 | 5.134 | 5.611 | 5.778 |
| BlkBT | 4.077 | 5.477 | 5.464 | 5.455 | 5.621 | 5.495 | 5.551 |
| BlkCT | 4.725 | 5.473 | 5.454 | 5.463 | 5.604 | 5.496 | 6.280 |
| BlkSB | 7.347 | 7.789 | 7.787 | 7.411 | 9.282 | 7.411 | 8.347 |
| BlkSC | 6.346 | 7.380 | 7.404 | 7.786 | 9.282 | 7.414 | 8.347 |
| BlkSBV | 5.672 | 8.051 | 7.821 | 7.761 | 7.058 | 7.184 | 8.330 |
| BlkSCV | 6.035 | 7.291 | 7.795 | 7.812 | 7.817 | 8.562 | 8.331 |
| BlkSBT | 5.554 | 6.897 | 8.288 | 8.285 | 7.558 | 7.638 | 7.344 |
| BlkSCT | 5.548 | 8.303 | 8.294 | 8.283 | 7.647 | 7.547 | 7.575 |

(*) Cloned from the repository and built on macOS.
The others were installed with apt on Ubuntu 24.04.

Compiler: Microsoft C/C++

| Version | 19.43 |
| --- | --- |
| std | 11.064 |
| SeqB | 9.424 |
| SeqC | 12.032 |
| SeqBP | 10.445 |
| SeqCP | 11.506 |
| SeqSB | 7.145 |
| SeqSC | 6.182 |
| BlkB | 10.726 |
| BlkC | 9.363 |
| BlkBV | 13.111 |
| BlkCV | 11.824 |
| BlkBT | 14.377 |
| BlkCT | 11.827 |
| BlkSB | 8.269 |
| BlkSC | 7.905 |
| BlkSBV | 8.669 |
| BlkSCV | 7.975 |
| BlkSBT | 7.851 |
| BlkSCT | 7.086 |

CPU: Apple M4

Compiler: Clang

| Version | 14.0 | 15.0 | 16.0 | 17.0 | 18.1 | 19.1 |
| --- | --- | --- | --- | --- | --- | --- |
| std | 1.413 | 1.412 | 1.634 | 1.636 | 1.413 | 1.413 |
| SeqB | 1.126 | 1.151 | 1.231 | 1.178 | 1.160 | 1.222 |
| SeqC | 1.138 | 1.151 | 1.145 | 1.182 | 1.152 | 1.061 |
| SeqBP | 1.135 | 1.202 | 1.202 | 1.244 | 1.245 | 1.245 |
| SeqCP | 1.296 | 1.376 | 1.187 | 1.251 | 1.242 | 1.408 |
| SeqSB | 1.398 | 1.552 | 1.434 | 1.414 | 1.636 | 1.666 |
| SeqSC | 1.360 | 1.550 | 1.582 | 1.450 | 1.616 | 1.617 |
| BlkB | 1.222 | 1.192 | 1.198 | 1.216 | 1.241 | 1.290 |
| BlkC | 1.216 | 1.207 | 1.195 | 1.202 | 1.244 | 1.255 |
| BlkBV | 1.685 | 1.556 | 1.559 | 1.569 | 1.353 | 1.389 |
| BlkCV | 1.678 | 1.562 | 1.565 | 1.563 | 1.350 | 1.388 |
| BlkBT | 1.424 | 1.342 | 1.362 | 1.388 | 1.315 | 1.357 |
| BlkCT | 1.410 | 1.332 | 1.355 | 1.360 | 1.308 | 1.344 |
| BlkSB | 1.281 | 1.288 | 1.281 | 1.307 | 1.310 | 1.323 |
| BlkSC | 1.504 | 1.286 | 1.281 | 1.282 | 1.314 | 1.331 |
| BlkSBV | 1.425 | 1.432 | 1.425 | 1.425 | 1.427 | 1.467 |
| BlkSCV | 1.425 | 1.425 | 1.425 | 1.425 | 1.426 | 1.467 |
| BlkSBT | 1.440 | 1.442 | 1.434 | 1.441 | 1.442 | 1.425 |
| BlkSCT | 1.440 | 1.434 | 1.441 | 1.434 | 1.442 | 1.426 |

Compiler: GNU C/C++

| Version | 9.5 | 10.5 | 11.4 | 12.3 | 13.3 | 14.2 |
| --- | --- | --- | --- | --- | --- | --- |
| std | 1.165 | 1.169 | 1.165 | 1.168 | 1.168 | 1.282 |
| SeqB | 1.338 | 1.183 | 1.182 | 1.182 | 1.182 | 1.182 |
| SeqC | 1.242 | 1.357 | 1.178 | 1.418 | 1.425 | 1.486 |
| SeqBP | 1.339 | 1.337 | 1.296 | 1.317 | 1.337 | 1.309 |
| SeqCP | 1.244 | 1.242 | 1.242 | 1.445 | 1.487 | 1.434 |
| SeqSB | 2.291 | 1.670 | 1.655 | 1.603 | 1.796 | 1.616 |
| SeqSC | 1.458 | 1.778 | 1.537 | 1.568 | 1.524 | 1.457 |
| BlkB | 1.262 | 1.260 | 1.214 | 1.255 | 1.312 | 1.214 |
| BlkC | 1.243 | 1.246 | 1.212 | 1.256 | 1.260 | 1.210 |
| BlkBV | 1.352 | 1.357 | 1.319 | 1.352 | 1.391 | 1.330 |
| BlkCV | 1.352 | 1.352 | 1.319 | 1.349 | 1.396 | 1.319 |
| BlkBT | 1.264 | 1.279 | 1.276 | 1.265 | 1.307 | 1.263 |
| BlkCT | 1.272 | 1.297 | 1.260 | 1.262 | 1.328 | 1.257 |
| BlkSB | 1.226 | 1.224 | 1.182 | 1.206 | 1.187 | 1.207 |
| BlkSC | 1.235 | 1.185 | 1.183 | 1.269 | 1.229 | 1.207 |
| BlkSBV | 1.378 | 1.347 | 1.317 | 1.453 | 1.390 | 1.345 |
| BlkSCV | 1.380 | 1.347 | 1.324 | 1.453 | 1.392 | 1.345 |
| BlkSBT | 1.420 | 1.405 | 1.371 | 1.415 | 1.419 | 1.423 |
| BlkSCT | 1.441 | 1.459 | 1.431 | 1.414 | 1.457 | 1.443 |

CPU: Arm Cortex A76

Compiler: Clang

| Version | 13.0 | 14.0 | 15.0 | 16.0 | 19.1 |
| --- | --- | --- | --- | --- | --- |
| std | 4.422 | 4.363 | 4.316 | 4.318 | 4.169 |
| SeqB | 4.992 | 5.350 | 5.104 | 4.946 | 4.608 |
| SeqC | 5.493 | 5.592 | 5.188 | 5.019 | 4.646 |
| SeqBP | 5.323 | 5.471 | 5.589 | 5.485 | 5.062 |
| SeqCP | 5.704 | 5.659 | 5.642 | 5.557 | 5.076 |
| SeqSB | 6.572 | 6.468 | 6.542 | 6.552 | 8.416 |
| SeqSC | 6.494 | 6.379 | 6.589 | 6.454 | 8.313 |
| BlkB | 4.672 | 4.636 | 4.473 | 4.473 | 4.229 |
| BlkC | 4.665 | 4.633 | 4.471 | 4.471 | 4.232 |
| BlkBV | 6.309 | 5.904 | 5.610 | 5.351 | 5.499 |
| BlkCV | 5.922 | 5.852 | 5.578 | 5.393 | 5.487 |
| BlkBT | 6.641 | 6.722 | 6.031 | 6.029 | 5.562 |
| BlkCT | 6.695 | 6.729 | 6.122 | 6.091 | 5.617 |
| BlkSB | 4.943 | 5.022 | 5.011 | 5.011 | 4.670 |
| BlkSC | 4.979 | 5.021 | 5.011 | 5.012 | 4.670 |
| BlkSBV | 6.102 | 6.109 | 6.147 | 6.146 | 6.093 |
| BlkSCV | 5.917 | 6.147 | 6.148 | 6.148 | 6.034 |
| BlkSBT | 6.147 | 6.148 | 6.084 | 6.148 | 5.860 |
| BlkSCT | 6.194 | 6.238 | 6.148 | 6.150 | 5.658 |

Compiler: GNU C/C++

| Version | 11.3 | 12.2 |
| --- | --- | --- |
| std | 4.647 | 4.757 |
| SeqB | 4.328 | 4.328 |
| SeqC | 4.627 | 4.503 |
| SeqBP | 5.060 | 5.064 |
| SeqCP | 5.230 | 5.137 |
| SeqSB | 8.098 | 8.228 |
| SeqSC | 5.975 | 8.309 |
| BlkB | 4.227 | 4.218 |
| BlkC | 4.213 | 4.217 |
| BlkBV | 5.329 | 5.375 |
| BlkCV | 5.369 | 5.317 |
| BlkBT | 5.465 | 5.462 |
| BlkCT | 5.464 | 5.464 |
| BlkSB | 4.072 | 4.841 |
| BlkSC | 4.075 | 4.841 |
| BlkSBV | 5.313 | 6.319 |
| BlkSCV | 5.383 | 6.319 |
| BlkSBT | 6.421 | 6.500 |
| BlkSCT | 6.422 | 6.523 |
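
For a program on the scale of the Mersenne Twister, if fast generation is required, it is probably better to write target-specific code tailored to the purpose.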