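Lately I have been doing more generic programming, and I often find myself wanting to write type-specific optimized implementations, the way C++ template specialization allows. C# generics let you implement dedicated handling for particular types by making full use of type checks, but there are so many ways to do it that I was never sure which one is the sound approach.
If you do not know, the only option is to measure, so I tried a number of approaches.
What is being tested
For now, let's specialize addition.
Consider a class like the following, which holds a single value of type T.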
public partial class Container<T>
{
public T Value { get; }
public Container(T value) => Value = value;
}
Suppose we want addition to be available on two instances of this class whenever T is a type that supports addition.
partial class Container<T>
{
public static Container<T> Add(Container<T> lhs, Container<T> rhs)
=> throw new NotImplementedException();
}
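This time we look for the approach that makes this Add method run fastest.
Note that lhs and rhs should really be guarded against null, but since this is a benchmark, the checks are omitted.
As for T, we consider primitive types, structs, and sealed, non-null classes.
Classes that are inheritable or nullable make type checks far more cumbersome to handle, so they are out of scope here.
For primitive types, we try int and double for now.
For structs and classes, we use the types below. Each is a simple wrapper around a primitive type.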
public struct IntStruct
{
public readonly int Value;
public IntStruct(int value) => Value = value;
public static IntStruct operator+(IntStruct lhs, IntStruct rhs)
=> new IntStruct(lhs.Value + rhs.Value);
}
public struct DoubleStruct
{
public readonly double Value;
public DoubleStruct(double value) => Value = value;
public static DoubleStruct operator +(DoubleStruct lhs, DoubleStruct rhs)
=> new DoubleStruct(lhs.Value + rhs.Value);
}
public sealed class IntClass
{
public readonly int Value;
public IntClass(int value) => Value = value;
public static IntClass operator +(IntClass lhs, IntClass rhs)
=> new IntClass(lhs.Value + rhs.Value);
}
public sealed class DoubleClass
{
public readonly double Value;
public DoubleClass(double value) => Value = value;
public static DoubleClass operator +(DoubleClass lhs, DoubleClass rhs)
=> new DoubleClass(lhs.Value + rhs.Value);
}
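We benchmark these six types of T, grouped into three cases (primitive, struct, class).
Specialization techniques
1. Generic static Strategy
In C#, closed types with different type arguments are treated as entirely distinct types (see footnote 1). In other words, the static members of a generic class get a separate instance for each distinct type argument.
Holding a Strategy pattern implementation in a static field therefore gives an easy way to specialize.
This approach will be familiar from the .NET standard library, for example Comparer<T>.Default.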
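The point about per-closed-type statics can be checked with a tiny sketch. The names PerTypeSlot and PerTypeSlotDemo are hypothetical and exist only for this illustration; they are not part of the benchmark code.

```csharp
using System;

// Hypothetical demo type: every closed type (PerTypeSlot<int>, PerTypeSlot<double>, ...)
// gets its own independent copy of the static field.
public static class PerTypeSlot<T>
{
    public static string Label = "(unset)";
}

public static class PerTypeSlotDemo
{
    public static string Run()
    {
        PerTypeSlot<int>.Label = "int slot";
        PerTypeSlot<double>.Label = "double slot";
        // PerTypeSlot<string>.Label is never assigned, so it keeps its initializer value.
        return PerTypeSlot<int>.Label + ", "
             + PerTypeSlot<double>.Label + ", "
             + PerTypeSlot<string>.Label;
    }
}
```

Calling PerTypeSlotDemo.Run() yields "int slot, double slot, (unset)", which is exactly the property the static Strategy relies on: writing to one closed type's static does not touch another's.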
First, prepare a Strategy like the one below. For each type that should be specialized, make sure Default is properly initialized.
public interface IArithmetic<T>
{
T Add(T lhs, T rhs);
}
public class Arithmetic
: IArithmetic<int>
, IArithmetic<double>
, IArithmetic<IntStruct>
, IArithmetic<DoubleStruct>
, IArithmetic<IntClass>
, IArithmetic<DoubleClass>
{
static Arithmetic()
{
var instance = new Arithmetic();
Arithmetic<int> .Default = instance;
Arithmetic<double> .Default = instance;
Arithmetic<IntStruct> .Default = instance;
Arithmetic<DoubleStruct>.Default = instance;
Arithmetic<IntClass> .Default = instance;
Arithmetic<DoubleClass> .Default = instance;
}
internal static void Initialize() {}
public int Add(int lhs, int rhs) => lhs + rhs;
public double Add(double lhs, double rhs) => lhs + rhs;
public IntStruct Add(IntStruct lhs, IntStruct rhs) => lhs + rhs;
public DoubleStruct Add(DoubleStruct lhs, DoubleStruct rhs) => lhs + rhs;
public IntClass Add(IntClass lhs, IntClass rhs) => lhs + rhs;
public DoubleClass Add(DoubleClass lhs, DoubleClass rhs) => lhs + rhs;
}
public static class Arithmetic<T>
{
public static IArithmetic<T> Default { get; internal set; }
static Arithmetic() => Arithmetic.Initialize();
}
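This relies on a slightly sneaky trick, a dummy member combined with static constructors, to make sure initialization happens exactly once, but the intent should be reasonably clear.
The Container<T> type simply uses it. In terms of design-pattern elegance, this is quite a nice approach.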
public partial class Container<T>
{
public static Container<T> AddByStaticStrategy(Container<T> lhs, Container<T> rhs)
=> new Container<T>(Arithmetic<T>.Default.Add(lhs.Value, rhs.Value));
}
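As a condensed sketch of the same pattern, the block below registers only int and double and adds an IsSupported property to show what happens for an unregistered type. The names IAdder, Adders, Adder and IsSupported are hypothetical stand-ins chosen for this illustration, and IsSupported is an addition that the code above does not have.

```csharp
using System;

// Hypothetical names, used only for this sketch.
public interface IAdder<T>
{
    T Add(T lhs, T rhs);
}

public sealed class Adders : IAdder<int>, IAdder<double>
{
    static Adders()
    {
        // One shared instance serves every registered closed type.
        var instance = new Adders();
        Adder<int>.Default = instance;
        Adder<double>.Default = instance;
    }

    // Empty on purpose: calling it forces the static constructor above to run once.
    internal static void EnsureInitialized() { }

    public int Add(int lhs, int rhs) => lhs + rhs;
    public double Add(double lhs, double rhs) => lhs + rhs;
}

public static class Adder<T>
{
    public static IAdder<T> Default { get; internal set; }

    // Touching any Adder<T> first makes sure the registrations have happened.
    static Adder() => Adders.EnsureInitialized();

    // True only for the types registered in the Adders static constructor.
    public static bool IsSupported => Default != null;
}
```

With this sketch, Adder<int>.Default.Add(1, 2) returns 3, while Adder<string>.IsSupported is false because nothing was registered for string. In the article's version the equivalent situation leaves Default as null, so calling Add on an unregistered type would fail with a NullReferenceException.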
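2. Type switch on the whole container
Type switches arrived in C# 7, which made branching on type much easier to write.
So a straightforward type-branching implementation like the following comes to mind.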
public partial class Container<T>
{
public static Container<T> AddByContainerTypeSwitch(Container<T> lhs, Container<T> rhs)
{
switch(lhs)
{
case Container<int> intL:
{
var r = rhs as Container<int>;
return new Container<int>(intL.Value + r.Value) as Container<T>;
}
// ......
}
throw new Exception();
}
}
Full implementation:
public partial class Container<T>
{
public static Container<T> AddByContainerTypeSwitch(Container<T> lhs, Container<T> rhs)
{
switch(lhs)
{
case Container<int> intL:
{
var r = rhs as Container<int>;
return new Container<int>(intL.Value + r.Value) as Container<T>;
}
case Container<double> doubleL:
{
var r = rhs as Container<double>;
return new Container<double>(doubleL.Value + r.Value) as Container<T>;
}
case Container<IntStruct> intStructL:
{
var r = rhs as Container<IntStruct>;
return new Container<IntStruct>(intStructL.Value + r.Value) as Container<T>;
}
case Container<DoubleStruct> doubleStructL:
{
var r = rhs as Container<DoubleStruct>;
return new Container<DoubleStruct>(doubleStructL.Value + r.Value) as Container<T>;
}
case Container<IntClass> intClassL:
{
var r = rhs as Container<IntClass>;
return new Container<IntClass>(intClassL.Value + r.Value) as Container<T>;
}
case Container<DoubleClass> doubleClassL:
{
var r = rhs as Container<DoubleClass>;
return new Container<DoubleClass>(doubleClassL.Value + r.Value) as Container<T>;
}
}
throw new Exception();
}
}
Note that this approach can no longer be written straightforwardly once T is allowed to be an inheritable class.
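The reason is that generic classes are invariant: a container of a derived type is not a container of the base type, so a case written for the base never matches. A minimal sketch, using hypothetical types Box, Animal and Dog that are not part of the benchmark:

```csharp
using System;

// Hypothetical stand-in for Container<T>.
public class Box<T>
{
    public T Value { get; }
    public Box(T value) => Value = value;
}

public class Animal { }
public sealed class Dog : Animal { }

public static class InvarianceDemo
{
    // Mirrors the container-level type test: only an exact Box<Animal> matches.
    public static bool MatchesAnimalBox(object box) => box is Box<Animal>;
}
```

Here InvarianceDemo.MatchesAnimalBox(new Box<Animal>(new Dog())) is true, but InvarianceDemo.MatchesAnimalBox(new Box<Dog>(new Dog())) is false, even though Dog derives from Animal. Supporting inheritable T would therefore need a case per concrete container type or some other mechanism.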
3. Type switch on the value
Instead of the whole container, the type switch can be applied to the contents only.
public partial class Container<T>
{
public static Container<T> AddByValueTypeSwitch(Container<T> lhs, Container<T> rhs)
{
switch(lhs.Value)
{
case int intL:
{
if(rhs.Value is int r)
return new Container<int>(intL + r) as Container<T>;
break;
}
// ......
}
throw new Exception();
}
}
Full implementation:
public partial class Container<T>
{
public static Container<T> AddByValueTypeSwitch(Container<T> lhs, Container<T> rhs)
{
switch(lhs.Value)
{
case int intL:
{
if(rhs.Value is int r)
return new Container<int>(intL + r) as Container<T>;
break;
}
case double doubleL:
{
if(rhs.Value is double r)
return new Container<double>(doubleL + r) as Container<T>;
break;
}
case IntStruct intStructL:
{
if(rhs.Value is IntStruct r)
return new Container<IntStruct>(intStructL + r) as Container<T>;
break;
}
case DoubleStruct doubleStructL:
{
if(rhs.Value is DoubleStruct r)
return new Container<DoubleStruct>(doubleStructL + r) as Container<T>;
break;
}
case IntClass intClassL:
{
if(rhs.Value is IntClass r)
return new Container<IntClass>(intClassL + r) as Container<T>;
break;
}
case DoubleClass doubleClassL:
{
if(rhs.Value is DoubleClass r)
return new Container<DoubleClass>(doubleClassL + r) as Container<T>;
break;
}
}
throw new Exception();
}
}
Unlike approach 2, this one can handle types derived from T, but in exchange it does not work correctly when a null value comes in.
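The null problem comes from how type patterns work: a case with a type pattern never matches a null reference, whatever the static type of the variable is. A small sketch, with a hypothetical Wrapped class standing in for IntClass:

```csharp
using System;

// Hypothetical stand-in for a sealed wrapper class such as IntClass.
public sealed class Wrapped
{
    public readonly int Value;
    public Wrapped(int value) => Value = value;
}

public static class NullPatternDemo
{
    public static string Describe(Wrapped value)
    {
        switch (value)
        {
            // A type pattern only matches non-null instances.
            case Wrapped w:
                return "Wrapped:" + w.Value;
            // A null reference falls through to here even though its static type is Wrapped.
            default:
                return "no match";
        }
    }
}
```

NullPatternDemo.Describe(new Wrapped(3)) returns "Wrapped:3", whereas NullPatternDemo.Describe(null) returns "no match". In the value-switch implementation the same thing happens: a null Value skips every case and ends up at the final throw.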
4. Plain comparison with typeof
In C#, obtaining metadata through reflection is very easy (see footnote 2).
Naturally, comparing types with typeof is another option.
public partial class Container<T>
{
public static Container<T> AddByTypeOf(Container<T> lhs, Container<T> rhs)
{
if(typeof(T) == typeof(int))
{
var l = lhs as Container<int>;
var r = rhs as Container<int>;
return new Container<int>(l.Value + r.Value) as Container<T>;
}
// ......
throw new Exception();
}
}
Full implementation:
public partial class Container<T>
{
public static Container<T> AddByTypeOf(Container<T> lhs, Container<T> rhs)
{
if(typeof(T) == typeof(int))
{
var l = lhs as Container<int>;
var r = rhs as Container<int>;
return new Container<int>(l.Value + r.Value) as Container<T>;
}
if(typeof(T) == typeof(double))
{
var l = lhs as Container<double>;
var r = rhs as Container<double>;
return new Container<double>(l.Value + r.Value) as Container<T>;
}
if(typeof(T) == typeof(IntStruct))
{
var l = lhs as Container<IntStruct>;
var r = rhs as Container<IntStruct>;
return new Container<IntStruct>(l.Value + r.Value) as Container<T>;
}
if(typeof(T) == typeof(DoubleStruct))
{
var l = lhs as Container<DoubleStruct>;
var r = rhs as Container<DoubleStruct>;
return new Container<DoubleStruct>(l.Value + r.Value) as Container<T>;
}
if(typeof(T) == typeof(IntClass))
{
var l = lhs as Container<IntClass>;
var r = rhs as Container<IntClass>;
return new Container<IntClass>(l.Value + r.Value) as Container<T>;
}
if(typeof(T) == typeof(DoubleClass))
{
var l = lhs as Container<DoubleClass>;
var r = rhs as Container<DoubleClass>;
return new Container<DoubleClass>(l.Value + r.Value) as Container<T>;
}
throw new Exception();
}
}
This approach also has trouble supporting types derived from T.
It is not impossible, but I suspect it would be very costly in terms of performance.
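To make the difference concrete, here is a sketch contrasting the exact typeof comparison with one way of accepting derived types, Type.IsAssignableFrom. The types Shape and Circle and the helper names are hypothetical, and IsAssignableFrom is only one possible choice; it was not measured in this article.

```csharp
using System;

public class Shape { }
public sealed class Circle : Shape { }

public static class TypeOfDemo
{
    // Exact comparison, as in the typeof-based specialization: derived types do not match.
    public static bool IsExactlyShape<T>() => typeof(T) == typeof(Shape);

    // Accepts Shape and anything derived from it, at the cost of a reflection call per check.
    public static bool IsShapeOrDerived<T>() => typeof(Shape).IsAssignableFrom(typeof(T));
}
```

TypeOfDemo.IsExactlyShape<Circle>() is false while TypeOfDemo.IsShapeOrDerived<Circle>() is true. Even after such a check passes, the container cast used above would still fail for a derived T, because Container<Circle> is not a Container<Shape>, so supporting derived types needs more than swapping the comparison.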
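5. Ldftn + Calli
Someone has put this into practice in the linked article: apparently there is something to be gained by invoking function pointers directly.
At the time of writing this cannot be expressed in C#, so it is written out in MSIL.
(Qiita, please add syntax highlighting for MSIL too.)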
.assembly extern mscorlib
{}
.assembly extern GenericSpecializationBenchmark.Core
{}
.assembly GenericSpecializationBenchmark.Unsafe
{}
.module GenericSpecializationBenchmark.Unsafe.dll
.class private auto ansi abstract sealed FastArithmetic
extends [mscorlib]System.Object
{
.method private hidebysig specialname rtspecialname static
void .cctor () cil managed
{
.maxstack 8
ldftn int32 FastArithmetic::Add(int32, int32)
stsfld void* class FastArithmetic`1<int32>::_fptrAdd
ldftn float64 FastArithmetic::Add(float64, float64)
stsfld void* class FastArithmetic`1<float64>::_fptrAdd
ldftn valuetype [GenericSpecializationBenchmark.Core]IntStruct FastArithmetic::Add(valuetype [GenericSpecializationBenchmark.Core]IntStruct, valuetype [GenericSpecializationBenchmark.Core]IntStruct)
stsfld void* class FastArithmetic`1<valuetype [GenericSpecializationBenchmark.Core]IntStruct>::_fptrAdd
ldftn valuetype [GenericSpecializationBenchmark.Core]DoubleStruct FastArithmetic::Add(valuetype [GenericSpecializationBenchmark.Core]DoubleStruct, valuetype [GenericSpecializationBenchmark.Core]DoubleStruct)
stsfld void* class FastArithmetic`1<valuetype [GenericSpecializationBenchmark.Core]DoubleStruct>::_fptrAdd
ldftn class [GenericSpecializationBenchmark.Core]IntClass FastArithmetic::Add(class [GenericSpecializationBenchmark.Core]IntClass, class [GenericSpecializationBenchmark.Core]IntClass)
stsfld void* class FastArithmetic`1<class [GenericSpecializationBenchmark.Core]IntClass>::_fptrAdd
ldftn class [GenericSpecializationBenchmark.Core]DoubleClass FastArithmetic::Add(class [GenericSpecializationBenchmark.Core]DoubleClass, class [GenericSpecializationBenchmark.Core]DoubleClass)
stsfld void* class FastArithmetic`1<class [GenericSpecializationBenchmark.Core]DoubleClass>::_fptrAdd
ret
}
.method assembly hidebysig static
void Initialize () cil managed
{
.maxstack 8
ret
}
.method public hidebysig static
int32 Add (int32 lhs, int32 rhs) cil managed
{
.maxstack 8
ldarg.0
ldarg.1
add
ret
}
.method public hidebysig static
int64 Add (int64 lhs, int64 rhs) cil managed
{
.maxstack 8
ldarg.0
ldarg.1
add
ret
}
.method public hidebysig static
float32 Add (float32 lhs, float32 rhs) cil managed
{
.maxstack 8
ldarg.0
ldarg.1
add
ret
}
.method public hidebysig static
float64 Add (float64 lhs, float64 rhs) cil managed
{
.maxstack 8
ldarg.0
ldarg.1
add
ret
}
.method public hidebysig static
valuetype [GenericSpecializationBenchmark.Core]IntStruct Add (
valuetype [GenericSpecializationBenchmark.Core]IntStruct lhs,
valuetype [GenericSpecializationBenchmark.Core]IntStruct rhs
)
{
.maxstack 8
ldarg.0
ldarg.1
call valuetype [GenericSpecializationBenchmark.Core]IntStruct [GenericSpecializationBenchmark.Core]IntStruct::op_Addition(valuetype [GenericSpecializationBenchmark.Core]IntStruct, valuetype [GenericSpecializationBenchmark.Core]IntStruct)
ret
}
.method public hidebysig static
valuetype [GenericSpecializationBenchmark.Core]DoubleStruct Add (
valuetype [GenericSpecializationBenchmark.Core]DoubleStruct lhs,
valuetype [GenericSpecializationBenchmark.Core]DoubleStruct rhs
)
{
.maxstack 8
ldarg.0
ldarg.1
call valuetype [GenericSpecializationBenchmark.Core]DoubleStruct [GenericSpecializationBenchmark.Core]DoubleStruct::op_Addition(valuetype [GenericSpecializationBenchmark.Core]DoubleStruct, valuetype [GenericSpecializationBenchmark.Core]DoubleStruct)
ret
}
.method public hidebysig static
class [GenericSpecializationBenchmark.Core]IntClass Add (
class [GenericSpecializationBenchmark.Core]IntClass lhs,
class [GenericSpecializationBenchmark.Core]IntClass rhs
)
{
.maxstack 8
ldarg.0
ldarg.1
call class [GenericSpecializationBenchmark.Core]IntClass [GenericSpecializationBenchmark.Core]IntClass::op_Addition(class [GenericSpecializationBenchmark.Core]IntClass, class [GenericSpecializationBenchmark.Core]IntClass)
ret
}
.method public hidebysig static
class [GenericSpecializationBenchmark.Core]DoubleClass Add (
class [GenericSpecializationBenchmark.Core]DoubleClass lhs,
class [GenericSpecializationBenchmark.Core]DoubleClass rhs
)
{
.maxstack 8
ldarg.0
ldarg.1
call class [GenericSpecializationBenchmark.Core]DoubleClass [GenericSpecializationBenchmark.Core]DoubleClass::op_Addition(class [GenericSpecializationBenchmark.Core]DoubleClass, class [GenericSpecializationBenchmark.Core]DoubleClass)
ret
}
}
.class public auto ansi abstract sealed beforefieldinit FastArithmetic`1<T>
extends [mscorlib]System.Object
{
.field assembly static void* _fptrAdd
.property bool IsSupported()
{
.get bool FastArithmetic`1::get_IsSupported()
}
.method public hidebysig specialname static
bool get_IsSupported () cil managed aggressiveinlining
{
.maxstack 8
ldsfld void* class FastArithmetic`1<!T>::_fptrAdd
ldc.i4.0
conv.u
ceq
ldc.i4.0
conv.u
ceq
ret
}
.method private hidebysig specialname rtspecialname static
void .cctor () cil managed
{
.maxstack 8
call void class FastArithmetic::Initialize()
ret
}
.method public hidebysig static
!T Add (!T lhs, !T rhs) cil managed aggressiveinlining
{
.maxstack 8
ldarg.0
ldarg.1
ldsfld void* class FastArithmetic`1<!T>::_fptrAdd
calli !T(!T, !T)
ret
}
}
The calling side looks like this. It feels much like the static Strategy.
public partial class Container<T>
{
public static Container<T> AddByLdftnAndCalli(Container<T> lhs, Container<T> rhs)
{
if(FastArithmetic<T>.IsSupported)
return new Container<T>(FastArithmetic<T>.Add(lhs.Value, rhs.Value));
throw new Exception();
}
}
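unsafe code would be one thing, but many people will not want to maintain IL, so for now this is not a realistic option. That said, function pointers have come up as a proposal in csharplang, so this may eventually come within practical reach.
For now, this one is included for reference only.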
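6. Overloading via extension methods
If the methods are moved out into extension methods, same-named overloads are possible even for closed types. Note that the code below declares them as plain static methods; adding this to the first parameter would turn them into extension methods without changing the static-style calls used in the benchmark.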
public static class Container
{
public static Container<int> AddByOverload(Container<int> lhs, Container<int> rhs)
=> new Container<int>(lhs.Value + rhs.Value);
// ......
}
Full implementation:
public static class Container
{
public static Container<int> AddByOverload(Container<int> lhs, Container<int> rhs)
=> new Container<int>(lhs.Value + rhs.Value);
public static Container<double> AddByOverload(Container<double> lhs, Container<double> rhs)
=> new Container<double>(lhs.Value + rhs.Value);
public static Container<IntStruct> AddByOverload(Container<IntStruct> lhs, Container<IntStruct> rhs)
=> new Container<IntStruct>(lhs.Value + rhs.Value);
public static Container<DoubleStruct> AddByOverload(Container<DoubleStruct> lhs, Container<DoubleStruct> rhs)
=> new Container<DoubleStruct>(lhs.Value + rhs.Value);
public static Container<IntClass> AddByOverload(Container<IntClass> lhs, Container<IntClass> rhs)
=> new Container<IntClass>(lhs.Value + rhs.Value);
public static Container<DoubleClass> AddByOverload(Container<DoubleClass> lhs, Container<DoubleClass> rhs)
=> new Container<DoubleClass>(lhs.Value + rhs.Value);
}
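The calls are already resolved to completely separate methods at C# compile time, and the IL instruction is call rather than callvirt, so on performance alone this should be the fastest.
However, the overloads are only selected when the call site uses a closed type, non-public members are inaccessible, and the technique cannot be used for operator overloading, so the constraints are considerably stricter than for the other approaches.
It does not seem able to serve as a complete replacement.
This one is also included for reference only.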
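The closed-type restriction follows from overload resolution happening at compile time. The sketch below adds a generic fallback overload, which the article's Container class does not have, purely to make the effect observable at run time; Box, BoxOps and the method names are hypothetical.

```csharp
using System;

// Hypothetical stand-in for Container<T>.
public class Box<T>
{
    public T Value { get; }
    public Box(T value) => Value = value;
}

public static class BoxOps
{
    // Specialized overload for the closed type Box<int>.
    public static string Describe(Box<int> box) => "int overload";

    // Generic fallback, added here only so that the generic call site below compiles.
    public static string Describe<T>(Box<T> box) => "generic fallback";

    // Inside a generic method T is still open, so Box<T> cannot bind to the
    // Box<int> overload; the compiler picks the generic one regardless of the runtime T.
    public static string DescribeVia<T>(Box<T> box) => Describe(box);
}
```

BoxOps.Describe(new Box<int>(1)) returns "int overload", but BoxOps.DescribeVia(new Box<int>(1)) returns "generic fallback". Without the fallback, as in the article's code, the call inside the generic method would simply not compile.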
On to the benchmark
Now that all the methods are in place, let's benchmark them.
Benchmark code
First, define all the benchmark methods.
using System;
using System.Linq;
using System.Reflection;
public static class GenericSpecializationBenchmarkCore
{
public const int Iteration = 10000;
static GenericSpecializationBenchmarkCore()
{
var results = typeof(GenericSpecializationBenchmarkCore)
.GetMethods(BindingFlags.Public | BindingFlags.Static)
.Select(mi => (double)mi.Invoke(null, null))
.ToList();
foreach(var res in results)
if(results[0] != res)
throw new Exception("Invalid add method impl");
}
// Methods like this one are defined for each of Primitive/Struct/Class and for each specialization technique
public static double AddByStaticStrategy_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByStaticStrategy(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByStaticStrategy(x, y);
result += x.Value;
}
return result;
}
}
Full implementation:
using System;
using System.Linq;
using System.Reflection;
public static class GenericSpecializationBenchmarkCore
{
public const int Iteration = 10000;
static GenericSpecializationBenchmarkCore()
{
var results = typeof(GenericSpecializationBenchmarkCore)
.GetMethods(BindingFlags.Public | BindingFlags.Static)
.Select(mi => (double)mi.Invoke(null, null))
.ToList();
foreach(var res in results)
if(results[0] != res)
throw new Exception("Invalid add method impl");
}
public static double AddByStaticStrategy_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByStaticStrategy(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByStaticStrategy(x, y);
result += x.Value;
}
return result;
}
public static double AddByContainerTypeSwitch_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByContainerTypeSwitch(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByContainerTypeSwitch(x, y);
result += x.Value;
}
return result;
}
public static double AddByValueTypeSwitch_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByValueTypeSwitch(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByValueTypeSwitch(x, y);
result += x.Value;
}
return result;
}
public static double AddByTypeOf_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByTypeOf(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByTypeOf(x, y);
result += x.Value;
}
return result;
}
public static double AddByLdftnAndCalli_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<int>.AddByLdftnAndCalli(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container<double>.AddByLdftnAndCalli(x, y);
result += x.Value;
}
return result;
}
public static double AddByOverload_Primitive()
{
var result = 0.0;
{
var x = new Container<int>(1);
var y = new Container<int>(1);
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value;
}
{
var x = new Container<double>(1);
var y = new Container<double>(1);
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value;
}
return result;
}
public static double AddByStaticStrategy_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<IntStruct>.AddByStaticStrategy(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleStruct>.AddByStaticStrategy(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByContainerTypeSwitch_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<IntStruct>.AddByContainerTypeSwitch(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleStruct>.AddByContainerTypeSwitch(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByValueTypeSwitch_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<IntStruct>.AddByValueTypeSwitch(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleStruct>.AddByValueTypeSwitch(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByTypeOf_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<IntStruct>.AddByTypeOf(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleStruct>.AddByTypeOf(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByLdftnAndCalli_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<IntStruct>.AddByLdftnAndCalli(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleStruct>.AddByLdftnAndCalli(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByOverload_Struct()
{
var result = 0.0;
{
var x = new Container<IntStruct>(new IntStruct(1));
var y = new Container<IntStruct>(new IntStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleStruct>(new DoubleStruct(1));
var y = new Container<DoubleStruct>(new DoubleStruct(1));
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByStaticStrategy_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container<IntClass>.AddByStaticStrategy(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleClass>.AddByStaticStrategy(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByContainerTypeSwitch_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container<IntClass>.AddByContainerTypeSwitch(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleClass>.AddByContainerTypeSwitch(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByValueTypeSwitch_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container<IntClass>.AddByValueTypeSwitch(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleClass>.AddByValueTypeSwitch(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByTypeOf_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container<IntClass>.AddByTypeOf(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleClass>.AddByTypeOf(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByLdftnAndCalli_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container<IntClass>.AddByLdftnAndCalli(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container<DoubleClass>.AddByLdftnAndCalli(x, y);
result += x.Value.Value;
}
return result;
}
public static double AddByOverload_Class()
{
var result = 0.0;
{
var x = new Container<IntClass>(new IntClass(1) );
var y = new Container<IntClass>(new IntClass(1) );
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value.Value;
}
{
var x = new Container<DoubleClass>(new DoubleClass(1));
var y = new Container<DoubleClass>(new DoubleClass(1));
for(var i = 0; i < Iteration; ++i)
x = Container.AddByOverload(x, y);
result += x.Value.Value;
}
return result;
}
}
BenchmarkDotNet is available on .NET Core and .NET Framework, so we wrap these methods in a benchmark class.
Full implementation:
using System;
using BenchmarkDotNet.Attributes;
[CoreJob, ClrJob]
public class GenericSpecializationBenchmark
{
[Benchmark]
public double AddByStaticStrategy_Primitive()
=> GenericSpecializationBenchmarkCore.AddByStaticStrategy_Primitive();
[Benchmark]
public double AddByContainerTypeSwitch_Primitive()
=> GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Primitive();
[Benchmark]
public double AddByValueTypeSwitch_Primitive()
=> GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Primitive();
[Benchmark]
public double AddByTypeOf_Primitive()
=> GenericSpecializationBenchmarkCore.AddByTypeOf_Primitive();
[Benchmark]
public double AddByLdftnAndCalli_Primitive()
=> GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Primitive();
[Benchmark]
public double AddByOverload_Primitive()
=> GenericSpecializationBenchmarkCore.AddByOverload_Primitive();
[Benchmark]
public double AddByStaticStrategy_Struct()
=> GenericSpecializationBenchmarkCore.AddByStaticStrategy_Struct();
[Benchmark]
public double AddByContainerTypeSwitch_Struct()
=> GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Struct();
[Benchmark]
public double AddByValueTypeSwitch_Struct()
=> GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Struct();
[Benchmark]
public double AddByTypeOf_Struct()
=> GenericSpecializationBenchmarkCore.AddByTypeOf_Struct();
[Benchmark]
public double AddByLdftnAndCalli_Struct()
=> GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Struct();
[Benchmark]
public double AddByOverload_Struct()
=> GenericSpecializationBenchmarkCore.AddByOverload_Struct();
[Benchmark]
public double AddByStaticStrategy_Class()
=> GenericSpecializationBenchmarkCore.AddByStaticStrategy_Class();
[Benchmark]
public double AddByContainerTypeSwitch_Class()
=> GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Class();
[Benchmark]
public double AddByValueTypeSwitch_Class()
=> GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Class();
[Benchmark]
public double AddByTypeOf_Class()
=> GenericSpecializationBenchmarkCore.AddByTypeOf_Class();
[Benchmark]
public double AddByLdftnAndCalli_Class()
=> GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Class();
[Benchmark]
public double AddByOverload_Class()
=> GenericSpecializationBenchmarkCore.AddByOverload_Class();
}
Unity is another major C# platform, but unfortunately BenchmarkDotNet does not run on Unity.
Instead I found something called the Performance Testing Extension for Unity Test Runner, so I use that here.
Full implementation:
using System;
using UnityEngine;
using Unity.PerformanceTesting;
public class GenericSpecializationBenchmark : MonoBehaviour
{
[PerformanceTest]
public void AddByStaticStrategy_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByStaticStrategy_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByContainerTypeSwitch_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByValueTypeSwitch_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByTypeOf_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByTypeOf_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByLdftnAndCalli_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByOverload_Primitive()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByOverload_Primitive())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByStaticStrategy_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByStaticStrategy_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByContainerTypeSwitch_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByValueTypeSwitch_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByTypeOf_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByTypeOf_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByLdftnAndCalli_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByOverload_Struct()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByOverload_Struct())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByStaticStrategy_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByStaticStrategy_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByContainerTypeSwitch_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByContainerTypeSwitch_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByValueTypeSwitch_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByValueTypeSwitch_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByTypeOf_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByTypeOf_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByLdftnAndCalli_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByLdftnAndCalli_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
[PerformanceTest]
public void AddByOverload_Class()
{
Measure.Method(() => GenericSpecializationBenchmarkCore.AddByOverload_Class())
.WarmupCount(16)
.MeasurementCount(128)
.IterationsPerMeasurement(16)
.Run();
}
}
Results
So, here are the results. The test environment is as follows.
Unity ran on the same machine; the version is 2018.3.5f1.
BenchmarkDotNet=v0.11.4, OS=Windows 10.0.17134.590 (1803/April2018Update/Redstone4)
Intel Core i7-6700K CPU 4.00GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
Frequency=3914060 Hz, Resolution=255.4892 ns, Timer=TSC
.NET Core SDK=2.2.103
[Host] : .NET Core 2.2.1 (CoreCLR 4.6.27207.03, CoreFX 4.6.27207.03), 64bit RyuJIT
Clr : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3324.0
Core : .NET Core 2.2.1 (CoreCLR 4.6.27207.03, CoreFX 4.6.27207.03), 64bit RyuJIT
.NET Framework
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Primitive | Clr | Clr | 122.10 us | 0.2588 us | 0.2021 us |
AddByContainerTypeSwitch_Primitive | Clr | Clr | 178.35 us | 0.8562 us | 0.8009 us |
AddByValueTypeSwitch_Primitive | Clr | Clr | 403.07 us | 3.7490 us | 3.5068 us |
AddByTypeOf_Primitive | Clr | Clr | 150.96 us | 1.0237 us | 0.9576 us |
AddByLdftnAndCalli_Primitive | Clr | Clr | 113.50 us | 1.0959 us | 1.0251 us |
AddByOverload_Primitive | Clr | Clr | 81.95 us | 0.6142 us | 0.5745 us |
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Struct | Clr | Clr | 144.91 us | 1.5194 us | 1.2688 us |
AddByContainerTypeSwitch_Struct | Clr | Clr | 274.19 us | 0.6559 us | 0.5477 us |
AddByValueTypeSwitch_Struct | Clr | Clr | 525.21 us | 1.3129 us | 1.1639 us |
AddByTypeOf_Struct | Clr | Clr | 156.33 us | 0.9623 us | 0.9002 us |
AddByLdftnAndCalli_Struct | Clr | Clr | 158.57 us | 1.0668 us | 0.9979 us |
AddByOverload_Struct | Clr | Clr | 124.76 us | 0.2196 us | 0.1833 us |
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Class | Clr | Clr | 442.95 us | 1.5895 us | 1.4869 us |
AddByContainerTypeSwitch_Class | Clr | Clr | 425.81 us | 1.1064 us | 1.0350 us |
AddByValueTypeSwitch_Class | Clr | Clr | 407.23 us | 3.9100 us | 3.6574 us |
AddByTypeOf_Class | Clr | Clr | 284.35 us | 1.6016 us | 1.4198 us |
AddByLdftnAndCalli_Class | Clr | Clr | 347.96 us | 2.9368 us | 2.7471 us |
AddByOverload_Class | Clr | Clr | 156.59 us | 0.7971 us | 0.6656 us |
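On .NET Framework, leaving aside the non-generic overload baseline, the static Strategy was fast for value types and typeof was fast for classes.
Against primitives the function pointer is the fastest of the specialization techniques and not bad at all, but the gain is small for the effort involved, and for structs and classes it is slower than the best technique, so there is not much to recommend it.
.NET Core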
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Primitive | Core | Core | 150.60 us | 0.3208 us | 0.2843 us |
AddByContainerTypeSwitch_Primitive | Core | Core | 110.63 us | 0.1589 us | 0.1487 us |
AddByValueTypeSwitch_Primitive | Core | Core | 85.13 us | 0.0894 us | 0.0836 us |
AddByTypeOf_Primitive | Core | Core | 91.29 us | 0.1205 us | 0.0941 us |
AddByLdftnAndCalli_Primitive | Core | Core | 148.50 us | 0.1943 us | 0.1818 us |
AddByOverload_Primitive | Core | Core | 87.54 us | 0.4548 us | 0.4255 us |
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Struct | Core | Core | 156.91 us | 0.2384 us | 0.2114 us |
AddByContainerTypeSwitch_Struct | Core | Core | 207.03 us | 0.4905 us | 0.4589 us |
AddByValueTypeSwitch_Struct | Core | Core | 175.25 us | 0.3431 us | 0.3210 us |
AddByTypeOf_Struct | Core | Core | 129.96 us | 0.5253 us | 0.4656 us |
AddByLdftnAndCalli_Struct | Core | Core | 161.22 us | 1.1491 us | 0.9596 us |
AddByOverload_Struct | Core | Core | 132.88 us | 0.5114 us | 0.4534 us |
Method | Job | Runtime | Mean | Error | StdDev |
---|---|---|---|---|---|
AddByStaticStrategy_Class | Core | Core | 388.40 us | 0.6757 us | 0.5643 us |
AddByContainerTypeSwitch_Class | Core | Core | 416.41 us | 0.6913 us | 0.6128 us |
AddByValueTypeSwitch_Class | Core | Core | 415.41 us | 1.1371 us | 1.0080 us |
AddByTypeOf_Class | Core | Core | 256.51 us | 0.9296 us | 0.8241 us |
AddByLdftnAndCalli_Class | Core | Core | 335.43 us | 1.1102 us | 1.0385 us |
AddByOverload_Class | Core | Core | 167.72 us | 0.3065 us | 0.2867 us |
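typeof is fast across the board.
For primitive types the type switch on the value is the fastest, but against structs and classes it is actually slower.
Perhaps the JIT optimizes heavily when the target is a primitive type.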
Unity
Method | Median | Min | Max | Avg | Std |
---|---|---|---|---|---|
AddByStaticStrategy_Primitive | 2.78 ms | 2.70 ms | 3.33 ms | 2.85 ms | 0.14 ms |
AddByContainerTypeSwitch_Primitive | 2.73 ms | 2.66 ms | 3.19 ms | 2.80 ms | 0.12 ms |
AddByValueTypeSwitch_Primitive | 13.69 ms | 13.39 ms | 16.06 ms | 13.73 ms | 0.24 ms |
AddByTypeOf_Primitive | 2.72 ms | 2.68 ms | 3.20 ms | 2.80 ms | 0.12 ms |
AddByLdftnAndCalli_Primitive | 6.95 ms | 6.88 ms | 7.44 ms | 7.03 ms | 0.13 ms |
AddByOverload_Primitive | 2.60 ms | 2.55 ms | 2.85 ms | 2.67 ms | 0.11 ms |
Method | Median | Min | Max | Avg | Std |
---|---|---|---|---|---|
AddByStaticStrategy_Struct | 3.04 ms | 2.99 ms | 3.39 ms | 3.11 ms | 0.12 ms |
AddByContainerTypeSwitch_Struct | 3.03 ms | 2.98 ms | 3.69 ms | 3.11 ms | 0.14 ms |
AddByValueTypeSwitch_Struct | 19.12 ms | 18.82 ms | 21.28 ms | 19.14 ms | 0.31 ms |
AddByTypeOf_Struct | 3.02 ms | 2.97 ms | 4.05 ms | 3.11 ms | 0.15 ms |
AddByLdftnAndCalli_Struct | 7.31 ms | 7.19 ms | 9.42 ms | 7.38 ms | 0.22 ms |
AddByOverload_Struct | 2.84 ms | 2.80 ms | 3.19 ms | 2.92 ms | 0.12 ms |
Method | Median | Min | Max | Avg | Std |
---|---|---|---|---|---|
AddByStaticStrategy_Class | 5.67 ms | 5.39 ms | 7.02 ms | 5.66 ms | 0.23 ms |
AddByContainerTypeSwitch_Class | 5.52 ms | 5.18 ms | 7.13 ms | 5.50 ms | 0.24 ms |
AddByValueTypeSwitch_Class | 5.56 ms | 5.31 ms | 5.85 ms | 5.52 ms | 0.12 ms |
AddByTypeOf_Class | 5.71 ms | 5.43 ms | 6.73 ms | 5.69 ms | 0.18 ms |
AddByLdftnAndCalli_Class | 9.83 ms | 9.53 ms | 11.76 ms | 9.81 ms | 0.22 ms |
AddByOverload_Class | 5.26 ms | 5.00 ms | 5.83 ms | 5.21 ms | 0.13 ms |
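There is not much difference overall... The main exceptions are that the type switch on the value gets remarkably slow for primitive types and structs, and that the function-pointer approach is clearly slower in all three cases.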
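Discussion
The best way to find out why the fast techniques are fast is to look at the JIT output, so let's try that.
We will look at the effective assembly when T is int / IntStruct / IntClass.
Note that each Add method of Container<T> was measured with the MethodImpl(MethodImplOptions.NoInlining) attribute applied.
If the whole method got inlined, I would not know where to look.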
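For reference, here is a minimal sketch of how that attribute is applied. It assumes a Container<T> shaped like the one defined at the top of this article; the holder class name ContainerOverloads is made up for illustration and is not the name used in the benchmark project.

```csharp
using System.Runtime.CompilerServices;

public class Container<T>
{
    public T Value { get; }
    public Container(T value) => Value = value;
}

// Hypothetical holder class for the non-generic overload (illustrative name).
public static class ContainerOverloads
{
    // NoInlining asks the JIT to keep this method as its own body instead of
    // folding it into the caller, so it can be located in the disassembly.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Container<int> Add(Container<int> lhs, Container<int> rhs)
        => new Container<int>(lhs.Value + rhs.Value);
}
```

The attribute changes nothing about the result of the call; it only affects whether the JIT may merge the method into its call site.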
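I could not fully trace how many instructions run inside the call targets along the way, so treat the numbers as reference values only, but instruction counts are listed as well.
For now this was verified on .NET Core 2.2.1.
Could someone look into the other environments?
Primitive types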
AddByOverload_Primitive
174: => new Container<int>(lhs.Value + rhs.Value);
00007FFC87257260 push rdi
00007FFC87257261 push rsi
00007FFC87257262 sub rsp,28h
00007FFC87257266 mov rsi,rdx
00007FFC87257269 mov edi,dword ptr [rcx+8]
00007FFC8725726C mov rcx,7FFC8730A778h
00007FFC87257276 call 00007FFCE6D5B3B0
00007FFC8725727B mov edx,edi
00007FFC8725727D add edx,dword ptr [rsi+8]
00007FFC87257280 mov dword ptr [rax+8],edx
00007FFC87257283 add rsp,28h
00007FFC87257287 pop rsi
00007FFC87257288 pop rdi
00007FFC87257289 ret
AddByStaticStrategy_Primitive
22: => new Container<T>(Arithmetic<T>.Default.Add(lhs.Value, rhs.Value));
00007FFC87265D90 push rdi
00007FFC87265D91 push rsi
00007FFC87265D92 push rbp
00007FFC87265D93 push rbx
00007FFC87265D94 sub rsp,28h
00007FFC87265D98 mov rsi,rcx
00007FFC87265D9B mov rdi,rdx
00007FFC87265D9E mov rcx,7FFC87349D80h
00007FFC87265DA8 xor edx,edx
00007FFC87265DAA call 00007FFCE6D32120
00007FFC87265DAF mov rcx,1A944672AD0h
00007FFC87265DB9 mov rbx,qword ptr [rcx]
00007FFC87265DBC mov esi,dword ptr [rsi+8]
00007FFC87265DBF mov rcx,7FFC8731A778h
00007FFC87265DC9 call 00007FFCE6D5B3B0
00007FFC87265DCE mov rbp,rax
00007FFC87265DD1 mov r8d,dword ptr [rdi+8]
00007FFC87265DD5 mov rcx,rbx
00007FFC87265DD8 mov edx,esi
00007FFC87265DDA mov r11,7FFC87150028h
00007FFC87265DE4 cmp dword ptr [rcx],ecx
00007FFC87265DE6 call qword ptr [7FFC87150028h]
00007FFC872661C0 lea eax,[rdx+r8]
00007FFC872661C4 ret
00007FFC87265DEC mov dword ptr [rbp+8],eax
00007FFC87265DEF mov rax,rbp
00007FFC87265DF2 add rsp,28h
00007FFC87265DF6 pop rbx
00007FFC87265DF7 pop rbp
00007FFC87265DF8 pop rsi
00007FFC87265DF9 pop rdi
00007FFC87265DFA ret
AddByContainerTypeSwitch_Primitive
28: switch(lhs)
00007FFC872567D0 push rdi
00007FFC872567D1 push rsi
00007FFC872567D2 sub rsp,0F8h
00007FFC872567D9 mov rsi,rdx
00007FFC872567DC test rcx,rcx
00007FFC872567DF je 00007FFC87256805
00007FFC872567E1 mov edi,dword ptr [rcx+8]
00007FFC872567E4 mov rcx,7FFC8730A778h
00007FFC872567EE call 00007FFCE6D5B3B0
00007FFC872567F3 mov ecx,edi
00007FFC872567F5 add ecx,dword ptr [rsi+8]
00007FFC872567F8 mov dword ptr [rax+8],ecx
00007FFC872567FB add rsp,0F8h
00007FFC87256802 pop rsi
00007FFC87256803 pop rdi
00007FFC87256804 ret
AddByValueTypeSwitch_Primitive
68: switch(lhs.Value)
00007FFC87256A90 push rdi
00007FFC87256A91 push rsi
00007FFC87256A92 sub rsp,28h
00007FFC87256A96 mov esi,dword ptr [rcx+8]
00007FFC87256A99 mov ecx,dword ptr [rdx+8]
00007FFC87256A9C mov edi,ecx
73: return new Container<int>(intL + r) as Container<T>;
00007FFC87256A9E mov rcx,7FFC8730A778h
00007FFC87256AA8 call 00007FFCE6D5B3B0
00007FFC87256AAD add esi,edi
00007FFC87256AAF mov dword ptr [rax+8],esi
00007FFC87256AB2 add rsp,28h
00007FFC87256AB6 pop rsi
00007FFC87256AB7 pop rdi
00007FFC87256AB8 ret
AddByTypeof_Primitive
114: if(typeof(T) == typeof(int))
00007FFC87256C90 push rdi
00007FFC87256C91 push rsi
00007FFC87256C92 sub rsp,28h
00007FFC87256C96 mov rsi,rdx
00007FFC87256C99 mov edi,dword ptr [rcx+8]
00007FFC87256C9C mov rcx,7FFC8730A778h
00007FFC87256CA6 call 00007FFCE6D5B3B0
00007FFC87256CAB mov edx,edi
00007FFC87256CAD add edx,dword ptr [rsi+8]
00007FFC87256CB0 mov dword ptr [rax+8],edx
00007FFC87256CB3 add rsp,28h
00007FFC87256CB7 pop rsi
00007FFC87256CB8 pop rdi
00007FFC87256CB9 ret
Method | Instruction Count | Mean |
---|---|---|
AddByOverload_Primitive | 14 | 87.54 us |
AddByStaticStrategy_Primitive | 30 | 150.60 us |
AddByContainerTypeSwitch_Primitive | 16 | 110.63 us |
AddByValueTypeSwitch_Primitive | 14 | 85.13 us |
AddByTypeof_Primitive | 14 | 91.29 us |
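AddByValueTypeSwitch and AddByTypeof are optimized down to the same instruction count as AddByOverload.
AddByTypeof even matches it instruction for instruction; only the addresses differ.
In other words, after JIT compilation AddByOverload and AddByTypeof are exactly the same.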
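As a reminder of what the JIT is collapsing here, below is a minimal sketch of the typeof approach, reconstructed from the source lines visible in the listings. Only the int and IntStruct branches are shown, and the exception in the fallback path is my own assumption.

```csharp
using System;

public struct IntStruct
{
    public readonly int Value;
    public IntStruct(int value) => Value = value;
    public static IntStruct operator +(IntStruct lhs, IntStruct rhs)
        => new IntStruct(lhs.Value + rhs.Value);
}

public class Container<T>
{
    public T Value { get; }
    public Container(T value) => Value = value;

    public static Container<T> AddByTypeof(Container<T> lhs, Container<T> rhs)
    {
        // When T is a value type, the JIT compiles a dedicated body for that T,
        // so each typeof comparison is a JIT-time constant and the branches
        // that cannot be taken can be removed.
        if (typeof(T) == typeof(int))
        {
            var l = lhs as Container<int>;
            var r = rhs as Container<int>;
            return new Container<int>(l.Value + r.Value) as Container<T>;
        }

        if (typeof(T) == typeof(IntStruct))
        {
            var l = lhs as Container<IntStruct>;
            var r = rhs as Container<IntStruct>;
            return new Container<IntStruct>(l.Value + r.Value) as Container<T>;
        }

        // Assumed fallback for types without a specialization.
        throw new NotSupportedException($"Add is not specialized for {typeof(T)}.");
    }
}
```

For Container<int>, everything except the body of the first branch disappears in the listing above, which is why it ends up identical to the overload.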
Structs
AddByOverload_Struct
184: => new Container<IntStruct>(lhs.Value + rhs.Value);
00007FFC87258530 push rsi
00007FFC87258531 sub rsp,20h
00007FFC87258535 mov ecx,dword ptr [rcx+8]
00007FFC87258538 mov eax,dword ptr [rdx+8]
00007FFC8725853B lea esi,[rcx+rax]
00007FFC8725853E mov rcx,7FFC8733B288h
00007FFC87258548 call 00007FFCE6D5B3B0
00007FFC8725854D mov dword ptr [rax+8],esi
00007FFC87258550 add rsp,20h
00007FFC87258554 pop rsi
00007FFC87258555 ret
AddByStaticStrategy_Struct
22: => new Container<T>(Arithmetic<T>.Default.Add(lhs.Value, rhs.Value));
00007FFC87257490 push rdi
00007FFC87257491 push rsi
00007FFC87257492 push rbp
00007FFC87257493 push rbx
00007FFC87257494 sub rsp,28h
00007FFC87257498 mov rax,23910002AE0h
00007FFC872574A2 mov rsi,qword ptr [rax]
00007FFC872574A5 mov edi,dword ptr [rcx+8]
00007FFC872574A8 mov ebx,dword ptr [rdx+8]
00007FFC872574AB mov rcx,7FFC8733B288h
00007FFC872574B5 call 00007FFCE6D5B3B0
00007FFC872574BA mov rbp,rax
00007FFC872574BD mov rcx,rsi
00007FFC872574C0 mov r8d,ebx
00007FFC872574C3 mov edx,edi
00007FFC872574C5 mov r11,7FFC87140038h
00007FFC872574CF cmp dword ptr [rcx],ecx
00007FFC872574D1 call qword ptr [7FFC87140038h]
00007FFC87257510 lea eax,[rdx+r8]
00007FFC87257514 ret
00007FFC872574D7 mov dword ptr [rbp+8],eax
00007FFC872574DA mov rax,rbp
00007FFC872574DD add rsp,28h
00007FFC872574E1 pop rbx
00007FFC872574E2 pop rbp
00007FFC872574E3 pop rsi
00007FFC872574E4 pop rdi
00007FFC872574E5 ret
AddByContainerTypeSwitch_Struct
28: switch(lhs)
00007FFC872677A0 push rdi
00007FFC872677A1 push rsi
00007FFC872677A2 push rbp
00007FFC872677A3 push rbx
00007FFC872677A4 sub rsp,0D8h
00007FFC872677AB vzeroupper
00007FFC872677AE vmovaps xmmword ptr [rsp+0C0h],xmm6
00007FFC872677B8 mov rsi,rdx
00007FFC872677BB mov rdi,rcx
00007FFC872677BE test rdi,rdi
00007FFC872677C1 je 00007FFC872678B6
00007FFC872677C7 mov rdx,rdi
00007FFC872677CA mov rcx,7FFC8731A778h
00007FFC872677D4 call 00007FFCE6D59C70
00007FFC872677D9 mov rbx,rax
00007FFC872677DC test rbx,rbx
00007FFC872677DF jne 00007FFC872677FD
00007FFC872677E1 mov rdx,rdi
00007FFC872677E4 mov rcx,7FFC8731A978h
00007FFC872677EE call 00007FFCE6D59C70
00007FFC872677F3 mov rbp,rax
00007FFC872677F6 test rbp,rbp
00007FFC872677F9 jne 00007FFC8726782E
00007FFC872677FB jmp 00007FFC8726786B
00007FFC8726786B mov ecx,dword ptr [rdi+8]
00007FFC8726786E mov eax,dword ptr [rsi+8]
00007FFC87267871 lea esi,[rcx+rax]
00007FFC87267874 mov rcx,7FFC8734B288h
00007FFC8726787E call 00007FFCE6D5B3B0
00007FFC87267883 mov dword ptr [rax+8],esi
43: return new Container<IntStruct>(intStructL.Value + r.Value) as Container<T>;
00007FFC87267886 jmp 00007FFC872678A0
00007FFC872678A0 vmovaps xmm6,xmmword ptr [rsp+0C0h]
00007FFC872678AA add rsp,0D8h
00007FFC872678B1 pop rbx
00007FFC872678B2 pop rbp
00007FFC872678B3 pop rsi
00007FFC872678B4 pop rdi
00007FFC872678B5 ret
AddByValueTypeSwitch_Struct
68: switch(lhs.Value)
00007FFC87257CC0 push rsi
00007FFC87257CC1 sub rsp,20h
00007FFC87257CC5 mov ecx,dword ptr [rcx+8]
00007FFC87257CC8 mov eax,dword ptr [rdx+8]
85: return new Container<IntStruct>(intStructL + r) as Container<T>;
00007FFC87257CCB lea esi,[rcx+rax]
00007FFC87257CCE mov rcx,7FFC8733B288h
00007FFC87257CD8 call 00007FFCE6D5B3B0
00007FFC87257CDD mov dword ptr [rax+8],esi
00007FFC87257CE0 add rsp,20h
00007FFC87257CE4 pop rsi
00007FFC87257CE5 ret
AddByTypeof_Struct
114: if(typeof(T) == typeof(int))
00007FFC87257F80 push rsi
00007FFC87257F81 sub rsp,20h
00007FFC87257F85 mov ecx,dword ptr [rcx+8]
00007FFC87257F88 mov eax,dword ptr [rdx+8]
00007FFC87257F8B lea esi,[rcx+rax]
00007FFC87257F8E mov rcx,7FFC8733B288h
00007FFC87257F98 call 00007FFCE6D5B3B0
00007FFC87257F9D mov dword ptr [rax+8],esi
00007FFC87257FA0 add rsp,20h
00007FFC87257FA4 pop rsi
00007FFC87257FA5 ret
Method | Instruction Count | Mean |
---|---|---|
AddByOverload_Struct | 11 | 132.88 us |
AddByStaticStrategy_Struct | 28 | 156.91 us |
AddByContainerTypeSwitch_Struct | 38 | 207.03 us |
AddByValueTypeSwitch_Struct | 11 | 175.25 us |
AddByTypeof_Struct | 11 | 129.96 us |
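In terms of generated code, AddByValueTypeSwitch and AddByTypeof continue to do very well, and as with primitive types AddByTypeof is equivalent to AddByOverload. That said, the measured mean for AddByValueTypeSwitch (175.25 us) is slower than its 11-instruction listing would suggest, and the listing alone does not explain why.
AddByStaticStrategy, which could not match those two techniques for primitive types, also performs well for structs, perhaps because its instruction count barely grows.
AddByContainerTypeSwitch, on the other hand, got markedly worse.
The jne/jmp instructions in the middle suggest that the optimizer did not eliminate the condition checks.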
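To make it easier to see where those checks come from, here is a rough sketch of the container type switch, pieced together from the source lines shown in the listing (switch(lhs) and the return statement). The exact case guards in the benchmark code may differ; the when clauses and the pattern variable names intR and structR are my assumption. Pattern matching against an open generic type like this needs C# 7.1 or later.

```csharp
using System;

public struct IntStruct
{
    public readonly int Value;
    public IntStruct(int value) => Value = value;
    public static IntStruct operator +(IntStruct lhs, IntStruct rhs)
        => new IntStruct(lhs.Value + rhs.Value);
}

public class Container<T>
{
    public T Value { get; }
    public Container(T value) => Value = value;

    public static Container<T> AddByContainerTypeSwitch(Container<T> lhs, Container<T> rhs)
    {
        // Each case is a runtime type test on the container object itself,
        // tried in order until one matches.
        switch (lhs)
        {
            case Container<int> intL when rhs is Container<int> intR:
                return new Container<int>(intL.Value + intR.Value) as Container<T>;
            case Container<IntStruct> intStructL when rhs is Container<IntStruct> structR:
                return new Container<IntStruct>(intStructL.Value + structR.Value) as Container<T>;
            default:
                // Assumed fallback for types without a specialization.
                throw new NotSupportedException($"Add is not specialized for {typeof(T)}.");
        }
    }
}
```

In the struct listing, the two calls that come before the IntStruct path line up with the earlier cases being tested first, so the position of a case in the switch shows up directly in the generated code.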
Classes
AddByOverload_Class
194: => new Container<IntClass>(lhs.Value + rhs.Value);
00007FFC87299EC0 push rdi
00007FFC87299EC1 push rsi
00007FFC87299EC2 push rbx
00007FFC87299EC3 sub rsp,20h
00007FFC87299EC7 mov rsi,qword ptr [rcx+8]
00007FFC87299ECB mov rdi,qword ptr [rdx+8]
00007FFC87299ECF mov rcx,7FFC8737A988h
00007FFC87299ED9 call 00007FFCE6D5B3B0
00007FFC87299EDE mov rbx,rax
00007FFC87299EE1 mov ecx,dword ptr [rsi+8]
00007FFC87299EE4 add ecx,dword ptr [rdi+8]
00007FFC87299EE7 mov dword ptr [rbx+8],ecx
00007FFC87299EEA mov rcx,7FFC8737B7C8h
00007FFC87299EF4 call 00007FFCE6D5B3B0
00007FFC87299EF9 mov rsi,rax
00007FFC87299EFC lea rcx,[rsi+8]
00007FFC87299F00 mov rdx,rbx
00007FFC87299F03 call 00007FFCE6D59F10
00007FFC87299F08 mov rax,rsi
00007FFC87299F0B add rsp,20h
00007FFC87299F0F pop rbx
00007FFC87299F10 pop rsi
00007FFC87299F11 pop rdi
00007FFC87299F12 ret
AddByStaticStrategy_Class
22: => new Container<T>(Arithmetic<T>.Default.Add(lhs.Value, rhs.Value));
00007FFC872687D0 push r14
00007FFC872687D2 push rdi
00007FFC872687D3 push rsi
00007FFC872687D4 push rbp
00007FFC872687D5 push rbx
00007FFC872687D6 sub rsp,30h
00007FFC872687DA mov qword ptr [rsp+28h],rcx
00007FFC872687DF mov rsi,rcx
00007FFC872687E2 mov rdi,rdx
00007FFC872687E5 mov rbx,r8
00007FFC872687E8 mov rcx,qword ptr [rsi+30h]
00007FFC872687EC mov rbp,qword ptr [rcx]
00007FFC872687EF mov rcx,qword ptr [rbp+8]
00007FFC872687F3 test rcx,rcx
00007FFC872687F6 jne 00007FFC8726880D
00007FFC8726880D call 00007FFC872659F8
00007FFC87268890 push rsi
00007FFC87268891 sub rsp,30h
00007FFC87268895 mov qword ptr [rsp+28h],rcx
00007FFC8726889A mov rsi,rcx
00007FFC8726889D mov rcx,rsi
00007FFC872688A0 call 00007FFCE6EBE2E0
00007FFC872688A5 mov rcx,rsi
00007FFC872688A8 call 00007FFCE6CED420
00007FFC872688AD mov rax,qword ptr [rax]
00007FFC872688B0 add rsp,30h
00007FFC872688B4 pop rsi
00007FFC872688B5 ret
00007FFC87268812 mov r14,rax
00007FFC87268815 mov rdi,qword ptr [rdi+8]
00007FFC87268819 mov rbx,qword ptr [rbx+8]
00007FFC8726881D mov rbp,qword ptr [rbp+10h]
00007FFC87268821 test rbp,rbp
00007FFC87268824 jne 00007FFC8726883B
00007FFC8726883B mov rcx,rsi
00007FFC8726883E call 00007FFCE6D5B3B0
00007FFC87268843 mov rsi,rax
00007FFC87268846 mov rcx,r14
00007FFC87268849 mov r11,rbp
00007FFC8726884C mov rdx,rdi
00007FFC8726884F mov r8,rbx
00007FFC87268852 cmp dword ptr [rcx],ecx
00007FFC87268854 call qword ptr [rbp]
00007FFC872688D0 push rdi
00007FFC872688D1 push rsi
00007FFC872688D2 sub rsp,28h
00007FFC872688D6 mov rsi,rdx
00007FFC872688D9 mov rdi,r8
00007FFC872688DC mov rcx,7FFC8734A988h
00007FFC872688E6 call 00007FFCE6D5B3B0
00007FFC872688EB mov edx,dword ptr [rsi+8]
00007FFC872688EE add edx,dword ptr [rdi+8]
00007FFC872688F1 mov dword ptr [rax+8],edx
00007FFC872688F4 add rsp,28h
00007FFC872688F8 pop rsi
00007FFC872688F9 pop rdi
00007FFC872688FA ret
00007FFC87268857 lea rcx,[rsi+8]
00007FFC8726885B mov rdx,rax
00007FFC8726885E call 00007FFCE6D59F10
00007FFC87268863 mov rax,rsi
00007FFC87268866 add rsp,30h
00007FFC8726886A pop rbx
00007FFC8726886B pop rbp
00007FFC8726886C pop rsi
00007FFC8726886D pop rdi
00007FFC8726886E pop r14
00007FFC87268870 ret
AddByContainerTypeSwitch_Class
28: switch(lhs)
00007FFC87288B40 push r15
00007FFC87288B42 push r14
00007FFC87288B44 push r13
00007FFC87288B46 push r12
00007FFC87288B48 push rdi
00007FFC87288B49 push rsi
00007FFC87288B4A push rbp
00007FFC87288B4B push rbx
00007FFC87288B4C sub rsp,78h
00007FFC87288B50 vzeroupper
00007FFC87288B53 vmovaps xmmword ptr [rsp+60h],xmm6
00007FFC87288B5A mov qword ptr [rsp+58h],rcx
00007FFC87288B5F mov rdi,rcx
00007FFC87288B62 mov rsi,r8
00007FFC87288B65 mov rbx,rdx
00007FFC87288B68 test rbx,rbx
00007FFC87288B6B je 00007FFC87288E64
00007FFC87288B71 mov rdx,rbx
00007FFC87288B74 mov rcx,7FFC8733A778h
00007FFC87288B7E call 00007FFCE6D59C70
00007FFC87288B83 mov rbp,rax
00007FFC87288B86 test rbp,rbp
00007FFC87288B89 jne 00007FFC87288C2A
00007FFC87288B8F mov rdx,rbx
00007FFC87288B92 mov rcx,7FFC8733A978h
00007FFC87288B9C call 00007FFCE6D59C70
00007FFC87288BA1 mov r14,rax
00007FFC87288BA4 test r14,r14
00007FFC87288BA7 jne 00007FFC87288C6B
00007FFC87288BAD mov rdx,rbx
00007FFC87288BB0 mov rcx,7FFC8736B288h
00007FFC87288BBA call 00007FFCE6D59C70
00007FFC87288BBF mov r15,rax
00007FFC87288BC2 test r15,r15
00007FFC87288BC5 jne 00007FFC87288CB6
00007FFC87288BCB mov rdx,rbx
00007FFC87288BCE mov rcx,7FFC8736B488h
00007FFC87288BD8 call 00007FFCE6D59C70
00007FFC87288BDD mov r12,rax
00007FFC87288BE0 test r12,r12
00007FFC87288BE3 jne 00007FFC87288CFA
00007FFC87288BE9 mov rdx,rbx
00007FFC87288BEC mov rcx,7FFC8736B7C8h
00007FFC87288BF6 call 00007FFCE6D59C70
00007FFC87288BFB mov r13,rax
00007FFC87288BFE test r13,r13
00007FFC87288C01 jne 00007FFC87288D84
00007FFC87288D84 mov rdx,rsi
00007FFC87288D87 mov rcx,7FFC8736B7C8h
00007FFC87288D91 call 00007FFCE6D59C70
00007FFC87288D96 mov rsi,qword ptr [r13+8]
00007FFC87288D9A mov rbx,qword ptr [rax+8]
00007FFC87288D9E mov rcx,7FFC8736A988h
00007FFC87288DA8 call 00007FFCE6D5B3B0
00007FFC87288DAD mov rbp,rax
00007FFC87288DB0 mov ecx,dword ptr [rsi+8]
00007FFC87288DB3 add ecx,dword ptr [rbx+8]
00007FFC87288DB6 mov dword ptr [rbp+8],ecx
00007FFC87288DB9 mov rcx,7FFC8736B7C8h
00007FFC87288DC3 call 00007FFCE6D5B3B0
00007FFC87288DC8 mov rsi,rax
00007FFC87288DCB lea rcx,[rsi+8]
00007FFC87288DCF mov rdx,rbp
00007FFC87288DD2 call 00007FFCE6D59F10
53: return new Container<IntClass>(intClassL.Value + r.Value) as Container<T>;
00007FFC87288DD7 mov rcx,rdi
00007FFC87288DDA mov rdx,rsi
00007FFC87288DDD call 00007FFCE6D59C70
00007FFC87288DE2 jmp 00007FFC87288E4B
00007FFC87288E4B nop
00007FFC87288E4C vmovaps xmm6,xmmword ptr [rsp+60h]
00007FFC87288E53 add rsp,78h
00007FFC87288E57 pop rbx
00007FFC87288E58 pop rbp
00007FFC87288E59 pop rsi
00007FFC87288E5A pop rdi
00007FFC87288E5B pop r12
00007FFC87288E5D pop r13
00007FFC87288E5F pop r14
00007FFC87288E61 pop r15
00007FFC87288E63 ret
AddByValueTypeSwitch_Class
68: switch(lhs.Value)
00007FFC87269080 push r15
00007FFC87269082 push r14
00007FFC87269084 push r12
00007FFC87269086 push rdi
00007FFC87269087 push rsi
00007FFC87269088 push rbp
00007FFC87269089 push rbx
00007FFC8726908A sub rsp,90h
00007FFC87269091 vzeroupper
00007FFC87269094 vmovaps xmmword ptr [rsp+80h],xmm6
00007FFC8726909E vmovaps xmmword ptr [rsp+70h],xmm7
00007FFC872690A5 mov rsi,rcx
00007FFC872690A8 lea rdi,[rsp+50h]
00007FFC872690AD mov ecx,6
00007FFC872690B2 xor eax,eax
00007FFC872690B4 rep stos dword ptr [rdi]
00007FFC872690B6 mov rcx,rsi
00007FFC872690B9 mov qword ptr [rsp+68h],rcx
00007FFC872690BE mov rdi,rcx
00007FFC872690C1 mov rsi,r8
00007FFC872690C4 mov rbx,qword ptr [rdx+8]
00007FFC872690C8 test rbx,rbx
00007FFC872690CB je 00007FFC87269557
00007FFC872690D1 mov rbp,rbx
00007FFC872690D4 mov rdx,rbp
00007FFC872690D7 mov rcx,7FFCE69B6930h
00007FFC872690E1 cmp qword ptr [rbp],rcx
00007FFC872690E5 je 00007FFC872690E9
00007FFC872690E7 xor edx,edx
00007FFC872690E9 test rdx,rdx
00007FFC872690EC je 00007FFC87269119
00007FFC87269119 mov rbp,rbx
00007FFC8726911C mov rdx,rbp
00007FFC8726911F mov rcx,7FFCE69B6768h
00007FFC87269129 cmp qword ptr [rbp],rcx
00007FFC8726912D je 00007FFC87269131
00007FFC8726912F xor edx,edx
00007FFC87269131 test rdx,rdx
00007FFC87269134 je 00007FFC87269163
00007FFC87269163 mov rbp,rbx
00007FFC87269166 mov rdx,rbp
00007FFC87269169 mov rcx,7FFC8734A6B8h
00007FFC87269173 cmp qword ptr [rdx],rcx
00007FFC87269176 je 00007FFC8726917A
00007FFC87269178 xor edx,edx
00007FFC8726917A test rdx,rdx
00007FFC8726917D je 00007FFC872691AA
00007FFC872691AA mov rbp,rbx
00007FFC872691AD mov rdx,rbp
00007FFC872691B0 mov rcx,7FFC8734A820h
00007FFC872691BA cmp qword ptr [rbp],rcx
00007FFC872691BE je 00007FFC872691C2
00007FFC872691C0 xor edx,edx
00007FFC872691C2 test rdx,rdx
00007FFC872691C5 je 00007FFC872691F7
00007FFC872691F7 mov r12,rbx
00007FFC872691FA mov rdx,7FFC8734A988h
68: switch(lhs.Value)
00007FFC87269204 cmp qword ptr [r12],rdx
00007FFC87269208 je 00007FFC8726920D
00007FFC8726920D test r12,r12
00007FFC87269210 jne 00007FFC87269456
00007FFC87269456 mov rcx,qword ptr [rsi+8]
00007FFC8726945A test rcx,rcx
00007FFC8726945D je 00007FFC87269470
00007FFC8726945F mov rax,7FFC8734A988h
00007FFC87269469 cmp qword ptr [rcx],rax
00007FFC8726946C je 00007FFC87269470
00007FFC8726946E xor ecx,ecx
00007FFC87269470 mov rbx,rcx
00007FFC87269473 test rbx,rbx
00007FFC87269476 je 00007FFC87269557
97: return new Container<IntClass>(intClassL + r) as Container<T>;
00007FFC8726947C mov rcx,7FFC8734A988h
00007FFC87269486 call 00007FFCE6D5B3B0
00007FFC8726948B mov rsi,rax
00007FFC8726948E mov ecx,dword ptr [r12+8]
00007FFC87269493 add ecx,dword ptr [rbx+8]
00007FFC87269496 mov dword ptr [rsi+8],ecx
00007FFC87269499 mov rcx,7FFC8734B7C8h
00007FFC872694A3 call 00007FFCE6D5B3B0
00007FFC872694A8 mov rbx,rax
00007FFC872694AB lea rcx,[rbx+8]
00007FFC872694AF mov rdx,rsi
00007FFC872694B2 call 00007FFCE6D59F10
00007FFC872694B7 mov rcx,rdi
00007FFC872694BA mov rdx,rbx
00007FFC872694BD call 00007FFCE6D59C70
00007FFC872694C2 jmp 00007FFC87269533
00007FFC87269533 nop
00007FFC87269534 vmovaps xmm6,xmmword ptr [rsp+80h]
00007FFC8726953E vmovaps xmm7,xmmword ptr [rsp+70h]
00007FFC87269545 add rsp,90h
00007FFC8726954C pop rbx
00007FFC8726954D pop rbp
00007FFC8726954E pop rsi
00007FFC8726954F pop rdi
00007FFC87269550 pop r12
00007FFC87269552 pop r14
00007FFC87269554 pop r15
00007FFC87269556 ret
AddByTypeof_Class
114: if(typeof(T) == typeof(int))
00007FFC87269780 push rdi
00007FFC87269781 push rsi
00007FFC87269782 push rbp
00007FFC87269783 push rbx
00007FFC87269784 sub rsp,48h
00007FFC87269788 vzeroupper
00007FFC8726978B mov qword ptr [rsp+40h],rcx
00007FFC87269790 mov rsi,rcx
00007FFC87269793 mov rdi,r8
00007FFC87269796 mov rcx,qword ptr [rsi+30h]
00007FFC8726979A mov rcx,qword ptr [rcx]
00007FFC8726979D mov rbx,qword ptr [rcx]
00007FFC872697A0 mov rcx,rbx
00007FFC872697A3 mov ebp,ecx
00007FFC872697A5 and ebp,1
140: }
141:
142: if(typeof(T) == typeof(IntClass))
00007FFC872697A8 mov rcx,rbx
00007FFC872697AB test ebp,ebp
00007FFC872697AD je 00007FFC872697B3
00007FFC872697B3 mov rax,7FFC8734A988h
00007FFC872697BD cmp rcx,rax
00007FFC872697C0 jne 00007FFC8726983C
143: {
144: var l = lhs as Container<IntClass>;
00007FFC872697C2 mov rcx,7FFC8734B7C8h
00007FFC872697CC call 00007FFCE6D59C70
00007FFC872697D1 mov rbx,rax
00007FFC872697D4 mov rdx,rdi
00007FFC872697D7 mov rcx,7FFC8734B7C8h
00007FFC872697E1 call 00007FFCE6D59C70
00007FFC872697E6 mov rbp,qword ptr [rbx+8]
00007FFC872697EA mov rdi,qword ptr [rax+8]
00007FFC872697EE mov rcx,7FFC8734A988h
00007FFC872697F8 call 00007FFCE6D5B3B0
00007FFC872697FD mov rbx,rax
00007FFC87269800 mov ecx,dword ptr [rbp+8]
00007FFC87269803 add ecx,dword ptr [rdi+8]
00007FFC87269806 mov dword ptr [rbx+8],ecx
00007FFC87269809 mov rcx,7FFC8734B7C8h
00007FFC87269813 call 00007FFCE6D5B3B0
00007FFC87269818 mov rdi,rax
00007FFC8726981B lea rcx,[rdi+8]
00007FFC8726981F mov rdx,rbx
00007FFC87269822 call 00007FFCE6D59F10
146: return new Container<IntClass>(l.Value + r.Value) as Container<T>;
00007FFC87269827 mov rcx,rsi
00007FFC8726982A mov rdx,rdi
00007FFC8726982D call 00007FFCE6D59C70
00007FFC87269832 nop
00007FFC87269833 add rsp,48h
00007FFC87269837 pop rbx
00007FFC87269838 pop rbp
00007FFC87269839 pop rsi
00007FFC8726983A pop rdi
00007FFC8726983B ret
Method | Instruction Count | Mean |
---|---|---|
AddByOverload_Class | 24 | 167.72 us |
AddByStaticStrategy_Class | 69 | 388.40 us |
AddByContainerTypeSwitch_Class | 80 | 416.41 us |
AddByValueTypeSwitch_Class | 99 | 415.41 us |
AddByTypeof_Class | 51 | 256.51 us |
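Unlike with primitive types and structs, the overall impression is that the conditional branches are not being eliminated. That is consistent with how the runtime compiles generics: instantiations over reference types share a single compiled body, so T is not a JIT-time constant there, whereas each value-type instantiation gets a body of its own.
Even so, AddByTypeof has a comparatively low instruction count and good measured times.
StaticStrategy does not look that bad either, considering how tidy it is as a design pattern.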
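Conclusion
Making full use of typeof looks like the fastest option
On .NET Core, when T is a value type, it performs no differently from dispatching through ordinary non-generic overloads.
On Unity, however, perhaps because the JIT optimizes less aggressively, writing a typeof specialization does not seem to bring a large performance gain.
Also, optimization quality ranks as primitive types > structs >>> classes, and the wall between value types and reference types in particular seems to be very high. It has long been said that you should use classes unless you have a very good reason not to, but with this much benefit from optimization I feel the urge to use structs. For immutable data classes, creating a wrapper struct is another option.
Honestly, even though IL has a dedicated instruction for it [3], I did not expect typeof comparisons to be this fast.
With optimization of this quality at work, why not use it more actively?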
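The wrapper-struct idea mentioned above could look something like the following. This is only a sketch: IntClassWrapper and its Inner field are names I made up for illustration, and I have not benchmarked whether it actually recovers the value-type numbers, since the inner IntClass is still allocated on every addition.

```csharp
using System;

public sealed class IntClass
{
    public readonly int Value;
    public IntClass(int value) => Value = value;
    public static IntClass operator +(IntClass lhs, IntClass rhs)
        => new IntClass(lhs.Value + rhs.Value);
}

// Hypothetical wrapper: a struct whose only field is a reference to the immutable class.
public readonly struct IntClassWrapper
{
    public readonly IntClass Inner;
    public IntClassWrapper(IntClass inner) => Inner = inner;
    public static IntClassWrapper operator +(IntClassWrapper lhs, IntClassWrapper rhs)
        => new IntClassWrapper(lhs.Inner + rhs.Inner);
}

public class Container<T>
{
    public T Value { get; }
    public Container(T value) => Value = value;

    public static Container<T> AddByTypeof(Container<T> lhs, Container<T> rhs)
    {
        // T is now a struct, so the comparison is made against a value type.
        if (typeof(T) == typeof(IntClassWrapper))
        {
            var l = lhs as Container<IntClassWrapper>;
            var r = rhs as Container<IntClassWrapper>;
            return new Container<IntClassWrapper>(l.Value + r.Value) as Container<T>;
        }

        // Assumed fallback for types without a specialization.
        throw new NotSupportedException($"Add is not specialized for {typeof(T)}.");
    }
}
```

The point of the wrapper is only to make T a value type so that the instantiation gets its own compiled body; the data itself still lives in the class.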
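Untested topics
- What happens with large-scale processing where inlining is unlikely to kick in, as opposed to small code like addition that fits in a single IL instruction?
- if statements and pattern-matching switch statements are inherently order-dependent. Does that have any effect on specialization?
  - → Judging from the JIT output, the optimizer removes the conditional branches entirely (at least for value types), so perhaps there is no effect? A benchmark would be needed to be sure.
- Once specialization has picked a branch, T should be known, so is there a more efficient way to cast?
  - System.Runtime.CompilerServices.Unsafe.As<TFrom, TTo>, perhaps?
- JIT inspection on .NET Framework, Unity, and other versions of .NET Core
  - Is there even a way to view JIT output in Unity?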
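For the cast question in the list above, an untested sketch of what using Unsafe.As<TFrom, TTo> might look like is shown below. The method name AddByTypeofUnsafe is hypothetical, and on .NET Core 2.2 the Unsafe class comes from the System.Runtime.CompilerServices.Unsafe package. Whether this is actually faster than the as casts has not been measured.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Container<T>
{
    public T Value { get; }
    public Container(T value) => Value = value;

    // Hypothetical variant of the typeof approach that reinterprets the values
    // instead of casting the containers.
    public static Container<T> AddByTypeofUnsafe(Container<T> lhs, Container<T> rhs)
    {
        if (typeof(T) == typeof(int))
        {
            // Copy to locals because Unsafe.As<TFrom, TTo> takes its argument by ref.
            T l = lhs.Value;
            T r = rhs.Value;

            // Reinterpreting T as int performs no runtime check; it is only
            // sound here because the typeof test has established that T is int.
            int sum = Unsafe.As<T, int>(ref l) + Unsafe.As<T, int>(ref r);
            return new Container<T>(Unsafe.As<int, T>(ref sum));
        }

        // Assumed fallback for types without a specialization.
        throw new NotSupportedException($"Add is not specialized for {typeof(T)}.");
    }
}
```

Because the values are reinterpreted rather than the containers, the result can be built directly as a Container<T> and no cast back is needed. The price is that nothing stops a mismatched typeof check from silently reading garbage.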
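Edit history
- 2019/03/06
  - Corrected misinformation about the definition of reflection
  - Corrected the misinformation that typeof becomes a dedicated instruction
  - Added the Unity version
  - Fixed a markdown bug
  - Fixed typos
- 2019/03/13
  - Fixed typos
- 2020/10/19
  - Minor adjustments to wording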