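Casting in Theano

Theano is a very strictly typed library.

It does not handle types as flexibly as Python's dynamic typing does, so you will naturally end up relying on explicit casts.

Trying a cast

Casting in Theano is provided by theano.tensor.cast().

Let's try it right away.
We take a matrix, which is defined as float64 by default, and convert it to int32.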

import theano.tensor as T
x = T.matrix()
x_as_int = T.cast(x, 'int32')

type(x), type(x_as_int)

Output

(theano.tensor.var.TensorVariable, theano.tensor.var.TensorVariable)

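Was it not cast?

In the result above, both variables show the same type (TensorVariable).

This is how Theano is designed: a variable holds a symbol, not a value. type() therefore reports the Python class of the symbol, not the dtype of the data it stands for.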
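To make the distinction concrete, here is a minimal toy sketch in plain Python. It is not Theano's implementation; the class `SymbolicVariable` and the function `cast` are hypothetical names invented for this illustration. It only shows why two symbols can share one Python class while recording different dtypes.

```python
# Toy stand-in for a symbolic variable; NOT Theano's implementation.
class SymbolicVariable:
    def __init__(self, dtype, op=None, parent=None):
        self.dtype = dtype    # element type the symbol stands for
        self.op = op          # operation that produced this symbol, if any
        self.parent = parent  # input symbol of that operation, if any


def cast(variable, dtype):
    """Return a new symbol that represents `variable` converted to `dtype`."""
    return SymbolicVariable(dtype, op="Cast{%s}" % dtype, parent=variable)


x = SymbolicVariable("float64")
x_as_int = cast(x, "int32")

# The Python class is the same for both symbols...
print(type(x) is type(x_as_int))   # True
# ...while the dtype recorded on each symbol differs.
print(x.dtype, x_as_int.dtype)     # float64 int32
```

In this sketch the cast does not touch any data. It only creates a new node that remembers its input, which is roughly the shape of the graph that a debug printout of a cast variable reveals.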

To see the dtype that a symbol carries, use theano.printing.debugprint().

Type of x

import theano
theano.printing.debugprint(x)

Output

<TensorType(float64, matrix)> [id A]

Type of x_as_int

theano.printing.debugprint(x_as_int)

Output

Elemwise{Cast{int32}} [id A] ''   
 |<TensorType(float64, matrix)> [id B]

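So x was originally float64, and x_as_int does appear to be cast to int32.

Let's feed in actual values and check the behavior.

Checking types through a function

To give a symbol a value, you need to define a function.

You define an expression built from symbols as a function, and when you pass values to that function, the result of evaluating the expression comes back as its return value.

At this point, the dtype of the value you pass in must be compatible with the dtype of the symbol declared as the input.

Passing an unsuitable type raises an error, so if the int32 symbol x_as_int is set as the input, passing floating-point values should raise an error.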
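The expectation above can be previewed with NumPy alone. The sketch below uses NumPy's "safe" casting rule as an analogy for the check applied to function inputs. This is an assumption made for illustration, not Theano's own code, and `fits_without_loss` is a hypothetical helper name.

```python
import numpy as np

# Assumption: NumPy's "safe" casting rule stands in for the input check;
# Theano's actual logic lives in its own type-filtering code.
def fits_without_loss(value_dtype, symbol_dtype):
    """Hypothetical helper: can value_dtype be stored as symbol_dtype safely?"""
    return np.can_cast(np.dtype(value_dtype), np.dtype(symbol_dtype), casting="safe")


print(fits_without_loss("int32", "float64"))    # True
print(fits_without_loss("float64", "int32"))    # False
print(fits_without_loss("float64", "float64"))  # True
```

Under this rule, int32 data fits a float64 input, while float64 data does not fit an int32 input.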

Defining the functions

import numpy as np

mat = np.array([[1.0, 0.0], [0.0, 1.0]], dtype="float64")
mat_int = np.array([[1, 0], [0, 1]], dtype="int32")

y = x * 2
f = theano.function(inputs=[x], outputs=y)

y_as_int = x_as_int * 2
f_as_int = theano.function(inputs=[x_as_int], outputs=y_as_int)

f(x)

f(mat)

Result

array([[ 2.,  0.],
       [ 0.,  2.]])

f(x_as_int)

f(mat_int)

Result

array([[ 2.,  0.],
       [ 0.,  2.]])

f_as_int(x)

f_as_int(mat)

Result


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-56-31692f0163e9> in <module>()
----> 1 f_as_int(mat)

/home/ubuntu/anaconda3/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    786                         s.storage[0] = s.type.filter(
    787                             arg, strict=s.strict,
--> 788                             allow_downcast=s.allow_downcast)
    789 
    790                     except Exception as e:

/home/ubuntu/anaconda3/lib/python3.5/site-packages/theano/tensor/type.py in filter(self, data, strict, allow_downcast)
    138                             '"function".'
    139                             % (self, data.dtype, self.dtype))
--> 140                         raise TypeError(err_msg, data)
    141                 elif (allow_downcast is None and
    142                         type(data) is float and

TypeError: ('Bad input argument to theano function with name "<ipython-input-54-50af382d0dd4>:2" at index 0 (0-based)', 'TensorType(int32, matrix) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to int32, or 2) set "allow_input_downcast=True" when calling "function".', array([[ 1.,  0.],
       [ 0.,  1.]]))

f_as_int(x_as_int)

f_as_int(mat_int)

Result

array([[2, 0],
       [0, 2]], dtype=int32)

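As expected, f_as_int(mat) raised an error, because the input symbol is int32 while the value passed in is float64. Conversely, f(mat_int) succeeded, because int32 values can be stored as float64 without loss, and the result came back as float64.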
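The error message itself names two ways out: cast the data to int32 explicitly, or set allow_input_downcast=True when building the function. As a minimal sketch of the first option, the NumPy-only snippet below casts the data and refuses if any value would change. `to_int32_checked` is a hypothetical helper written for this illustration, not part of Theano or NumPy.

```python
import numpy as np


def to_int32_checked(values):
    """Hypothetical helper: cast to int32, refusing if any value would change."""
    converted = values.astype("int32")
    if not np.array_equal(converted, values):
        raise ValueError("casting to int32 would change the values")
    return converted


mat = np.array([[1.0, 0.0], [0.0, 1.0]], dtype="float64")
mat_as_int = to_int32_checked(mat)
print(mat_as_int.dtype)  # int32
```

An array converted this way has the dtype that the int32 input expects, so it should be accepted where the float64 array was rejected. The check is a design choice: a bare astype would silently drop fractional parts, which is exactly the loss the error message warns about.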

That gives us a clear picture of how casting behaves in Theano.
