
Comparing torch.dot, torch.mm, torch.mv, torch.bmm, and torch.matmul

Last updated at 2020-08-17, posted at 2019-05-05

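What this article does

Checks the inputs and outputs of PyTorch's matrix-product functions.
Compares torch.dot, torch.mm, torch.mv, torch.bmm, and torch.matmul.
Note: the out argument, which stores the return value, is ignored here.
Summary: dot, mm, mv, and bmm each handle one specific combination of dimensions, while matmul handles many combinations.
Update (2020-08-17): links fixed after the documentation was updated.
Update (2020-08-17): revised the note about torch.bmm being slow.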


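Terminology

In this article:

a 0-dimensional value is a scalar
a 1-dimensional value is a vector
a 2-dimensional value is a matrix
a value with 3 or more dimensions is a tensor

How shape reports each case as a torch.Size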

>>> import numpy as np
>>> import torch

>>> # scalar
>>> a = np.array(1)
>>> torch.tensor(a).shape
torch.Size([])

>>> # vector
>>> a = np.array([1,1])
>>> torch.tensor(a).shape
torch.Size([2])

>>> # matrix
>>> a = np.array([[1,1],[1,1]])
>>> torch.tensor(a).shape
torch.Size([2, 2])

>>> # tensor
>>> a = np.array([[[1,1],[1,1],[1,1]]])
>>> torch.tensor(a).shape
torch.Size([1, 3, 2])

torch.dot

What it does

Computes the dot product of two 1-D vectors (see the documentation).

torch.dot(tensor1, tensor2) → Tensor

Variables

Input

>>> tensor1.shape
torch.Size([n])
>>> tensor2.shape
torch.Size([n])

Output

>>> out.shape
torch.Size([])

Example

>>> torch.dot(torch.tensor([2, 3]), torch.tensor([2, 1]))
tensor(7)

Key points

Does not broadcast. It only accepts 1-D × 1-D.
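To make the 1-D-only restriction concrete, here is a minimal sketch that passes a 2-D tensor to torch.dot and catches the error. The helper name `dot_accepts` is hypothetical and exists only for this illustration.

```python
import torch


def dot_accepts(a: torch.Tensor, b: torch.Tensor) -> bool:
    """Return True if torch.dot accepts the pair, False if it raises."""
    try:
        torch.dot(a, b)
        return True
    except RuntimeError:
        return False


v = torch.tensor([2.0, 3.0])
w = torch.tensor([2.0, 1.0])
m = torch.ones(2, 2)

ok_1d = dot_accepts(v, w)  # both arguments are 1-D with the same length
ok_2d = dot_accepts(m, v)  # first argument is 2-D
result = torch.dot(v, w)   # 2*2 + 3*1 = 7

print(ok_1d, ok_2d, result.item(), result.shape)
```

For a 2-D × 1-D product, torch.mv or torch.matmul is the appropriate call.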

torch.mm

What it does

Computes the product of two 2-D matrices (see the documentation).

torch.mm(mat1, mat2, out=None) → Tensor

Variables

Input

>>> mat1.shape
torch.Size([n, m])
>>> mat2.shape
torch.Size([m, p])

Output

>>> out.shape
torch.Size([n, p])

Example

>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
tensor([[ 0.4851,  0.5037, -0.3633],
        [-0.0760, -3.6705,  2.4784]])

Key points

Does not broadcast. It only accepts 2-D × 2-D.
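As a small illustration of the shape rule (n, m) × (m, p) → (n, p) and of the lack of broadcasting, the sketch below uses fixed values instead of random ones so the result can be checked by hand. The variable names are chosen for this example only.

```python
import torch

mat1 = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # (n=2, m=3): [[0,1,2],[3,4,5]]
mat2 = torch.ones(3, 4)                                     # (m=3, p=4)

out = torch.mm(mat1, mat2)  # each entry is a row sum of mat1
print(out.shape)

# A 3-D first argument is rejected rather than broadcast.
batched = torch.ones(5, 2, 3)
try:
    torch.mm(batched, mat2)
    mm_rejects_3d = False
except RuntimeError:
    mm_rejects_3d = True
print(mm_rejects_3d)
```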

torch.mv

What it does

Computes the product of a 2-D matrix and a 1-D vector (see the documentation).

torch.mv(mat, vec, out=None) → Tensor

Variables

Input

>>> mat.shape
torch.Size([n, m])
>>> vec.shape
torch.Size([m])

Output

>>> out.shape
torch.Size([n])

Example

>>> mat = torch.randn(2, 3)
>>> vec = torch.randn(3)
>>> torch.mv(mat, vec)
tensor([ 1.0404, -0.6361])

Key points

Does not broadcast. It only accepts 2-D × 1-D.
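One way to read torch.mv is as torch.mm with the vector treated as a single column. The following sketch, with values picked for easy hand-checking, compares the two.

```python
import torch

mat = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # (n=2, m=3): [[0,1,2],[3,4,5]]
vec = torch.tensor([1.0, 0.0, -1.0])                      # (m=3,)

out_mv = torch.mv(mat, vec)  # shape (n,)

# Same computation with torch.mm: view vec as an (m, 1) column,
# multiply, then drop the trailing axis of size 1.
out_mm = torch.mm(mat, vec.unsqueeze(1)).squeeze(1)

print(out_mv, out_mm)
```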

torch.bmm

What it does

Computes a 2-D × 2-D matrix product for each batch element, so it operates on 3-D × 3-D inputs (see the documentation).

torch.bmm(batch1, batch2, out=None) → Tensor

Variables

Input

>>> batch1.shape
torch.Size([batch, n, m])
>>> batch2.shape
torch.Size([batch, m, p])

Output

>>> out.shape
torch.Size([batch, n, p])

Example

>>> batch1 = torch.randn(10, 3, 4)
>>> batch2 = torch.randn(10, 4, 5)
>>> res = torch.bmm(batch1, batch2)
>>> res.size()
torch.Size([10, 3, 5])

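Key points

Does not broadcast. It only accepts 3-D × 3-D.
It was reported to be slow. ⇒ "[PyTorch] A faster way than torch.bmm to compute per-batch inner products"
It has since become reasonably fast. ⇒ "On the speed improvement of torch.bmm"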
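The "one torch.mm per batch element" reading can be checked directly. The sketch below compares torch.bmm with an explicit loop and also confirms that a batch size of 1 is not broadcast. It says nothing about speed.

```python
import torch

torch.manual_seed(0)
batch1 = torch.randn(10, 3, 4)
batch2 = torch.randn(10, 4, 5)

out_bmm = torch.bmm(batch1, batch2)

# Reference: multiply the matrices pairwise and stack the results.
out_loop = torch.stack([torch.mm(a, b) for a, b in zip(batch1, batch2)])
same = torch.allclose(out_bmm, out_loop, atol=1e-5)

# A batch dimension of size 1 is not expanded to 10.
try:
    torch.bmm(batch1, torch.randn(1, 4, 5))
    bmm_broadcasts = True
except RuntimeError:
    bmm_broadcasts = False

print(out_bmm.shape, same, bmm_broadcasts)
```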

torch.matmul

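What it does

Computes products in general, across many combinations of dimensions (see the documentation).

torch.matmul(tensor1, tensor2, out=None) → Tensor

Variables

Input

Not limited to 3-D.
The accepted shapes are shown in the examples.

Output

Not limited to 3-D.
The resulting shapes are shown in the examples.

Example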

>>> # 1-D × 1-D -> 0-D
>>> # If both tensors are 1-dimensional,
>>> # the dot product (scalar) is returned.
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])

>>> # 2-D × 1-D -> 1-D
>>> # If the first argument is 2-dimensional and
>>> # the second argument is 1-dimensional, the
>>> # matrix-vector product is returned.
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])

>>> # 3-D × 1-D -> 2-D
>>> # See the fifth bullet in the documentation.
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])

>>> # 3-D × 3-D -> 3-D
>>> # Same as bmm.
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])

>>> # 3-D × 2-D -> 3-D
>>> # tensor2 is treated as (1, 4, 5) and broadcast to (10, 4, 5); the result matches bmm.
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])

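>>> # 4-D × 3-D -> 4-D
>>> # tensor2 is treated as (1, 5, 4, 2). The leading (batch) dimensions
>>> # (10, 1) and (1, 5) broadcast to (10, 5), and the last two dimensions
>>> # are multiplied as matrices: (3, 4) × (4, 2) -> (3, 2).
>>> # See the fifth bullet in the documentation.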
>>> tensor1 = torch.randn(10, 1, 3, 4)
>>> tensor2 = torch.randn(5, 4, 2)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 5, 3, 2])

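>>> # 5-D × 5-D -> 5-D
>>> # The 1st and 3rd dimensions must match (or be 1).
>>> # The 4th and 5th dimensions form the 2-D × 2-D matrix product.
>>> # The 2nd dimension (1 vs 9) is broadcast.
>>> ##### Honestly, I'm not sure what this would mean in practice (lol)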
>>> tensor1 = torch.randn(10, 1, 2, 3, 4)
>>> tensor2 = torch.randn(10, 9, 2, 4, 2)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 9, 2, 3, 2])
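The claim that matmul covers what the specialized functions do can be checked numerically. This is a minimal sketch; the dictionary `checks` and the variable names are just for this illustration.

```python
import torch

torch.manual_seed(0)
v = torch.randn(4)           # vector
m = torch.randn(3, 4)        # matrix
m2 = torch.randn(4, 5)       # matrix
b1 = torch.randn(10, 3, 4)   # batch of matrices
b2 = torch.randn(10, 4, 5)   # batch of matrices

# For each specialized function, compare against matmul on the same inputs.
checks = {
    "dot": torch.allclose(torch.matmul(v, v), torch.dot(v, v), atol=1e-5),
    "mv": torch.allclose(torch.matmul(m, v), torch.mv(m, v), atol=1e-5),
    "mm": torch.allclose(torch.matmul(m, m2), torch.mm(m, m2), atol=1e-5),
    "bmm": torch.allclose(torch.matmul(b1, b2), torch.bmm(b1, b2), atol=1e-5),
}
print(checks)
```

The difference is that matmul also accepts mixed dimensions and broadcasts batch dimensions, while the specialized functions raise an error for any other shape.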

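Cases

matmul has so many cases that they are hard to keep straight, so the documentation text is copied below, each passage followed by a paraphrase. Corrections to the paraphrases are welcome.

Case 1 (1-D × 1-D)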

If both tensors are 1-dimensional, the dot product (scalar) is returned.

When both tensors are 1-D, the result is their dot product, a scalar.

Case 2 (2-D × 2-D)

If both arguments are 2-dimensional, the matrix-matrix product is returned.

When both arguments are 2-D, the result is the matrix-matrix product.

Case 3 (1-D × 2-D)

If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.

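When the first argument is 1-D and the second is 2-D, a dimension of size 1 is added in front of the first argument's shape, so (n) becomes (1, n), and the matrix product is computed. The added dimension is removed from the result afterwards. (Loose paraphrase.)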
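The prepend-then-remove step for the 1-D × 2-D case can be reproduced by hand with unsqueeze and squeeze. This is a sketch of the behavior described in the quoted text, not of PyTorch's internal implementation.

```python
import torch

vec = torch.tensor([1.0, 2.0, 3.0])                        # shape (3,)
mat = torch.arange(6, dtype=torch.float32).reshape(3, 2)  # [[0,1],[2,3],[4,5]]

out = torch.matmul(vec, mat)  # shape (2,)

# Manual version: (3,) -> (1, 3), multiply by (3, 2) to get (1, 2), then drop the leading 1.
manual = torch.mm(vec.unsqueeze(0), mat).squeeze(0)

print(out, manual)
```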

Case 4 (2-D × 1-D)

If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.

When the first argument is 2-D and the second is 1-D, the result is the matrix-vector product.

Case 5 (3-D or higher)

If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned.

When both arguments are at least 1-D and at least one of them is 3-D or higher, the result is a batched matrix product.

If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after.

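When the first argument is 1-D and the second is 3-D or higher, a dimension of size 1 is added in front of the first argument's shape, the batched matrix product is computed, and the added dimension is removed afterwards.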

If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiple and removed after.

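When the first argument is 3-D or higher and the second is 1-D, a dimension of size 1 is added after the second argument's shape, so (m) becomes (m, 1), the batched matrix product is computed, and the added dimension is removed afterwards.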
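Both 1-D sub-cases can be reproduced by adding and removing the size-1 dimension explicitly. The sketch below assumes nothing beyond the quoted documentation text; the names are for illustration only.

```python
import torch

torch.manual_seed(0)
t3 = torch.randn(10, 3, 4)  # batch of 10 matrices, each (3, 4)
v = torch.randn(4)          # used as the second argument
w = torch.randn(3)          # used as the first argument

# Second argument 1-D: (4,) -> (4, 1), batched product gives (10, 3, 1), drop the last axis.
out_right = torch.matmul(t3, v)
manual_right = torch.matmul(t3, v.unsqueeze(-1)).squeeze(-1)

# First argument 1-D: (3,) -> (1, 3), batched product gives (10, 1, 4), drop that axis.
out_left = torch.matmul(w, t3)
manual_left = torch.matmul(w.unsqueeze(0), t3).squeeze(-2)

print(out_right.shape, out_left.shape)
```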

The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if tensor1 is a (j×1×n×m) tensor and tensor2 is a (k×m×p) tensor, out will be an (j×k×n×p) tensor.

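Dimensions other than the last two (that is, the batch dimensions) are broadcast, so they must be broadcastable against each other. For example, if tensor1 is a (j×1×n×m) tensor and tensor2 is a (k×m×p) tensor, the result is a (j×k×n×p) tensor.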
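The broadcasting rule can be turned into a small shape predictor. The helper `matmul_shape` below is hypothetical (it is not part of PyTorch) and only handles inputs with at least 2 dimensions. It relies on torch.broadcast_shapes, which is assumed to be available in the installed PyTorch version.

```python
import torch


def matmul_shape(shape1, shape2):
    """Predict the output shape of torch.matmul for two inputs with >= 2 dims.

    Hypothetical helper for illustration: the last two dims follow the
    matrix-product rule, and all leading (batch) dims are broadcast.
    """
    *batch1, n, m1 = shape1
    *batch2, m2, p = shape2
    if m1 != m2:
        raise ValueError(f"inner dimensions differ: {m1} vs {m2}")
    batch = torch.broadcast_shapes(tuple(batch1), tuple(batch2))
    return torch.Size([*batch, n, p])


# The two higher-dimensional examples from this article.
cases = [
    ((10, 1, 3, 4), (5, 4, 2)),
    ((10, 1, 2, 3, 4), (10, 9, 2, 4, 2)),
]
predicted = [matmul_shape(s1, s2) for s1, s2 in cases]
actual = [torch.matmul(torch.randn(*s1), torch.randn(*s2)).shape for s1, s2 in cases]
print(predicted, actual)
```

Read this way, the 5-D example is a (10, 9, 2) grid of independent (3, 4) × (4, 2) products, with tensor1 reused along the second axis.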
