1. Symptom
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
4 fc2 = nn.Linear(10, 2)
5
----> 6 h = bn1(coordinates)
7 h = fc1(h)
8 h = F.relu(h)
2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
2448 _verify_batch_size(input.size())
2449
-> 2450 return torch.batch_norm(
2451 input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
2452 )
RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16
2. Environment
Item | Version |
---|---|
OS | Google Colaboratory |
python | 3.10.12 |
torch | 2.0.1+cu118 |
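3. Conclusion
Applying .astype('f') to the pandas DataFrame that feeds torch resolved the error.
Although the message mentions BFloat16, the batch printed in section 4 has dtype torch.float64, whereas the parameters of nn.BatchNorm1d default to float32. The float64 input therefore appears to be the actual mismatch, and .astype('f') converts the values to float32.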
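To make the effect of .astype('f') concrete, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the real data, and `before` / `after` are names chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one integer column and one float column, like the real data
df = pd.DataFrame({"orig_x": [5, 25, 33], "orig_y": [80.0, 30.0, 0.0]})

# Without a cast, mixing int64 and float64 columns yields a float64 array
before = df[["orig_x", "orig_y"]].values

# 'f' is NumPy's type code for single precision (float32)
after = df[["orig_x", "orig_y"]].astype("f").values

print(before.dtype)
print(after.dtype)
```

Writing .astype("float32") or .astype(np.float32) should be equivalent and is arguably easier to read than the one-letter code.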
4. Code that raised the error, and the fix
The code that raised the error is shown below.
Written this way, it reproduces both the error and its resolution.
Imports and package setup
colaboratory
from typing import List, Tuple
import numpy as np
import pandas as pd
import torch
torch.cuda.empty_cache()
import torch.nn as nn
import torch.nn.functional as F
Preparing the data
colaboratory
orig_x = [5, 25, 33, 20, 50, 50, 75, 90, 98, 110, 120, 135, 145, 152, 175, 185, 200, 100, 80]
orig_y = [80, 30, 0, 50, 10, 40, 20, 5, 40, 40, 20, 5, 40, 30, 12.5, 70, 100, 40, 35]
real_x = [5, 3, 0, 10, 20, 135, 70, 80, 98, 110, 125, 150, 146, 165, 200, 190, 200, 100, 80]
real_y = [40, 16, 0, 40, 12, 40, 25, 15, 48, 45, 23, 5, 40, 28, 5, 40, 50, 50, 45]
df_point_map: pd.DataFrame = pd.DataFrame(
{
"orig_x": orig_x,
"orig_y": orig_y,
"real_x": real_x,
"real_y": real_y,
}
)
df_point_map.head(3)
# The data looks like this
orig_x orig_y real_x real_y
0 5 80.0 5 40
1 25 30.0 3 16
2 33 0.0 0 0
Creating the Dataset for PyTorch (this was the cause)
The cause of the error was here (`・ω・´)
# Build the Dataset
class PointMappingDataset(torch.utils.data.Dataset):
    def __init__(self, df_point_map: pd.DataFrame):
        assert np.all(
            df_point_map.columns == ["orig_x", "orig_y", "real_x", "real_y"]
        ), "Columns are missing"
        # NOTE: ***************** 🚧 Before the fix 🚧 *****************
        self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].values
        self.x_real_coordinates = df_point_map["real_x"].values
        self.y_real_coordinates = df_point_map["real_y"].values
        # NOTE: ***************** 🌟 After the fix 🌟 *****************
        # Applying .astype('f') as below resolved the error
        # self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].astype('f').values
        # self.x_real_coordinates = df_point_map["real_x"].astype('f').values
        # self.y_real_coordinates = df_point_map["real_y"].astype('f').values

    def __getitem__(self, index: int) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        orig_coordinate = self.orig_coordinates[index]
        x_real = self.x_real_coordinates[index]
        y_real = self.y_real_coordinates[index]
        return orig_coordinate, x_real, y_real

    def __len__(self):
        return len(self.x_real_coordinates)

mapping_dataset: PointMappingDataset = PointMappingDataset(df_point_map)
# Build the DataLoader
train_loader = torch.utils.data.DataLoader(
    mapping_dataset, batch_size=2, shuffle=True, drop_last=True
)
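As an alternative to casting on the pandas side, the dtype can also be pinned when the tensors are created. The following is only a sketch under that assumption, not the fix used in this post: the class name `Float32PointDataset` and the four-row DataFrame are hypothetical.

```python
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset


class Float32PointDataset(Dataset):  # hypothetical name, for illustration only
    """Holds every column as a float32 tensor from the start."""

    def __init__(self, df: pd.DataFrame):
        # dtype is stated explicitly, so the pandas dtypes no longer matter
        self.orig = torch.tensor(df[["orig_x", "orig_y"]].to_numpy(), dtype=torch.float32)
        self.x_real = torch.tensor(df["real_x"].to_numpy(), dtype=torch.float32)
        self.y_real = torch.tensor(df["real_y"].to_numpy(), dtype=torch.float32)

    def __getitem__(self, index: int):
        return self.orig[index], self.x_real[index], self.y_real[index]

    def __len__(self) -> int:
        return len(self.x_real)


# Hypothetical sample rows with the same column layout as df_point_map
df = pd.DataFrame(
    {
        "orig_x": [5, 25, 33, 20],
        "orig_y": [80, 30, 0, 50.5],
        "real_x": [5, 3, 0, 10],
        "real_y": [40, 16, 0, 40],
    }
)
loader = DataLoader(Float32PointDataset(df), batch_size=2, shuffle=False, drop_last=True)
coordinates, x_real, y_real = next(iter(loader))
print(coordinates.dtype, x_real.dtype, y_real.dtype)
```

With this layout, `__getitem__` really does return tensors, which also matches the return type annotation used in the original class.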
Checking that the forward pass runs
colaboratory
# Take just one batch
# https://output-zakki.com/dataloader_iter_and_next/
batch = next(iter(train_loader))
# Contents of batch
[tensor([[175.0000, 12.5000],
[ 25.0000, 30.0000]], dtype=torch.float64),
tensor([200., 3.]),
tensor([ 5., 16.])]
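The dtype=torch.float64 in the output above is the clue. A quick way to spot this kind of mismatch is to compare the input dtype with the dtype of the layer's parameters. The sketch below rebuilds a float64 tensor by hand (the values are copied from the batch shown above); `dtypes_match` is a name introduced only for this example.

```python
import numpy as np
import torch
import torch.nn as nn

# NumPy floats default to float64, and torch.from_numpy keeps that dtype
coordinates = torch.from_numpy(np.array([[175.0, 12.5], [25.0, 30.0]]))
bn1 = nn.BatchNorm1d(2)

print(coordinates.dtype)  # dtype of the input
print(bn1.weight.dtype)   # dtype of the layer's learnable parameters

# When this is False, the layer and its input disagree on precision
dtypes_match = coordinates.dtype == bn1.weight.dtype
print(dtypes_match)
```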
colaboratory
coordinates, x_real, y_real = batch
# The error is raised when bn1 is called on coordinates
bn1 = nn.BatchNorm1d(2)  # batch normalization
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
4 fc2 = nn.Linear(10, 2)
5
----> 6 h = bn1(coordinates)
7 h = fc1(h)
8 h = F.relu(h)
2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
2448 _verify_batch_size(input.size())
2449
-> 2450 return torch.batch_norm(
2451 input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
2452 )
RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16
The error occurred at the point above, but after applying .astype('f') as shown under the 🌟 marker in the Dataset code,
the forward pass went on as follows! \( ˙꒳˙ \三/ ˙꒳˙)/
colaboratory
fc1 = nn.Linear(2, 10)
dropout1 = nn.Dropout(p=0.3)
fc2 = nn.Linear(10, 2)
h = bn1(coordinates)
h = fc1(h)
h = F.relu(h)
h = dropout1(h)
h = fc2(h)
pred_results = F.relu(h)
pred_results
# Result
tensor([[0.0661, 0.0000],
[0.6185, 0.2009]], grad_fn=<ReluBackward0>)
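If changing the Dataset is not convenient, casting the batch right before the first layer is another option. This is a sketch under that assumption and not the fix applied in this post; the tensor values are copied from the batch shown earlier.

```python
import torch
import torch.nn as nn

# A float64 batch, like the one the unmodified Dataset produced
coordinates = torch.tensor([[175.0, 12.5], [25.0, 30.0]], dtype=torch.float64)
bn1 = nn.BatchNorm1d(2)

# .float() returns a float32 copy, matching the layer's default parameter dtype
h = bn1(coordinates.float())
print(h.dtype)
```

Casting once in the Dataset keeps the training loop cleaner, while casting at the call site is handy for a quick check in a notebook.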