[PyTorch / python] Fixing RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

Posted at 2023-09-03

1. Symptom

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
      4 fc2 = nn.Linear(10, 2)
      5 
----> 6 h = bn1(coordinates)
      7 h = fc1(h)
      8 h = F.relu(h)

2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2448         _verify_batch_size(input.size())
   2449 
-> 2450     return torch.batch_norm(
   2451         input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2452     )

RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

2. Environment

Item      Version etc.
OS        Google Colaboratory
python    3.10.12
torch     2.0.1+cu118

3. Conclusion

Applying .astype('f') to the pandas DataFrame that serves as the input data for torch resolved the error.
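
Why the cast helps can be seen by inspecting dtypes. The sketch below is not from the original code: it assumes a hypothetical two-row frame named `df_demo` that mixes an integer column and a float column, much like orig_x and orig_y. Without a cast, pandas upcasts the pair to float64, whereas 'f' is NumPy's type code for float32, the default dtype of PyTorch layer parameters.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical two-row frame (not from the article): one int column, one float column.
df_demo = pd.DataFrame({"orig_x": [175, 25], "orig_y": [12.5, 30.0]})

# Without a cast, the mixed int/float columns are upcast to float64.
raw = df_demo[["orig_x", "orig_y"]].values

# 'f' is NumPy's type code for float32.
cast = df_demo[["orig_x", "orig_y"]].astype("f").values

# torch.tensor keeps the NumPy dtype of the array it copies.
raw_tensor = torch.tensor(raw)
cast_tensor = torch.tensor(cast)

print(raw.dtype, raw_tensor.dtype)
print(cast.dtype, cast_tensor.dtype)
```

Printing the dtype of a batch in this way is a quick check whenever a layer complains about scalar types.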

4. Code that raised the error, and the fix

The code that raised the error above is shown here.
Writing it as follows reproduces both the error and its resolution.

Imports and package setup

colaboratory
from typing import List, Tuple

import numpy as np
import pandas as pd
import torch
torch.cuda.empty_cache()

import torch.nn as nn
import torch.nn.functional as F

Preparing the data

colaboratory
orig_x = [5,  25, 33, 20, 50,  50, 75, 90, 98, 110, 120, 135, 145, 152,  175, 185, 200, 100, 80]
orig_y = [80, 30,  0, 50, 10,  40, 20,  5, 40,  40,  20,   5,  40,  30, 12.5,  70, 100,  40, 35]
real_x = [5,   3,  0, 10, 20, 135, 70, 80, 98, 110, 125, 150, 146, 165,  200, 190, 200, 100, 80]
real_y = [40, 16,  0, 40, 12,  40, 25, 15, 48,  45,  23,   5,  40,  28,    5,  40,  50,  50, 45]

df_point_map: pd.DataFrame = pd.DataFrame(
    {
        "orig_x": orig_x,
        "orig_y": orig_y,
        "real_x": real_x,
        "real_y": real_y,
    }
)

df_point_map.head(3)

# The data used looks like this
orig_x	orig_y	real_x	real_y
0	5	80.0	5	40
1	25	30.0	3	16
2	33	0.0	0	0

Creating the dataset for PyTorch (this was the cause)

The cause of the error was here (`・ω・´)

# Create the Dataset
class PointMappingDataset(torch.utils.data.Dataset):
    def __init__(self, df_point_map: pd.DataFrame):
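        # Compare as a list so a wrong number of columns fails the assert
        # instead of raising a length-mismatch error inside the comparison.
        assert list(df_point_map.columns) == [
            "orig_x", "orig_y", "real_x", "real_y"
        ], "Columns are missing"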

        # NOTE: ***************** 🚧 Before the fix 🚧 *****************
        self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].values
        self.x_real_coordinates = df_point_map["real_x"].values
        self.y_real_coordinates = df_point_map["real_y"].values

        # NOTE: ***************** 🌟 After the fix 🌟 *****************
        #     Applying .astype('f') as below resolved the error
        # self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].astype('f').values
        # self.x_real_coordinates = df_point_map["real_x"].astype('f').values
        # self.y_real_coordinates = df_point_map["real_y"].astype('f').values

    def __getitem__(self, index: int) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        orig_coordinate = self.orig_coordinates[index]
        x_real = self.x_real_coordinates[index]
        y_real = self.y_real_coordinates[index]

        return orig_coordinate, x_real, y_real

    def __len__(self):
        return len(self.x_real_coordinates)


mapping_dataset: PointMappingDataset = PointMappingDataset(df_point_map)

# Create the DataLoader
train_loader = torch.utils.data.DataLoader(
    mapping_dataset, batch_size=2, shuffle=True, drop_last=True
)
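
As an alternative to calling .astype('f') on each column, the cast can be done once at the pandas/NumPy boundary. The following is only a sketch under assumptions, not the article's code: `Float32PointDataset` and the four-row `df_small` are hypothetical names and data, and it relies on `DataFrame.to_numpy(dtype=...)` to produce float32 arrays before they become tensors.

```python
import pandas as pd
import torch


class Float32PointDataset(torch.utils.data.Dataset):
    """Hypothetical variant of PointMappingDataset that casts once in __init__."""

    def __init__(self, df: pd.DataFrame):
        # Cast to float32 while leaving pandas, then copy into tensors.
        self.orig = torch.tensor(df[["orig_x", "orig_y"]].to_numpy(dtype="float32"))
        self.real = torch.tensor(df[["real_x", "real_y"]].to_numpy(dtype="float32"))

    def __getitem__(self, index: int):
        # Returns (coordinates, real_x, real_y), all float32 tensors.
        return self.orig[index], self.real[index, 0], self.real[index, 1]

    def __len__(self) -> int:
        return len(self.orig)


# Hypothetical four-row frame for illustration only.
df_small = pd.DataFrame(
    {
        "orig_x": [5, 25, 33, 20],
        "orig_y": [80, 30, 0, 12.5],
        "real_x": [5, 3, 0, 10],
        "real_y": [40, 16, 0, 40],
    }
)

loader = torch.utils.data.DataLoader(Float32PointDataset(df_small), batch_size=2)
coords, x_real, y_real = next(iter(loader))
print(coords.dtype, x_real.dtype, y_real.dtype)
```

Casting in one place keeps every element of the batch at the same dtype, so a layer further down does not see a float64 input by accident.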

Checking the inference step

colaboratory
# Take out just one batch
# https://output-zakki.com/dataloader_iter_and_next/
batch = next(iter(train_loader))

# Contents of the batch
[tensor([[175.0000,  12.5000],
         [ 25.0000,  30.0000]], dtype=torch.float64),
 tensor([200.,   3.]),
 tensor([ 5., 16.])]
colaboratory
coordinates, x_real, y_real = batch

# The error occurred when this layer was called (see the traceback below)
bn1 = nn.BatchNorm1d(2)  # batch normalization

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
      4 fc2 = nn.Linear(10, 2)
      5 
----> 6 h = bn1(coordinates)
      7 h = fc1(h)
      8 h = F.relu(h)

2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2448         _verify_batch_size(input.size())
   2449 
-> 2450     return torch.batch_norm(
   2451         input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2452     )

RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

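The error occurred at the location above, but after applying .astype('f') as shown in the commented-out 🌟-marked lines of the dataset, the inference step proceeded as follows! \( ˙꒳˙ \三/ ˙꒳˙)/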

colaboratory
fc1 = nn.Linear(2, 10)
dropout1 = nn.Dropout(p=0.3)
fc2 = nn.Linear(10, 2)

h = bn1(coordinates)
h = fc1(h)
h = F.relu(h)
h = dropout1(h)
h = fc2(h)
pred_results = F.relu(h)
pred_results

# Result
tensor([[0.0661, 0.0000],
        [0.6185, 0.2009]], grad_fn=<ReluBackward0>)
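
The mismatch can also be reproduced without pandas, and worked around on the tensor side. This is a minimal sketch and not the fix used above: it assumes a hand-written float64 batch (`coords64`, a hypothetical name) and casts it with `.float()` before the layer. The wording of the error message may differ between PyTorch versions, so the sketch only records it.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(2)  # parameters default to float32

# A float64 batch like the one the DataLoader produced before the fix.
coords64 = torch.tensor([[175.0, 12.5], [25.0, 30.0]], dtype=torch.float64)

try:
    bn(coords64)
    mismatch_error = None
except RuntimeError as exc:
    # Record the message; its exact wording depends on the PyTorch version.
    mismatch_error = str(exc)

# Casting the input to float32 matches the layer's parameters.
out = bn(coords64.float())
print(mismatch_error)
print(out.dtype, out.shape)
```

Casting at the tensor level is handy for a quick check, while fixing the dtype where the data is loaded (as done above) avoids repeating the cast in every forward pass.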