[PyTorch / python] How to fix RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

Last updated at 2023-09-03. Posted at 2023-09-03.

1. Symptom

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
      4 fc2 = nn.Linear(10, 2)
      5 
----> 6 h = bn1(coordinates)
      7 h = fc1(h)
      8 h = F.relu(h)

2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2448         _verify_batch_size(input.size())
   2449 
-> 2450     return torch.batch_norm(
   2451         input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2452     )

RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

2. Environment

	Version, etc.
OS	Google Colaboratory
python	3.10.12
torch	2.0.1+cu118

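3. Conclusion

Applying .astype('f') to the pandas DataFrame used as input data for torch resolved the error. The failing batch in section 4 has dtype=torch.float64, whereas the parameters of nn.BatchNorm1d default to float32, so casting the input to float32 ('f' is NumPy's type code for float32) appears to be what removes the dtype mismatch.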
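
As a rough illustration of the dtype difference, here is a minimal sketch. The small DataFrame `df` and the variable names `raw`, `cast`, and `bn` are hypothetical and only for illustration; they are not the article's full data. It assumes PyTorch's default dtype has not been changed from float32.

```python
import numpy as np
import pandas as pd
import torch
import torch.nn as nn

# Small illustrative frame (hypothetical; not the article's full data)
df = pd.DataFrame({"orig_x": [5, 25, 33], "orig_y": [80.0, 30.0, 0.0]})

# Mixed int/float columns are upcast to float64 by .values
raw = df[["orig_x", "orig_y"]].values
# 'f' is NumPy's type code for float32
cast = df[["orig_x", "orig_y"]].astype("f").values

bn = nn.BatchNorm1d(2)

print(raw.dtype)                     # float64
print(cast.dtype)                    # float32
print(torch.from_numpy(raw).dtype)   # torch.float64
print(torch.from_numpy(cast).dtype)  # torch.float32
print(bn.weight.dtype)               # torch.float32 (default parameter dtype)
```

Only the cast array ends up with the same dtype as the layer's parameters.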

4. Code that raised the error, and the fix

Here is the code that raised the error above.
Written as follows, it reproduces both the error and its resolution.

Imports and other package setup

colaboratory

from typing import List, Tuple

import numpy as np
import pandas as pd
import torch
torch.cuda.empty_cache()

import torch.nn as nn
import torch.nn.functional as F

Preparing the data

colaboratory

orig_x = [5,  25, 33, 20, 50,  50, 75, 90, 98, 110, 120, 135, 145, 152,  175, 185, 200, 100, 80]
orig_y = [80, 30,  0, 50, 10,  40, 20,  5, 40,  40,  20,   5,  40,  30, 12.5,  70, 100,  40, 35]
real_x = [5,   3,  0, 10, 20, 135, 70, 80, 98, 110, 125, 150, 146, 165,  200, 190, 200, 100, 80]
real_y = [40, 16,  0, 40, 12,  40, 25, 15, 48,  45,  23,   5,  40,  28,    5,  40,  50,  50, 45]

df_point_map: pd.DataFrame = pd.DataFrame(
    {
        "orig_x": orig_x,
        "orig_y": orig_y,
        "real_x": real_x,
        "real_y": real_y,
    }
)

df_point_map.head(3)

# The data used looks like this
orig_x	orig_y	real_x	real_y
0	5	80.0	5	40
1	25	30.0	3	16
2	33	0.0	0	0

Creating the dataset for PyTorch (this was the cause)

The cause of the error was here (｀･ω･´)

# Create the Dataset
class PointMappingDataset(torch.utils.data.Dataset):
    def __init__(self, df_point_map: pd.DataFrame):
        assert np.all(
            df_point_map.columns == ["orig_x", "orig_y", "real_x", "real_y"]
        ), "Required columns are missing"

        # NOTE: ***************** 🚧Before the fix🚧 *****************
        self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].values
        self.x_real_coordinates = df_point_map["real_x"].values
        self.y_real_coordinates = df_point_map["real_y"].values

        # NOTE: ***************** 🌟After the fix🌟 *****************
        #     Applying .astype('f') as shown below resolved the error
        # self.orig_coordinates = df_point_map[["orig_x", "orig_y"]].astype('f').values
        # self.x_real_coordinates = df_point_map["real_x"].astype('f').values
        # self.y_real_coordinates = df_point_map["real_y"].astype('f').values

    def __getitem__(self, index: int) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        orig_coordinate = self.orig_coordinates[index]
        x_real = self.x_real_coordinates[index]
        y_real = self.y_real_coordinates[index]

        return orig_coordinate, x_real, y_real

    def __len__(self):
        return len(self.x_real_coordinates)


mapping_dataset: PointMappingDataset = PointMappingDataset(df_point_map)

# Create the DataLoader
train_loader = torch.utils.data.DataLoader(
    mapping_dataset, batch_size=2, shuffle=True, drop_last=True
)
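
The DataLoader keeps whatever dtype the NumPy arrays in the dataset have, which is how a float64 array turns into a float64 batch. The following is a minimal sketch of that behaviour, assuming default collation; `ArrayDataset`, `points`, `loader64`, and `loader32` are hypothetical names standing in for PointMappingDataset and its data.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayDataset(Dataset):
    """Hypothetical stand-in for PointMappingDataset: wraps one NumPy array."""

    def __init__(self, values: np.ndarray):
        self.values = values

    def __getitem__(self, index: int) -> np.ndarray:
        return self.values[index]

    def __len__(self) -> int:
        return len(self.values)


points = np.array([[5, 80], [25, 30], [33, 0], [20, 50]], dtype=np.float64)

# Same data, once as float64 and once cast to float32 with 'f'
loader64 = DataLoader(ArrayDataset(points), batch_size=2)
loader32 = DataLoader(ArrayDataset(points.astype("f")), batch_size=2)

batch64 = next(iter(loader64))
batch32 = next(iter(loader32))
print(batch64.dtype)  # torch.float64
print(batch32.dtype)  # torch.float32
```

Printing the dtype of one batch like this is a quick way to check the input before passing it to a layer.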

Checking the inference step

colaboratory

# Take out just one batch
# https://output-zakki.com/dataloader_iter_and_next/
batch = next(iter(train_loader))

# Contents of the batch
[tensor([[175.0000,  12.5000],
         [ 25.0000,  30.0000]], dtype=torch.float64),
 tensor([200.,   3.]),
 tensor([ 5., 16.])]

colaboratory

coordinates, x_real, y_real = batch

# The error was raised when this layer was applied to the batch
bn1 = nn.BatchNorm1d(2)  # batch normalization

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-80-7ddaf143f54b> in <cell line: 6>()
      4 fc2 = nn.Linear(10, 2)
      5 
----> 6 h = bn1(coordinates)
      7 h = fc1(h)
      8 h = F.relu(h)

2 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2448         _verify_batch_size(input.size())
   2449 
-> 2450     return torch.batch_norm(
   2451         input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2452     )

RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16

The error was raised at the point above, but after applying .astype('f') as in the lines marked 🌟 in the Dataset class, the inference step proceeded as shown below! \( ˙꒳˙ \三/ ˙꒳˙)/

colaboratory

fc1 = nn.Linear(2, 10)
dropout1 = nn.Dropout(p=0.3)
fc2 = nn.Linear(10, 2)

h = bn1(coordinates)
h = fc1(h)
h = F.relu(h)
h = dropout1(h)
h = fc2(h)
pred_results = F.relu(h)
pred_results

# Result
tensor([[0.0661, 0.0000],
        [0.6185, 0.2009]], grad_fn=<ReluBackward0>)
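
As a possible alternative, which is not what this article did, the cast could also be done on the tensor side with `.float()` just before the layer. This is only a sketch under that assumption; the `coordinates` values below are copied from the batch printed earlier, and the rest of the network is omitted.

```python
import torch
import torch.nn as nn

bn1 = nn.BatchNorm1d(2)  # parameters are float32 by default

# A float64 batch like the one shown earlier in the article
coordinates = torch.tensor([[175.0, 12.5], [25.0, 30.0]], dtype=torch.float64)

# Cast to float32 on the tensor side instead of in the DataFrame
h = bn1(coordinates.float())
print(h.dtype)  # torch.float32
```

Casting in the Dataset, as the article does, keeps the conversion in one place, while casting the tensor may be handy when the dataset code cannot be changed.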
