
Danger, Do Not Mix! Mixing torch.Tensor and numpy.ndarray Triggers `ERROR: Unexpected bus error encountered in worker.`


Introduction

It's exactly what the title says.
While implementing a model in PyTorch, the deep learning framework, I mixed torch.Tensor and numpy.ndarray and (most likely) caused a memory leak. I'm sharing what happened briefly here, partly as a note to my future self.

The result of mixing them

I was implementing Online Hard Example Mining and needed an unravel_index function, which PyTorch did not provide as of 22/09/07. With no alternative, I decided to convert to numpy for that one computation, and that is where the trouble began.
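
For reference, np.unravel_index converts flat (1-D) indices into per-dimension coordinates for a given shape. A minimal sketch of what I needed it for:

import numpy as np

# flat indices into an array of shape (7, 6)
flat = np.array([22, 41, 37])
rows, cols = np.unravel_index(flat, (7, 6))
print(rows)  # [3 6 6]  (each flat index // 6)
print(cols)  # [4 5 1]  (each flat index %  6)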

Code excerpt


# take the hard samples (note: torch.Tensor is converted to numpy.ndarray here)
hard_indices_numpy = indices.numpy()[..., :hard_sample_nums].flatten()

# unravel_index is not supported in torch, so fall back to numpy
hard_indices_numpy = np.unravel_index(hard_indices_numpy, shape_orig)

# create a boolean mask from hard_indices (numpy indices into a torch tensor)
hard_mask = torch.zeros(shape_orig, dtype=torch.bool, device=device)
hard_mask[hard_indices_numpy] = True

This code does in fact run. As training progressed, however, it began to throw the following error:

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)

Apparently this error is common when using Docker (reference), but I wasn't using Docker, so the cause had to lie elsewhere.
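
For context, this message comes from PyTorch's DataLoader worker processes, which hand batches to the main process through shared memory (/dev/shm); in Docker the usual fix is to raise --shm-size. As a general mitigation (not the root cause in my case), you can also keep data loading in the main process:

import torch
from torch.utils.data import DataLoader, TensorDataset

# toy dataset just to make the snippet runnable
dataset = TensorDataset(torch.randn(128, 3), torch.randint(0, 2, (128,)))

# num_workers=0 avoids worker processes, and thus shared memory, entirely,
# at the cost of data-loading throughput
loader = DataLoader(dataset, batch_size=32, num_workers=0)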

Workaround

Searching around eventually led me here: mixing cuda (GPU) and numpy (CPU) seems to cause memory leaks. The code above was a prime suspect, so I reluctantly reimplemented unravel_index using only torch.Tensor. Fortunately, an official implementation looked close to being merged, so I used it a little ahead of time.

# from: https://github.com/krshrimali/pytorch/blob/fc1e9474a83736aaefd784b20df6220a27975ed2/torch/functional.py
# on 22/09/06 this code has not been merged
from typing import Sequence, Tuple, Union

import torch
# `integral_types` lives in torch's testing utilities; imported here so the
# snippet runs standalone (inside the PR it is already available)
from torch.testing._internal.common_dtype import integral_types

def unravel_index(
    indices: torch.Tensor,
    shape: Union[int, Sequence, torch.Tensor],
    *,
    as_tuple: bool = True
) -> Union[Tuple[torch.Tensor, ...], torch.Tensor]:
    r"""Converts a `Tensor` of flat indices into a `Tensor` of coordinates for the given target shape.
    Args:
        indices: An integral `Tensor` containing flattened indices of a `Tensor` of dimension `shape`.
        shape: The shape (can be an `int`, a `Sequence` or a `Tensor`) of the `Tensor` for which
               the flattened `indices` are unraveled.
    Keyword Args:
        as_tuple: A boolean value, which if `True` will return the result as tuple of Tensors,
                  else a `Tensor` will be returned. Default: `True`
    Returns:
        unraveled coordinates from the given `indices` and `shape`. See description of `as_tuple` for
        returning a `tuple`.
    .. note:: The default behaviour of this function is analogous to
              `numpy.unravel_index <https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html>`_.
    Example::
        >>> indices = torch.tensor([22, 41, 37])
        >>> shape = (7, 6)
        >>> torch.unravel_index(indices, shape)
        (tensor([3, 6, 6]), tensor([4, 5, 1]))
        >>> torch.unravel_index(indices, shape, as_tuple=False)
        tensor([[3, 4],
                [6, 5],
                [6, 1]])
        >>> indices = torch.tensor([3, 10, 12])
        >>> shape_ = (4, 2, 3)
        >>> torch.unravel_index(indices, shape_)
        (tensor([0, 1, 2]), tensor([1, 1, 0]), tensor([0, 1, 0]))
        >>> torch.unravel_index(indices, shape_, as_tuple=False)
        tensor([[0, 1, 0],
                [1, 1, 1],
                [2, 0, 0]])
    """
    def _helper_type_check(inp: Union[int, Sequence, torch.Tensor], name: str):
        # `indices` is expected to be a tensor, while `shape` can be a sequence/int/tensor
        if name == "shape" and isinstance(inp, Sequence):
            for dim in inp:
                if not isinstance(dim, int):
                    raise TypeError("Expected shape to have only integral elements.")
                if dim < 0:
                    raise ValueError("Negative values in shape are not allowed.")
        elif name == "shape" and isinstance(inp, int):
            if inp < 0:
                raise ValueError("Negative values in shape are not allowed.")
        elif isinstance(inp, torch.Tensor):
            if inp.dtype not in integral_types():
                raise TypeError(f"Expected {name} to be an integral tensor, but found dtype: {inp.dtype}")
            if torch.any(inp < 0):
                raise ValueError(f"Negative values in {name} are not allowed.")
        else:
            allowed_types = "Sequence/Scalar (int)/Tensor" if name == "shape" else "Tensor"
            msg = f"{name} should be a {allowed_types}, but found {type(inp)}"
            raise TypeError(msg)

    _helper_type_check(indices, "indices")
    _helper_type_check(shape, "shape")

    # Convert to a tensor, with the same properties as that of indices
    if isinstance(shape, Sequence):
        shape_tensor: torch.Tensor = indices.new_tensor(shape)
    elif isinstance(shape, int) or (isinstance(shape, torch.Tensor) and shape.ndim == 0):
        shape_tensor = indices.new_tensor((shape,))
    else:
        shape_tensor = shape

    # By this time, shape tensor will have dim = 1 if it was passed as scalar (see if-elif above)
    assert shape_tensor.ndim == 1, (
        "Expected dimension of shape tensor to be <= 1, "
        f"but got the tensor with dim: {shape_tensor.ndim}."
    )

    # In case no indices passed, return an empty tensor with number of elements = shape.numel()
    if indices.numel() == 0:
        # If both indices.numel() == 0 and shape.numel() == 0, short-circuit to return itself
        shape_numel = shape_tensor.numel()
        if shape_numel == 0:
            raise ValueError("Got indices and shape as empty tensors, expected non-empty tensors.")
        else:
            output = [indices.new_tensor([]) for _ in range(shape_numel)]
            return tuple(output) if as_tuple else torch.stack(output, dim=1)

    if torch.max(indices) >= torch.prod(shape_tensor):
        raise ValueError("Target shape should cover all source indices.")

    # coefs are the row-major strides of `shape`: the product of all trailing
    # dimensions for each position, e.g. shape (4, 2, 3) -> coefs (6, 3, 1)
    coefs = shape_tensor[1:].flipud().cumprod(dim=0).flipud()
    coefs = torch.cat((coefs, coefs.new_tensor((1,))), dim=0)
    # integer-divide by each stride, then wrap by the dim size to get coordinates
    coords = torch.div(indices[..., None], coefs, rounding_mode='trunc') % shape_tensor

    if as_tuple:
        return tuple(coords[..., i] for i in range(coords.size(-1)))
    return coords
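
With this in place, the earlier excerpt can be rewritten to stay entirely in torch.Tensor. A sketch that reuses the variable names from the excerpt (indices, hard_sample_nums, shape_orig and device are assumed to be defined as before):

# take the hard samples, staying in torch this time
hard_indices = indices[..., :hard_sample_nums].flatten()

# pure-torch replacement for np.unravel_index
hard_indices = unravel_index(hard_indices, shape_orig)

# create a boolean mask from hard_indices
hard_mask = torch.zeros(shape_orig, dtype=torch.bool, device=device)
hard_mask[hard_indices] = True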

Closing

The lesson: just because code runs doesn't mean you should do sloppy things with it.
