Introduction
It's just what the title says. While implementing something in PyTorch, a deep-learning framework, I mixed torch.Tensor and numpy.ndarray and (probably) caused a memory leak, so I'd like to share a quick note on it, partly as a reminder to myself.
What happened when I mixed them
I was implementing Online Hard Example Mining and needed an unravel_index function, but as of 2022/09/07 PyTorch did not provide one. With no better option, I decided to convert to numpy once and do the computation there, and that is where the trouble started.
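For reference, numpy's unravel_index converts flat indices into per-dimension coordinates for a given shape:

import numpy as np

# 22 in a (7, 6) array sits at row 3, column 4, since 22 == 3 * 6 + 4
print(np.unravel_index([22, 41, 37], (7, 6)))
# -> (array([3, 6, 6]), array([4, 5, 1]))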
Code excerpt
(omitted)
# ... and take the hard samples
hard_indices_numpy = indices.numpy()[..., :hard_sample_nums].flatten()
# unravel_index is not supported in torch
hard_indices_numpy = np.unravel_index(hard_indices_numpy, shape_orig)
# create a mask from hard_indices
hard_mask = torch.zeros(shape_orig, dtype=torch.bool, device=device)
hard_mask[hard_indices_numpy] = True
This code does in fact run. As training progressed, however, it started throwing the following error:
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)
Apparently this is a common error with Docker (reference), but since I wasn't using Docker, the cause had to be somewhere else.
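As a side note on why this kind of mixing needs care: Tensor.numpy() does not copy, so the ndarray and the tensor share one underlying buffer. A minimal sketch (CPU-only, just to show the sharing):

import torch

t = torch.zeros(3)
a = t.numpy()   # no copy: `a` is a view over the tensor's buffer
a[0] = 1.0
print(t)        # tensor([1., 0., 0.]) -- the tensor sees numpy's write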
Workaround
Digging around eventually led me here: mixing cuda (GPU) tensors with numpy (CPU) arrays seems to cause memory leaks. The code above fit that description exactly, so I gave in and reimplemented unravel_index with torch.Tensor only. Fortunately an official implementation looked close to being merged, so I handled it by using that code a little ahead of time.
# from: https://github.com/krshrimali/pytorch/blob/fc1e9474a83736aaefd784b20df6220a27975ed2/torch/functional.py
# as of 22/09/06 this code had not yet been merged
from typing import Sequence, Tuple, Union

import torch


# NOTE: in the PR this helper is available inside torch/functional.py;
# a local stand-in keeps the copied function self-contained.
def integral_types():
    return (torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64)


def unravel_index(
    indices: torch.Tensor,
    shape: Union[int, Sequence, torch.Tensor],
    *,
    as_tuple: bool = True
) -> Union[Tuple[torch.Tensor, ...], torch.Tensor]:
    r"""Converts a `Tensor` of flat indices into a `Tensor` of coordinates for the given target shape.

    Args:
        indices: An integral `Tensor` containing flattened indices of a `Tensor` of dimension `shape`.
        shape: The shape (can be an `int`, a `Sequence` or a `Tensor`) of the `Tensor` for which
            the flattened `indices` are unraveled.

    Keyword Args:
        as_tuple: A boolean value, which if `True` will return the result as tuple of Tensors,
            else a `Tensor` will be returned. Default: `True`

    Returns:
        unraveled coordinates from the given `indices` and `shape`. See description of `as_tuple` for
        returning a `tuple`.

    .. note:: The default behaviour of this function is analogous to
        `numpy.unravel_index <https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html>`_.

    Example::

        >>> indices = torch.tensor([22, 41, 37])
        >>> shape = (7, 6)
        >>> torch.unravel_index(indices, shape)
        (tensor([3, 6, 6]), tensor([4, 5, 1]))

        >>> torch.unravel_index(indices, shape, as_tuple=False)
        tensor([[3, 4],
                [6, 5],
                [6, 1]])

        >>> indices = torch.tensor([3, 10, 12])
        >>> shape_ = (4, 2, 3)
        >>> torch.unravel_index(indices, shape_)
        (tensor([0, 1, 2]), tensor([1, 1, 0]), tensor([0, 1, 0]))

        >>> torch.unravel_index(indices, shape_, as_tuple=False)
        tensor([[0, 1, 0],
                [1, 1, 1],
                [2, 0, 0]])
    """
    def _helper_type_check(inp: Union[int, Sequence, torch.Tensor], name: str):
        # `indices` is expected to be a tensor, while `shape` can be a sequence/int/tensor
        if name == "shape" and isinstance(inp, Sequence):
            for dim in inp:
                if not isinstance(dim, int):
                    raise TypeError("Expected shape to have only integral elements.")
                if dim < 0:
                    raise ValueError("Negative values in shape are not allowed.")
        elif name == "shape" and isinstance(inp, int):
            if inp < 0:
                raise ValueError("Negative values in shape are not allowed.")
        elif isinstance(inp, torch.Tensor):
            if inp.dtype not in integral_types():
                raise TypeError(f"Expected {name} to be an integral tensor, but found dtype: {inp.dtype}")
            if torch.any(inp < 0):
                raise ValueError(f"Negative values in {name} are not allowed.")
        else:
            allowed_types = "Sequence/Scalar (int)/Tensor" if name == "shape" else "Tensor"
            msg = f"{name} should either be a {allowed_types}, but found {type(inp)}"
            raise TypeError(msg)

    _helper_type_check(indices, "indices")
    _helper_type_check(shape, "shape")

    # Convert to a tensor, with the same properties as that of indices
    if isinstance(shape, Sequence):
        shape_tensor: torch.Tensor = indices.new_tensor(shape)
    elif isinstance(shape, int) or (isinstance(shape, torch.Tensor) and shape.ndim == 0):
        shape_tensor = indices.new_tensor((shape,))
    else:
        shape_tensor = shape

    # By this time, shape tensor will have dim = 1 if it was passed as scalar (see if-elif above)
    assert shape_tensor.ndim == 1, (
        "Expected dimension of shape tensor to be <= 1, "
        f"but got the tensor with dim: {shape_tensor.ndim}."
    )

    # In case no indices passed, return an empty tensor with number of elements = shape.numel()
    if indices.numel() == 0:
        # If both indices.numel() == 0 and shape.numel() == 0, short-circuit to return itself
        shape_numel = shape_tensor.numel()
        if shape_numel == 0:
            raise ValueError("Got indices and shape as empty tensors, expected non-empty tensors.")
        else:
            output = [indices.new_tensor([]) for _ in range(shape_numel)]
            return tuple(output) if as_tuple else torch.stack(output, dim=1)

    if torch.max(indices) >= torch.prod(shape_tensor):
        raise ValueError("Target shape should cover all source indices.")

    # Row-major strides for each dimension, e.g. shape (4, 2, 3) -> coefs (6, 3, 1)
    coefs = shape_tensor[1:].flipud().cumprod(dim=0).flipud()
    coefs = torch.cat((coefs, coefs.new_tensor((1,))), dim=0)
    coords = torch.div(indices[..., None], coefs, rounding_mode='trunc') % shape_tensor

    if as_tuple:
        return tuple(coords[..., i] for i in range(coords.size(-1)))
    return coords
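With this function, the earlier excerpt can stay in torch end to end. A sketch under the same assumptions as the excerpt (indices, hard_sample_nums, shape_orig, and device come from the omitted context):

# torch-only rewrite of the excerpt: no round trip through numpy
hard_indices = indices[..., :hard_sample_nums].flatten()
hard_indices = unravel_index(hard_indices, shape_orig)  # tuple of index tensors
hard_mask = torch.zeros(shape_orig, dtype=torch.bool, device=device)
hard_mask[hard_indices] = True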
Conclusion
The lesson for me: just because code runs doesn't mean you can get away with being sloppy.