The error
I ran into the really long error below and couldn't figure out how to deal with it, so I'm writing up what I found! (`・ω・´)
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Full error text
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-27-f15cc57dc47f> in <cell line: 0>()
24 memory = ReplayMemory(args['memory_size'])
25
---> 26 agent, episode_reward_list, eval_reward_list = train_eval(env, agent, memory, args)
27
28 print('Game Done !! Max Reward: {:.2f}'.format(np.max(eval_reward_list)))
4 frames
<ipython-input-12-17a49a33d1f5> in train_eval(env, agent, memory, args, recorder)
24 # memory が溜まるまではパラメータ更新不要
25 if len(memory) > args['batch_size']:
---> 26 agent.update_parameters(memory, args['batch_size'], n_update)
27 n_update += 1
28
<ipython-input-26-d821b69968b3> in update_parameters(self, memory, batch_size, n_update)
100
101 self.actor_optim.zero_grad()
--> 102 actor_loss.backward()
103 self.actor_optim.step()
104
/usr/local/lib/python3.11/dist-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
624 inputs=inputs,
625 )
--> 626 torch.autograd.backward(
627 self, gradient, retain_graph, create_graph, inputs=inputs
628 )
/usr/local/lib/python3.11/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
345 # some Python versions print out the first line of a multi-line function
346 # calls in the traceback and some print out the last line
--> 347 _engine_run_backward(
348 tensors,
349 grad_tensors_,
/usr/local/lib/python3.11/dist-packages/torch/autograd/graph.py in _engine_run_backward(t_outputs, *args, **kwargs)
821 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs)
822 try:
--> 823 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
824 t_outputs, *args, **kwargs
825 ) # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
The hint says to use `torch.autograd.set_detect_anomaly(True)`, and plenty of articles about this error recommend the same thing, so let's give it a try.
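For reference, anomaly detection can be switched on globally for the whole run, or scoped to just the suspicious code with a context manager. A minimal sketch, assuming a throwaway `nn.Linear` model and random input (not this article's code):

```python
import torch
import torch.nn as nn

# Option 1: turn anomaly detection on globally (e.g. at the top of the notebook)
torch.autograd.set_detect_anomaly(True)

# Option 2: scope it to the suspicious section with a context manager
model = nn.Linear(4, 1)      # throwaway stand-in for a real network
x = torch.randn(8, 4)
with torch.autograd.set_detect_anomaly(True):
    loss = model(x).mean()
    loss.backward()          # if an op fails here, the forward-call traceback is reported
```

Either way, anomaly detection slows training down noticeably, so it's best turned off again once the offending operation has been found.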
Environment
- Google Colaboratory
Investigation
According to the error message the problem is with a gradient, so let's wrap the parts that seem to involve gradient computation in a `set_detect_anomaly` block.
I did it like this:
with torch.autograd.set_detect_anomaly(True):
    # critic (value network) loss
    q_values = self.critic_net(state_batch, action_batch)
    critic_loss = F.mse_loss(q_values, next_q_values)

    # actor (policy network) loss
    action, _ = self.actor_net.sample(state_batch)
    q_values = self.critic_net(state_batch, action)  # also check scaling
    actor_loss = - q_values.mean()

    # update the critic
    self.critic_optim.zero_grad()
    critic_loss.backward()
    self.critic_optim.step()

    # update the actor
    self.actor_optim.zero_grad()
    actor_loss.backward()
    self.actor_optim.step()
Re-running in this state, the error message changed.
The full message is included below as well, but it's so long that I've collapsed it.
Full error message
/usr/local/lib/python3.11/dist-packages/torch/autograd/graph.py:823: UserWarning: Error detected in AddmmBackward0. Traceback of forward call that caused the error:
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/usr/local/lib/python3.11/dist-packages/colab_kernel_launcher.py", line 37, in <module>
ColabKernelApp.launch_instance()
File "/usr/local/lib/python3.11/dist-packages/traitlets/config/application.py", line 992, in launch_instance
new_config.merge(self.cli_config)
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelapp.py", line 712, in start
self.io_loop.start()
File "/usr/local/lib/python3.11/dist-packages/tornado/platform/asyncio.py", line 205, in start
self.asyncio_loop.run_forever()
File "/usr/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/usr/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/usr/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 510, in dispatch_queue
await self.process_one()
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 499, in process_one
await dispatch(*args)
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 406, in dispatch_shell
await result
File "/usr/local/lib/python3.11/dist-packages/ipykernel/kernelbase.py", line 730, in execute_request
reply_content = await reply_content
File "/usr/local/lib/python3.11/dist-packages/ipykernel/ipkernel.py", line 383, in do_execute
res = shell.run_cell(
File "/usr/local/lib/python3.11/dist-packages/ipykernel/zmqshell.py", line 528, in run_cell
return super().run_cell(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 2975, in run_cell
except:
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3030, in _run_cell
shell_futures : bool
File "/usr/local/lib/python3.11/dist-packages/IPython/core/async_helpers.py", line 78, in _pseudo_sync_runner
attr = getattr(self._obj, key)
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3257, in run_cell_async
if not silent:
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3473, in run_ast_nodes
File "/usr/local/lib/python3.11/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
except SystemExit as e:
File "<ipython-input-35-f15cc57dc47f>", line 26, in <cell line: 0>
agent, episode_reward_list, eval_reward_list = train_eval(env, agent, memory, args)
File "<ipython-input-12-17a49a33d1f5>", line 26, in train_eval
agent.update_parameters(memory, args['batch_size'], n_update)
File "<ipython-input-34-7e7afd9cbafb>", line 96, in update_parameters
critic_loss, actor_loss = self._calc_actor_critic_loss(batch_size)
File "<ipython-input-34-7e7afd9cbafb>", line 66, in _calc_actor_critic_loss
q_values = self.critic_net(state_batch, action) # スケーリングも見る
File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "<ipython-input-4-9b5ebb79aeda>", line 18, in forward
x = self.linear3(x)
File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/linear.py", line 125, in forward
return F.linear(input, self.weight, self.bias)
(Triggered internally at /pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:122.)
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-35-f15cc57dc47f> in <cell line: 0>()
24 memory = ReplayMemory(args['memory_size'])
25
---> 26 agent, episode_reward_list, eval_reward_list = train_eval(env, agent, memory, args)
27
28 print('Game Done !! Max Reward: {:.2f}'.format(np.max(eval_reward_list)))
4 frames
<ipython-input-12-17a49a33d1f5> in train_eval(env, agent, memory, args, recorder)
24 # memory が溜まるまではパラメータ更新不要
25 if len(memory) > args['batch_size']:
---> 26 agent.update_parameters(memory, args['batch_size'], n_update)
27 n_update += 1
28
<ipython-input-34-7e7afd9cbafb> in update_parameters(self, memory, batch_size, n_update)
101
102 self.actor_optim.zero_grad()
--> 103 actor_loss.backward()
104 self.actor_optim.step()
105
/usr/local/lib/python3.11/dist-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
624 inputs=inputs,
625 )
--> 626 torch.autograd.backward(
627 self, gradient, retain_graph, create_graph, inputs=inputs
628 )
/usr/local/lib/python3.11/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
345 # some Python versions print out the first line of a multi-line function
346 # calls in the traceback and some print out the last line
--> 347 _engine_run_backward(
348 tensors,
349 grad_tensors_,
/usr/local/lib/python3.11/dist-packages/torch/autograd/graph.py in _engine_run_backward(t_outputs, *args, **kwargs)
821 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs)
822 try:
--> 823 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
824 t_outputs, *args, **kwargs
825 ) # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
What matters in all of this is where my own code shows up, so let's look for that.
In my case, the log contained the following lines.
# excerpt from the error log
File "<ipython-input-35-f15cc57dc47f>", line 26, in <cell line: 0>
agent, episode_reward_list, eval_reward_list = train_eval(env, agent, memory, args)
File "<ipython-input-12-17a49a33d1f5>", line 26, in train_eval
agent.update_parameters(memory, args['batch_size'], n_update)
File "<ipython-input-34-7e7afd9cbafb>", line 96, in update_parameters
critic_loss, actor_loss = self._calc_actor_critic_loss(batch_size)
File "<ipython-input-34-7e7afd9cbafb>", line 66, in _calc_actor_critic_loss
q_values = self.critic_net(state_batch, action) # スケーリングも見る
In other words, this is where the problem is.
So this is the area to dig into through trial and error.
Solution
(I haven't fully traced the logic, but) trial and error eventually pinned down the cause, so here it is.
The problem turned out to be the order of operations: computing both the critic loss and the actor loss first, and only then updating each model.
- compute the critic loss
- compute the actor loss
- update the critic with the critic loss <--- this apparently invalidates the actor loss computation
- update the actor with the actor loss <--- the error occurs here, when the actor loss is used
Changing the flow to the following made the error go away.
- compute the critic loss
- update the critic with the critic loss
- compute the actor loss
- update the actor with the actor loss
Concretely, I changed the code as follows.
Before
# critic (value network) loss
q_values = self.critic_net(state_batch, action_batch)
critic_loss = F.mse_loss(q_values, next_q_values)
# actor (policy network) loss
action, _ = self.actor_net.sample(state_batch)
q_values = self.critic_net(state_batch, action)  # also check scaling
actor_loss = - q_values.mean()
self.critic_optim.zero_grad()
critic_loss.backward()
self.critic_optim.step()
self.actor_optim.zero_grad()
actor_loss.backward()
self.actor_optim.step()
After
# critic (value network) loss
q_values = self.critic_net(state_batch, action_batch)
critic_loss = F.mse_loss(q_values, next_q_values)
self.critic_optim.zero_grad()
critic_loss.backward()
self.critic_optim.step()
# actor (policy network) loss
action, _ = self.actor_net.sample(state_batch)
q_values = self.critic_net(state_batch, action)  # also check scaling
actor_loss = - q_values.mean()
self.actor_optim.zero_grad()
actor_loss.backward()
self.actor_optim.step()
Problem solved! (*´v`)
Discussion
Presumably, because the critic model is used to compute the actor loss, updating the critic after the actor loss has been computed but before the actor is updated overwrites part of the autograd graph (the part around the critic network's output), so the actor loss can no longer be backpropagated. Put differently, `critic_optim.step()` modifies the critic's parameters in place, and those same tensors were saved by autograd when the actor loss was built, which is exactly what the version-counter error is complaining about.
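As a sanity check on this explanation, here is a minimal sketch that reproduces the same error with toy networks (the `nn.Linear` actor/critic and SGD optimizers below are placeholders, not this article's code). Moving the critic update above the actor-loss computation, as in the fix, makes it run cleanly.

```python
import torch
import torch.nn as nn

# toy stand-ins just to trigger the error
actor = nn.Linear(4, 4)
critic = nn.Linear(4, 1)
actor_optim = torch.optim.SGD(actor.parameters(), lr=0.1)
critic_optim = torch.optim.SGD(critic.parameters(), lr=0.1)

state = torch.randn(8, 4)

# 1. critic loss
critic_loss = critic(state).pow(2).mean()
# 2. actor loss -- its graph saves the critic's *current* weights for backward
actor_loss = -critic(actor(state)).mean()

# 3. critic update: step() rewrites critic.weight / critic.bias in place
critic_optim.zero_grad()
critic_loss.backward()
critic_optim.step()

# 4. actor update: backward needs the critic weights saved in step 2,
#    but they were just modified in place -> the RuntimeError above
actor_optim.zero_grad()
actor_loss.backward()
actor_optim.step()
```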