Trying Out MuJoCo (Robot Arm)

Introduction

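This is a continuation of my previous article. This time I also tried a robot arm. The task counts as a success when the tip of the arm reaches the target position.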
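How close the fingertip is to the target can be read directly from the observation. The sketch below assumes the Reacher-v5 observation layout described in the Gymnasium documentation, where the last two of the 10 entries hold the x and y components of (fingertip position - target position). The helper names and the 0.01 tolerance are my own illustrative choices and are not part of the environment.

```python
import math

# Assumed Reacher-v5 layout: obs[8] and obs[9] are the x and y components
# of (fingertip position - target position).
DX_INDEX = 8
DY_INDEX = 9


def fingertip_distance(obs):
    """Return the planar distance between the fingertip and the target."""
    return math.hypot(obs[DX_INDEX], obs[DY_INDEX])


def reached_target(obs, tolerance=0.01):
    """Hypothetical success test: fingertip within `tolerance` of the target."""
    return fingertip_distance(obs) <= tolerance
```

Calling `fingertip_distance(obs)` on the observation returned by each `step()` call would show whether the arm is actually settling on the target or only hovering near it.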

Trial program (robot arm)

With model.learn(total_timesteps=200000), 200000 steps gives reasonable performance. 500000 is better if you can afford it, although training takes longer. I saved the program listed below as train_6.py and ran it.

train_6.py
import gymnasium as gym
from stable_baselines3 import PPO
import time

# =========================
# Training environment (no rendering)
# =========================
train_env = gym.make("Reacher-v5")

# =========================
# Create the PPO model
# =========================
model = PPO(
    "MlpPolicy",
    train_env,
    verbose=1
)

# =========================
# Training
# =========================
model.learn(total_timesteps=200000)

# =========================
# Save the model
# =========================
model.save("ppo_Reacher-v5_new")

# Close the training environment
train_env.close()

# =========================
# Evaluation environment (with rendering)
# =========================
eval_env = gym.make(
    "Reacher-v5",
    render_mode="human",
    width=1400,
    height=900
)

obs, info = eval_env.reset()

# =========================
# Camera settings
# =========================
viewer = eval_env.unwrapped.mujoco_renderer.viewer

viewer.cam.distance = 0.7
viewer.cam.elevation = -20

viewer.cam.lookat[0] = 0
viewer.cam.lookat[1] = 0
viewer.cam.lookat[2] = 0

# =========================
# Evaluation loop
# =========================
episode_reward = 0.0

for step in range(100):

    # The trained model chooses an action
    action, _states = model.predict(
        obs,
        deterministic=True
    )

    # Advance the simulation by one step
    obs, reward, terminated, truncated, info = eval_env.step(action)

    # Accumulate the reward
    episode_reward += reward

    # Print the current state
    print(f"step = {step}")
    print("action =", action)
    print("reward =", reward)
    print("episode_reward =", episode_reward)
    print()

    # Slow down so the rendering is easy to follow
    time.sleep(0.1)

    # Check whether the episode has ended
    if terminated or truncated:

        print("Episode finished")
        print("Total episode reward =", episode_reward)
        print()

        obs, info = eval_env.reset()
        episode_reward = 0.0

# =========================
# Shut down
# =========================
eval_env.close()
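If 200000 steps turns out to be too few, one option is to resume training from the saved file instead of starting over. The following is a minimal sketch under that assumption: `continue_training` and its default arguments are hypothetical names of mine, while `PPO.load(..., env=...)` and `learn(..., reset_num_timesteps=False)` are existing Stable-Baselines3 calls. The result is not guaranteed to match a single uninterrupted 500000-step run.

```python
def continue_training(model_path="ppo_Reacher-v5_new",
                      extra_timesteps=300_000,
                      env_id="Reacher-v5"):
    """Load a saved PPO model, train it further, and save it again."""
    # Imported here so that merely defining the function stays lightweight.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make(env_id)
    # Passing env= re-attaches an environment to the loaded model.
    model = PPO.load(model_path, env=env)
    # reset_num_timesteps=False keeps the step counter running on
    # from the earlier training run instead of restarting at zero.
    model.learn(total_timesteps=extra_timesteps, reset_num_timesteps=False)
    model.save(model_path)
    env.close()
    return model
```

Calling `continue_training()` after train_6.py has finished would add 300000 more steps to the model saved there and overwrite the same file.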

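Because rendering starts automatically as soon as training finishes, it is easy to miss. I therefore also wrote a program that renders using the saved trained model, and saved it as test_6_disp.py.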

test_6_disp.py
import gymnasium as gym
from stable_baselines3 import PPO
import time

# =========================
# Load the trained model
# =========================
model = PPO.load("ppo_Reacher-v5_new")  # file name saved by train_6.py

# =========================
# Evaluation environment (with rendering)
# =========================
eval_env = gym.make(
    "Reacher-v5",
    render_mode="human",
    width=1400,
    height=900
)

# =========================
# Reset the environment
# =========================
obs, info = eval_env.reset()

# =========================
# Camera settings
# =========================
viewer = eval_env.unwrapped.mujoco_renderer.viewer

viewer.cam.distance = 0.7
viewer.cam.elevation = -20

viewer.cam.lookat[0] = 0
viewer.cam.lookat[1] = 0
viewer.cam.lookat[2] = 0

# =========================
# Evaluation loop
# =========================
episode_reward = 0.0

for step in range(300):

    # The trained model chooses an action
    action, _states = model.predict(
        obs,
        deterministic=True
    )

    # Advance the simulation by one step
    obs, reward, terminated, truncated, info = eval_env.step(action)

    # Accumulate the reward
    episode_reward += reward

    # Print the current state
    print(f"step = {step}")
    print("action =", action)
    print("reward =", reward)
    print("episode_reward =", episode_reward)
    print()

    # Slow down so the rendering is easy to follow
    time.sleep(0.1)

    # Check whether the episode has ended
    if terminated or truncated:

        print("Episode finished")
        print("Total episode reward =", episode_reward)
        print()

        obs, info = eval_env.reset()
        episode_reward = 0.0

# =========================
# Shut down
# =========================
eval_env.close()

The result looks like this.

armrobot.png

Conclusion

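Unless the amount of training is increased considerably, the arm robot does not stop precisely at the target position. It is harder than it looks.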
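To put a number on how precisely the arm stops, the distance left at the end of each episode can be collected over several episodes. This is a rough sketch, not part of the programs above: `final_distances` and `summarize` are hypothetical helpers, and the indices 8 and 9 again assume the Reacher-v5 observation layout in which the last two entries are the x and y components of (fingertip position - target position).

```python
import math
import statistics


def final_distances(env, predict, n_episodes=20):
    """Run episodes and return the fingertip-target distance at the end of each.

    `env` needs Gymnasium-style reset()/step() methods;
    `predict` maps an observation to an action.
    """
    distances = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        done = False
        while not done:
            obs, reward, terminated, truncated, info = env.step(predict(obs))
            done = terminated or truncated
        # Assumed layout: obs[8], obs[9] = fingertip minus target (x, y).
        distances.append(math.hypot(obs[8], obs[9]))
    return distances


def summarize(distances):
    """Return (mean, worst) final distance."""
    return statistics.mean(distances), max(distances)


# Example usage (needs gymnasium[mujoco] and stable-baselines3 installed):
#   env = gym.make("Reacher-v5")
#   model = PPO.load("ppo_Reacher-v5_new")
#   dists = final_distances(
#       env, lambda o: model.predict(o, deterministic=True)[0])
#   print(summarize(dists))
```

Comparing the summary for a model trained with 200000 steps against one trained with 500000 steps would show whether the extra training time actually tightens the final position.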
