TensorFlow on Windows, Part 12

Last updated at 2017-05-07, posted at 2017-05-05

Overview

I tried OpenAI in a TensorFlow environment on Windows.
I looked into the classic control environments of OpenAI Gym.

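The 'CartPole-v0' case

Move the cart so that the pole does not fall over.
There are 4 observations:
x, x_dot, theta, theta_dot
The reward is 1 per step.
There are 2 actions:
0, 1
The episode is cut off at 200 steps; the reward threshold is 195.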
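The four observations are enough for a simple hand-written controller. The following is a minimal sketch, not part of the original article: `cartpole_action` is a hypothetical helper, and it assumes the usual CartPole convention that action 0 pushes the cart left, action 1 pushes it right, and a positive theta means the pole leans to the right.

```python
def cartpole_action(observation):
    """Pick a CartPole action from (x, x_dot, theta, theta_dot)."""
    x, x_dot, theta, theta_dot = observation
    # Push toward the side the pole is leaning or rotating to.
    # Assumption: action 1 pushes right, action 0 pushes left.
    return 1 if theta + theta_dot > 0 else 0


print(cartpole_action((0.0, 0.0, 0.05, 0.0)))   # pole leans right
print(cartpole_action((0.0, 0.0, -0.05, 0.0)))  # pole leans left
```

The rule ignores the cart position, so it is only a starting point for experiments, not a tuned controller.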

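The 'CartPole-v1' case

Move the cart so that the pole does not fall over.
There are 4 observations:
x, x_dot, theta, theta_dot
The reward is 1 per step.
There are 2 actions:
0, 1
The episode is cut off at 500 steps; the reward threshold is 475.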
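The two CartPole versions differ only in the step limit and the reward threshold. A minimal sketch of how such a threshold is typically checked, assuming the commonly cited rule that an environment counts as solved once the mean total reward over 100 consecutive episodes reaches the threshold. `is_solved` and the 100-episode window are assumptions here, not something the article states.

```python
def is_solved(episode_rewards, threshold, window=100):
    """Return True if the mean of the last `window` episode rewards meets the threshold."""
    if len(episode_rewards) < window:
        return False  # not enough episodes yet
    recent = episode_rewards[-window:]
    return sum(recent) / float(window) >= threshold


# Thresholds taken from the two CartPole lists: 195 for v0, 475 for v1.
print(is_solved([200.0] * 100, 195))
print(is_solved([200.0] * 100, 475))
```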

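The 'MountainCar-v0' case

A car climbs a hill.
There are 2 observations:
position, velocity
The reward is -1 per step.
There are 3 actions:
0, 1, 2
The episode is cut off at 200 steps; the reward threshold is -110.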
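With a reward of -1 per step, a threshold of -110 corresponds to reaching the top in about 110 steps on average. A well-known heuristic for this task is to push in the direction the car is already moving, which builds up momentum. The sketch below is an illustration only: `mountain_car_action` is a hypothetical helper, and it assumes that action 0 pushes left, 1 does nothing, and 2 pushes right.

```python
def mountain_car_action(observation):
    """Pick a MountainCar action from (position, velocity)."""
    position, velocity = observation
    # Assumption: 0 = push left, 1 = no push, 2 = push right.
    if velocity < 0:
        return 0  # moving left: keep pushing left to gain height
    return 2      # moving right or at rest: push right


print(mountain_car_action((-0.5, 0.01)))
print(mountain_car_action((-0.5, -0.01)))
```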

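The 'MountainCarContinuous-v0' case

A car climbs a hill.
There are 2 observations:
position, velocity
The reward is +100 for reaching the goal, minus 0.1 * action^2 at every step.
There is 1 action (continuous):
-1 : 1
The episode is cut off at 999 steps; the reward threshold is 90.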

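The 'Pendulum-v0' case

Hold the pendulum upright, like a handstand on a horizontal bar.
There are 3 observations:
cos(theta), sin(theta), thetadot
The reward is a negative cost driven mainly by the angle from upright.
There is 1 action (continuous):
-2 : 2
The episode is cut off at 200 steps.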
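Because the observation carries cos(theta) and sin(theta) rather than theta itself, the angle has to be recovered before it can be used directly, and any torque has to stay inside the -2 to 2 range. A minimal sketch using only the standard library; `angle_from_observation` and `clip_torque` are hypothetical helper names.

```python
import math


def angle_from_observation(observation):
    """Recover theta in (-pi, pi] from (cos(theta), sin(theta), thetadot)."""
    cos_theta, sin_theta, _theta_dot = observation
    return math.atan2(sin_theta, cos_theta)


def clip_torque(torque, limit=2.0):
    """Clamp a torque command to the action range [-limit, limit]."""
    return max(-limit, min(limit, torque))


obs = (math.cos(0.5), math.sin(0.5), 0.0)
print(angle_from_observation(obs))
print(clip_torque(3.7))
```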

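The 'Acrobot-v1' case

Swing up, like a giant swing on a horizontal bar.
There are 6 observations:
cos(s[0]), sin(s[0]), cos(s[1]), sin(s[1]), s[2], s[3]
The reward is -1 per step.
There are 3 actions:
0, 1, 2
The episode is cut off at 500 steps.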
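The six observations are derived from a four-element internal state s. The sketch below spells out that mapping, assuming s holds the two joint angles followed by their two angular velocities; `observation_from_state` is a hypothetical name used for illustration.

```python
import math


def observation_from_state(s):
    """Map s = (theta1, theta2, theta1_dot, theta2_dot) to the 6 observations."""
    return (
        math.cos(s[0]), math.sin(s[0]),  # first joint angle
        math.cos(s[1]), math.sin(s[1]),  # second joint angle
        s[2], s[3],                      # angular velocities, passed through
    )


print(observation_from_state((0.0, 0.0, 1.0, -1.0)))
```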

Sample code

from __future__ import print_function
import gym

# Leave exactly one gym.make line uncommented to choose the environment.
# env = gym.make('CartPole-v0')
# env = gym.make('CartPole-v1')
# env = gym.make('MountainCar-v0')
# env = gym.make('MountainCarContinuous-v0')
# env = gym.make('Pendulum-v0')
env = gym.make('Acrobot-v1')
# Print the basic properties of the selected environment.
print("id is ", env.spec.id)
print("observation is ", env.observation_space)
print("action is ", env.action_space)
print("act is ", env.action_space.sample())  # one random action
print("fps is ", env.metadata.get('video.frames_per_second'))
print("step is ", env.spec.max_episode_steps)
print("threshold is ", env.spec.reward_threshold)
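The properties printed above describe an environment, but the step limit and reward only come into play when an episode is actually run. The following is a rough sketch of such a loop, assuming the Gym API of that period, in which `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`; newer releases differ. So that it runs without Gym installed, it uses `CountdownEnv`, a hypothetical stand-in that gives -1 per step and ends after a fixed number of steps. `run_episode` is likewise an illustrative name.

```python
import random


class CountdownEnv(object):
    """Hypothetical stand-in for a Gym environment (old reset/step API)."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return (0.0,)

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.max_steps
        # observation, reward, done, info
        return (float(self.steps),), -1.0, done, {}


def run_episode(env, policy):
    """Run one episode and return (total_reward, number_of_steps)."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


total, steps = run_episode(CountdownEnv(max_steps=5),
                           lambda obs: random.choice([0, 1, 2]))
print(total, steps)
```

Under the same API assumption, an environment created with `gym.make` could be passed in place of `CountdownEnv`, with a policy that calls `env.action_space.sample()`.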
