
I wanted to think up a reward function for AWS DeepRacer

Posted at 2023-06-27

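Introduction

I wanted to jot down the ideas that came to mind for a DeepRacer reward function, so here they are...

I wanted the car to tear around the track at blazing speed without worrying much about things like the inside line.

A reward that values speed over a precise racing line probably feels better for the car too.
Isn't a top speed of 4 meters per second ridiculously fast?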

爆速で走りたし.py ("I want to run blazing fast")
import math

def reward_function(params):
  # Read input variables
  waypoints = params['waypoints']
  closest_waypoints = params['closest_waypoints']
  heading = params['heading']
  speed = params['speed']
  all_wheels_on_track = params['all_wheels_on_track']
  is_left_of_center = params['is_left_of_center']
  half_track_width = params['track_width'] / 2.0
  distance_from_center = params['distance_from_center']

  reward = 1.0

  # Waypoints just ahead of and just behind the car
  next_point = waypoints[closest_waypoints[1]]
  prev_point = waypoints[closest_waypoints[0]]

  # Direction of the track between those two waypoints, in degrees
  track_direction = math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0])
  track_direction = math.degrees(track_direction)

  # Gap between track direction and car heading, folded into 0-180 degrees
  direction_diff = abs(track_direction - heading)
  if direction_diff > 180:
    direction_diff = 360 - direction_diff

  DIRECTION_THRESHOLD_5 = 5.0
  DIRECTION_THRESHOLD_10 = 10.0
  DIRECTION_THRESHOLD_15 = 15.0
  DIRECTION_THRESHOLD_20 = 20.0
  DIRECTION_THRESHOLD_25 = 25.0

  # The faster the better, right? _(:3 」∠)_
  reward *= speed / 4.0

  # Apparently you lose points when the heading drifts from the track direction (´・ω・`)
  if direction_diff <= DIRECTION_THRESHOLD_5:
    reward *= 1.0
  elif direction_diff <= DIRECTION_THRESHOLD_10:
    reward *= 0.8
  elif direction_diff <= DIRECTION_THRESHOLD_15:
    reward *= 0.6
  elif direction_diff <= DIRECTION_THRESHOLD_20:
    reward *= 0.4
  elif direction_diff <= DIRECTION_THRESHOLD_25:
    reward *= 0.2
  else:
    reward *= 0.1
  # On the inside (treated here as left of center), poking out a little is fine (*'ω'*)
  # On the outside, never leave the track ( ゚Д゚)
  if (is_left_of_center and distance_from_center > half_track_width * 1.2) or (not is_left_of_center and not all_wheels_on_track):
    reward *= 1e-3
  return float(reward)

speed is divided by 4 because, as far as I remember, the car's maximum speed can be set as high as 4 m/s.
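Dividing by 4 amounts to scaling the speed into the 0 to 1 range. Here is a minimal sketch of the same idea with the limit pulled out into a constant, assuming the 4 m/s ceiling matches the top speed configured in your action space. `MAX_SPEED` and `normalized_speed` are hypothetical names introduced for illustration, not part of the DeepRacer API.

```python
MAX_SPEED = 4.0  # assumption: top speed configured in the action space, in m/s


def normalized_speed(speed, max_speed=MAX_SPEED):
  """Scale speed into the range 0.0-1.0, clamping anything outside it."""
  return min(max(speed / max_speed, 0.0), 1.0)
```

If the action space actually tops out at 2 m/s, dividing by 4 caps the speed factor at 0.5, so matching the constant to the real limit keeps the full range available.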

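One that should attack the inside line

This one restricts where the car can drive quite heavily so that it attacks the inside line.
A memorable piece that I wrote while thinking it probably wouldn't work.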

コーナーで差を付けたかった.py ("I wanted to pull ahead in the corners")
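# A little machine that races along the very edge of the inside line as fast as it can (`・ω・´)
# The driving looks like it will get cramped, though. It might tip over partway... (; ・`д・´)
# Note: DeepRacer expects the entry point to be named reward_function, so rename before use.
def reward_function2(params):
  is_left_of_center = params['is_left_of_center']
  speed = params['speed']
  distance_from_center = params['distance_from_center']
  half_track_width = params['track_width'] / 2.0
  all_wheels_on_track = params['all_wheels_on_track']
  reward = speed / 4.0
  # Stay on the outside, away from the center line! (# ゚Д゚)
  # These are separate ifs, so every check that holds applies its multiplier.
  if distance_from_center > half_track_width * 1.0:
    reward *= 1.0  # no change; kept to show the top tier
  if distance_from_center < half_track_width * 0.8:
    reward *= 0.8
  if distance_from_center < half_track_width * 0.6:
    reward *= 0.6
  if distance_from_center < half_track_width * 0.4:
    reward *= 0.4
  if distance_from_center < half_track_width * 0.2:
    reward *= 0.2
  # Stay on the left! (# ゚Д゚)
  if not is_left_of_center:
    reward *= 1e-3
  # On the inside (treated here as left of center), poking out a little is fine (*'ω'*)
  # On the outside, never leave the track ( ゚Д゚)
  if (is_left_of_center and distance_from_center > half_track_width * 1.2) or (not is_left_of_center and not all_wheels_on_track):
    reward *= 1e-3
  return float(reward)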
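Because the distance checks are separate `if` statements rather than an `elif` chain, every condition that holds applies its multiplier, so the penalties stack as the car gets closer to the center line. A small sketch that reproduces only that part; `distance_factor` and `width_ref` are hypothetical names introduced for illustration, with `width_ref` standing for whatever width the thresholds are measured against.

```python
def distance_factor(distance_from_center, width_ref):
  """Combined multiplier from the successive distance checks."""
  factor = 1.0
  # Each check multiplies by its own threshold value, and they accumulate.
  for threshold in (0.8, 0.6, 0.4, 0.2):
    if distance_from_center < width_ref * threshold:
      factor *= threshold
  return factor
```

At half the reference width the factor is 0.8 × 0.6 = 0.48, and within the innermost 20% it drops to 0.8 × 0.6 × 0.4 × 0.2 = 0.0384. If a single multiplier per band is what you want, an `elif` chain ordered from the smallest threshold upward would give that instead.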

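In closing

I've thought these through for now, but the source code is pure imagination, so I won't know how it behaves until I actually train with it.
This is all pre-experiment, so if there are mistakes or it fails to learn properly, that's not my fault!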
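Short of real training, a reward function can at least be called locally with hand-made inputs to catch obvious slips. A rough sketch: the dictionary keys are the ones the code in this post reads, every value is made up, and `sample_params`, `speed_only_reward`, and `check_monotonic_in_speed` are hypothetical helpers rather than anything DeepRacer provides.

```python
def sample_params(**overrides):
  """Build a hand-made params dict; all values are invented for illustration."""
  params = {
    'waypoints': [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    'closest_waypoints': [0, 1],
    'heading': 0.0,
    'speed': 4.0,
    'all_wheels_on_track': True,
    'is_left_of_center': True,
    'track_width': 1.0,
    'distance_from_center': 0.0,
  }
  params.update(overrides)
  return params


def speed_only_reward(params):
  """Stand-in reward function so this sketch runs on its own."""
  return float(params['speed'] / 4.0)


def check_monotonic_in_speed(reward_fn):
  """Return True if going faster never lowers the reward, all else equal."""
  speeds = [1.0, 2.0, 3.0, 4.0]
  rewards = [reward_fn(sample_params(speed=s)) for s in speeds]
  return all(a <= b for a, b in zip(rewards, rewards[1:]))
```

Passing one of the reward functions from this post in place of `speed_only_reward` should surface things like a misspelled key or a wrong function signature before any training time is spent. It says nothing about whether the model will actually learn well.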

References

This post draws in part on the references below. They were a great help.
