
SamurAI Coding!!

Last updated at 2018-07-24, posted at 2018-07-24

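I took part in SamurAI Coding, run by the Information Processing Society of Japan (情報処理学会), both last year and the year before.
This post is a review of sorts of last year's SamurAI Coding.
I am working every day toward a horse that can reach the goal quickly. (Mostly I just get stuck because there is so much I don't understand.)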

It is a game(?) in which you race toward a goal.

http://samuraicoding.info/index-jp.html
The link above is the contest homepage.

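The only thing a player can choose is the acceleration: each turn you issue +1, 0, or -1 for each of the x and y coordinates.
You win if you keep collisions down and reach the goal before your opponent.
This time I will consider the case with no opponent.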
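
As a rough sketch of what "choosing only the acceleration" means, the snippet below assumes the usual rule that the acceleration is added to the velocity and the velocity is then added to the position. The function name `step` and the tuple representation are my own assumptions for illustration; the actual contest rules and the `Player` class in this series may differ in the details.

```python
def step(position, velocity, acceleration):
    """Advance one turn: acceleration changes velocity, velocity moves the player.

    Assumed rule for illustration only (not taken from the contest code).
    """
    ax, ay = acceleration
    if ax not in (-1, 0, 1) or ay not in (-1, 0, 1):
        raise ValueError("each acceleration component must be -1, 0, or +1")
    vx, vy = velocity[0] + ax, velocity[1] + ay
    return (position[0] + vx, position[1] + vy), (vx, vy)


# Accelerate forward twice, then nudge to the right.
pos, vel = (5, 0), (0, 0)
for acc in [(0, 1), (0, 1), (1, 0)]:
    pos, vel = step(pos, vel, acc)
    print(pos, vel)
```

Because the velocity accumulates, a single +1 keeps moving the player every turn until it is cancelled, which is why braking matters near obstacles.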

SamurAI Coding provides base code, but since this is for study I will build everything myself, starting from the game system.

I wrote the main routine like this.

main.py

from course import Course
from py3_map import Map
from player import Player
from log import Log
from Network import DQNAgent
import copy
import numpy as np
import random
import json
import itertools

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

if __name__ == "__main__":
    finish_counter = 0
    epoch_size = 2

    for epoch in range(1,epoch_size):
        # initial setup
        course = Course()
        log = Log(course)    
        player_info = Player(course)        

        if finish_counter == 3:
            break  # stop after consecutive clears (the check uses 3; the original note said 10)
        
        # start the game
        for i in range(course.stepLimit):
            moves = []

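            # random action: collect the candidate accelerations
            for ax, ay in itertools.product(range(-1, 2), range(-1, 2)):
                next_pos, _, judge = player_info.next_state(np.array([int(ax), int(ay)]))
                # skip moves that leave the player in place or move it backward
                if ((next_pos[0] == player_info.x and next_pos[1] == player_info.y) or next_pos[1] < player_info.y):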
                    continue
                moves.append((ax, ay))
                
            if len(moves) > 0:  # at least one move avoids a collision
                ax, ay = random.choice(moves)
            else:  # every move collides: accelerate against the current velocity
                ax, ay = (-np.sign(player_info.vx), -np.sign(player_info.vy))
                    
                
            _,_,judge = player_info.next_state(np.array([int(ax),int(ay)]))
            log.add(i,player_info,ax,ay,judge)
            
            # update the player state
            player_info.update(course, np.array([int(ax),int(ay)]))

            if player_info.y >= course.length:
                finish_counter += 1
                break



        if course.stepLimit-1 == i:
            finish_counter = 0

        print("epoch : {}, step : {}, y : {}".format(epoch,i,player_info.y))

        with open('log', 'w') as f:
            json.dump(log.base, f, indent=4, cls=MyEncoder)

Let's go through the code piece by piece.

from course import Course
from py3_map import Map
from player import Player
from log import Log
from Network import DQNAgent
import copy
import numpy as np
import random
import json
import itertools

There are a lot of imports; I will explain them as we go.

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

When you call json.dump on data that contains a type JSON does not support, you get an error like this:
TypeError: Object of type 'int64' is not JSON serializable
This class is there to prevent that.
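
To see the failure and the fix side by side without needing NumPy installed, here is a minimal sketch. `FakeInt64` is a hypothetical stand-in for `np.int64`, and `DuckEncoder` is an illustrative variant of MyEncoder that relies on the `tolist()` method (which NumPy arrays and scalars also provide) instead of `isinstance` checks. Treat it as one possible approach, not the code used in this series.

```python
import json


class DuckEncoder(json.JSONEncoder):
    """Convert any object exposing tolist() before giving up."""

    def default(self, obj):
        if hasattr(obj, "tolist"):
            return obj.tolist()
        return super().default(obj)


class FakeInt64:
    """Hypothetical stand-in for np.int64 so the sketch runs without NumPy."""

    def __init__(self, value):
        self.value = value

    def tolist(self):
        return int(self.value)


def try_dump(obj, **kwargs):
    """Return the JSON text, or the error message if serialization fails."""
    try:
        return json.dumps(obj, **kwargs)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


record = {"x": FakeInt64(3)}
plain = try_dump(record)                   # the default encoder rejects the value
fixed = try_dump(record, cls=DuckEncoder)  # the custom encoder converts it
print(plain)
print(fixed)
```

The key point is the same as in MyEncoder: `default` is only called for objects the standard encoder cannot handle, so ordinary ints, floats, and lists are unaffected.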

if __name__ == "__main__":
    finish_counter = 0
    epoch_size = 2

    for epoch in range(1,epoch_size):
        course = Course()
        log = Log(course)    
        player_info = Player(course)        

        if finish_counter == 3:
            break  # stop after consecutive clears (the check uses 3; the original note said 10)

        for i in range(course.stepLimit):
            moves = []
            for ax, ay in itertools.product(range(-1, 2), range(-1, 2)):
                next_pos, _, judge = player_info.next_state(np.array([int(ax), int(ay)]))
                if ((next_pos[0] == player_info.x and next_pos[1] == player_info.y) or next_pos[1] < player_info.y):
                    continue
                moves.append((ax, ay))
                
            if len(moves) > 0:  # at least one move avoids a collision
                ax, ay = random.choice(moves)
            else:  # every move collides: accelerate against the current velocity
                ax, ay = (-np.sign(player_info.vx), -np.sign(player_info.vy))

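epoch_size : the number of trials (range(1, epoch_size) runs epoch_size - 1 epochs, so the value 2 means a single run)
finish_counter : counts consecutive clears; the run stops once it reaches the threshold (3 in the code, although the original note says 10)
Course, Log, and Player are defined as classes, so I will post them later!
The for ax, ay loop is the random-trial part.
Choosing purely at random would be a waste, so I pick at random only among the moves that do not collide.
If every move collides, the player accelerates in the opposite direction. (This was in the original code, lol.)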
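
The selection rule described above can be sketched on its own, without the Course and Player classes. Everything here is an assumption for illustration: `choose_acceleration`, `is_blocked`, and the simple "next = position + velocity + acceleration" prediction are hypothetical names and rules, and the collision test only looks at the destination cell rather than the whole path.

```python
import itertools
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def choose_acceleration(pos, vel, is_blocked, rng=random):
    """Pick a random acceleration among the moves that make progress.

    is_blocked(x, y) is a hypothetical stand-in for the course's collision test.
    """
    moves = []
    for ax, ay in itertools.product(range(-1, 2), repeat=2):
        nx = pos[0] + vel[0] + ax
        ny = pos[1] + vel[1] + ay
        if is_blocked(nx, ny):
            continue  # the destination cell is an obstacle
        if (nx, ny) == tuple(pos) or ny < pos[1]:
            continue  # standing still or going backward
        moves.append((ax, ay))
    if moves:
        return rng.choice(moves)
    # No usable move: brake by accelerating against the current velocity.
    return (-sign(vel[0]), -sign(vel[1]))


def open_field(x, y):
    return False


def walled_in(x, y):
    return True


print(choose_acceleration((0, 0), (0, 0), open_field, random.Random(0)))
print(choose_acceleration((0, 0), (2, -1), walled_in))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing logs between trials.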

                
            _,_,judge = player_info.next_state(np.array([int(ax),int(ay)]))
            log.add(i,player_info,ax,ay,judge)
            
            player_info.update(course, np.array([int(ax),int(ay)]))

            if player_info.y >= course.length:
                finish_counter += 1
                break

        if course.stepLimit-1 == i:
            finish_counter = 0

        print("epoch : {}, step : {}, y : {}".format(epoch,i,player_info.y))

        with open('log', 'w') as f:
            json.dump(log.base, f, indent=4, cls=MyEncoder)

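This is the second half. Now that the acceleration has been decided, the player moves.
player_info.next_state does not do much this time. I wrote it so that later, when I plug in an AI, the reward can be decided by whether or not a collision occurs.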
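
To give an idea of how a collision flag could feed into a reward later on, here is a purely hypothetical sketch. The function name `reward`, its arguments, and all the constants are my own assumptions; nothing here is taken from the contest code or from the Player class in this series.

```python
def reward(collided, y_before, y_after, goal_y):
    """Hypothetical reward shaping; every constant is illustrative only."""
    if collided:
        return -1.0   # penalize hitting an obstacle
    if y_after >= goal_y:
        return 10.0   # bonus for crossing the goal line
    return 0.1 * (y_after - y_before)  # small reward for forward progress


print(reward(True, 0, 0, 100))
print(reward(False, 98, 101, 100))
```

Keeping the collision penalty separate from the progress term makes it easy to tune how cautious the agent should be.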

player_info.update moves the player using the acceleration and the rest of its state.

I wrote this up quickly, but putting it into words was hard work.
Next time I will look at the player, course, and log classes.

On the SamurAI Coding site, the software section (downloadable via Git, among other ways) should include a viewer.
It will be used later for the logs, so please download it if you like!

I will write the files to match the viewer's format.

(That was a postscript, lol.)
