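Overview

I wanted to build something with ML-Agents, the machine learning library released by Unity.
This is the result.
On the left is a plain RagDoll with nothing applied, which collapses to the ground under gravity. On the right is a RagDoll trained with machine learning not to fall over, which settles into a stable posture bent forward at the waist. It looks more like a bow than an idle motion, but I'll call it good enough.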

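Goal

A common demo in machine learning and reinforcement learning gives a humanoid model a set of conditions and has it learn a walking motion.
Google DeepMind publishes a paper proposing a reinforcement learning approach in which a computer teaches itself the best way (with flexible movement) for a humanoid to reach point B
I wanted to see whether the same thing could be done with this library. As a first step, I made an idle motion, which I assumed would be the easiest case.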

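Installing the machine learning library

The following article is easy to follow, so I set things up by referring to it.
Trying out the machine learning library Unity released (with Japanese translation)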

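Defining the idle motion

The subject is a humanoid model in Unity with a RagDoll set up on it. Physics runs on each body part through its Rigidbody, so if nothing is done the model crumples like a marionette with its strings cut. For the RagDoll setup I referred to the following article.
How to make "Look, the people are like garbage" (RagDoll with Unity)

Here, the idle motion is defined as movement in which machine learning drives the joints of this RagDoll so that it stays standing instead of collapsing.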

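Experimental conditions

These are the items I configured for ML-Agents from the Unity scripts.

Agent action definition
- The actions are the rotations of the body parts set up in the RagDoll, like turning the joints. For now every joint is included. There are 12 parts with 3 rotation values each (Euler angles x, y, z), giving 36 float parameters in total. In the code, each triple is applied as the angular velocity of the part's Rigidbody.
Agent state definition
- The position (relative to the floor) and velocity of each part's Rigidbody. Position and velocity each have 3 axes (x, y, z), so 12 x 6 = 72 float parameters.
Reward conditions
- The goal is not to fall over, so falling (a body part other than the feet touching the ground; the code also exempts the lower legs) gives a reward of -1.
- The goal is to stay in place, so moving beyond a certain range from the Ragdoll's initial position gives a reward of -1.
- In all other cases the reward is 0.1.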
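
The reward conditions above can be restated as one small function. This is a minimal sketch in plain C# with no Unity dependency, not the author's implementation: `StepResult`, `IdleReward`, and the parameter names are hypothetical, and only the values -1 and 0.1 come from the list. Ending the episode on a penalty is an assumption that mirrors how `AgentStep` sets `done`.

```csharp
using System;

// Hypothetical types for illustration; they are not part of ML-Agents.
public struct StepResult
{
    public readonly float Reward;
    public readonly bool Done;

    public StepResult(float reward, bool done)
    {
        Reward = reward;
        Done = done;
    }
}

public static class IdleReward
{
    public const float Penalty = -1f;        // fell over, or left the allowed area
    public const float SurvivalBonus = 0.1f; // still standing inside the area

    // hasFallen: a body part other than the feet touched the floor.
    // leftArea:  the ragdoll moved beyond the allowed range from its start position.
    public static StepResult Evaluate(bool hasFallen, bool leftArea)
    {
        if (hasFallen || leftArea)
        {
            // Either failure gives the penalty and ends the episode.
            return new StepResult(Penalty, true);
        }

        // Otherwise the agent earns a small reward for every step it survives.
        return new StepResult(SurvivalBonus, false);
    }
}
```

For example, `IdleReward.Evaluate(false, false)` gives a reward of 0.1 and leaves the episode running, so the total reward grows with the number of steps the ragdoll stays upright.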

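Source code

Here is the source code I wrote, along with the Inspector settings. I wrote it with reference to the ML-Agents samples.
HumanAgent is the Agent, and SetRagdoll is the class that controls the ragdoll.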

using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class HumanAgent : Agent
{

    [SerializeField] SetRagdoll _ragdoll;

    /// <summary>
    /// The agent's state: position and velocity of each controlled bone.
    /// </summary>
    /// <returns>The state.</returns>
    public override List<float> CollectState ()
    {
        List<float> state = new List<float> ();
        for (int index = 0; index < 12; index++) {
            var rigit = _ragdoll.GetBone (index);
            state.Add (rigit.transform.position.x);
            state.Add (rigit.transform.position.y);
            state.Add (rigit.transform.position.z);
            state.Add (rigit.velocity.x);
            state.Add (rigit.velocity.y);
            state.Add (rigit.velocity.z);
        }
        return state;
    }

    // Defines the agent's actions. The reward conditions are also written here.
    public override void AgentStep (float[] act)
    {

        if (brain.brainParameters.actionSpaceType == StateType.continuous) {
            // Each controlled bone consumes three consecutive values (x, y, z).
            for (int index = 0; index + 2 < act.Length; index += 3) {
                var angularVelocity = new Vector3 (act [index], act [index + 1], act [index + 2]);
                _ragdoll.SetRotation (index / 3, angularVelocity);
            }
        }
        var cols = Physics.OverlapSphere (this.transform.position, 0.2f);
        var isInArea = false;
        for (int index = 0; index < cols.Length; index++) {
            if (cols [index].gameObject.layer == LayerMask.NameToLayer ("Floor")) {
                isInArea = true;
            }
        }
        if (done == false) {
            if (!isInArea) {
                done = true;
                reward = -1f;
            } else {
                if (_ragdoll.IsCollisionFloorReactiveProperty.Value) {
                    done = true;
                    reward = -1f;
                } else {
                    reward = 0.1f;
                }
            }
        }


    }

    // Reset handling when the agent's episode ends. Restores the initial state.
    public override void AgentReset ()
    {
        Debug.Log ("AgentReset");

        for (int index = 0; index < _ragdoll.allrigits.Length; index++) {
            var rigit = _ragdoll.allrigits [index];
            rigit.isKinematic = true;
            rigit.transform.localRotation = _ragdoll._boneTransformDict [index].rotation;
            rigit.transform.localPosition = _ragdoll._boneTransformDict [index].position;
            rigit.isKinematic = false;
        }

        _ragdoll.transform.localPosition = new Vector3 (0, 0f, 0);
        _ragdoll.IsCollisionFloorReactiveProperty.Value = false;
    }
}

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using System.Linq;
using UniRx;
using UniRx.Triggers;
using System;

/// <summary>
/// Controls the ragdoll.
/// </summary>
public class SetRagdoll : MonoBehaviour
{

    public BoolReactiveProperty IsCollisionFloorReactiveProperty = new BoolReactiveProperty (false);

    public Rigidbody[] allrigits {
        get {
            return _allRigits;
        }
    }

    [SerializeField] Animator _animator;
    [SerializeField] float _drag = 10f;
    [SerializeField] HumanBodyBones _targetBone;
    [SerializeField] Rigidbody[] _controllRigits;
    [SerializeField] Rigidbody[] _allRigits;
    [SerializeField] Rigidbody _floor;

    public struct RigitTransform
    {
        public Vector3 position;
        public Quaternion rotation;

        public RigitTransform (Vector3 position, Quaternion rotation)
        {
            this.position = position;
            this.rotation = rotation;
        }
    }

    public Dictionary<int, RigitTransform> _boneTransformDict = new Dictionary<int, RigitTransform> ();

    void Awake ()
    {

        var leftFoot = _animator.GetBoneTransform (HumanBodyBones.LeftFoot).GetComponent<Rigidbody> ();
        var rightFoot = _animator.GetBoneTransform (HumanBodyBones.RightFoot).GetComponent<Rigidbody> ();
        var leftleg = _animator.GetBoneTransform (HumanBodyBones.LeftLowerLeg).GetComponent<Rigidbody> ();
        var rightleg = _animator.GetBoneTransform (HumanBodyBones.RightLowerLeg).GetComponent<Rigidbody> ();

        _floor.OnCollisionEnterAsObservable ()
            .Where (p => p.rigidbody != leftFoot)
            .Where (p => p.rigidbody != rightFoot)
            .Where (p => p.rigidbody != leftleg)
            .Where (p => p.rigidbody != rightleg)
            .Subscribe (p => {
            IsCollisionFloorReactiveProperty.Value = true;
        });

        _allRigits = this.GetComponentsInChildren<Rigidbody> ();
        for (int index = 0; index < _allRigits.Length; index++) {
            _boneTransformDict [index] = new RigitTransform (_allRigits [index].transform.localPosition, _allRigits [index].transform.localRotation);
        }

    }

    public Rigidbody GetBone (int index)
    {
        return _controllRigits [index];
    }


    public void SetRotation (int index, Vector3 euler)
    {
        var rigit = _controllRigits [index];

        var joint = rigit.GetComponent<ConfigurableJoint> ();
        if (joint == null) {
            return;
        }
        rigit.angularVelocity = euler;
    }
}

Set the state size (72) and the action size (36) on the Brain.
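
As a sanity check for the two sizes, the sketch below derives them from the bone count and shows how a flat action array maps to one (x, y, z) triple per bone. It is plain C# with hypothetical names (`BrainSizes`, `SplitActions`) and does not call any ML-Agents API; the counts themselves come from the experimental conditions and from `CollectState`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; none of these names exist in ML-Agents.
public static class BrainSizes
{
    public const int BoneCount = 12;      // controlled body parts, as in CollectState
    public const int ValuesPerState = 6;  // position (x, y, z) + velocity (x, y, z)
    public const int ValuesPerAction = 3; // one value per axis (x, y, z)

    public const int StateSize = BoneCount * ValuesPerState;   // 72
    public const int ActionSize = BoneCount * ValuesPerAction; // 36

    // Groups a flat action array into one (x, y, z) triple per bone.
    public static List<float[]> SplitActions(float[] act)
    {
        if (act.Length != ActionSize)
        {
            throw new ArgumentException(
                "Expected " + ActionSize + " action values, got " + act.Length);
        }

        var triples = new List<float[]>(BoneCount);
        for (int i = 0; i < act.Length; i += ValuesPerAction)
        {
            triples.Add(new[] { act[i], act[i + 1], act[i + 2] });
        }
        return triples;
    }
}
```

If the sizes entered on the Brain differ from what the scripts produce or consume, the mismatch shows up as wrong-length arrays, so a length check like the one in `SplitActions` is a cheap way to catch it early.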

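Summary

I used ML-Agents to make an idle motion. As long as you know how to use it, ML-Agents is a great SDK that lets you apply machine learning without a deep understanding of TensorFlow. It also requires very little source code.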
