Part 1: Quick Start

Posted at 2025-01-09

LeRobot: State-of-the-art AI for real-world robotics

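LeRobot provides state-of-the-art learning models, datasets, and tools for real-world robotics. It aims to lower the barrier to entry for people getting started with robotics AI, learning models, and datasets.
It focuses on imitation learning and reinforcement learning.
It provides up-to-date pretrained models, datasets collected from human demonstrations, and simulation environments. Simple real robots are also planned.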

Install

Create a virtual environment with conda:

git clone https://github.com/huggingface/lerobot.git
cd lerobot
conda create -y -n lerobot python=3.10
conda activate lerobot
pip install -e .

If the install fails with a build error, you may need to install cmake and build-essential:

sudo apt-get install cmake build-essential
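Before re-running the install, it can help to check which build tools are actually missing. The following is a minimal sketch, not part of LeRobot. It assumes the relevant tools are cmake plus gcc, g++, and make (which build-essential provides on Debian/Ubuntu); the function name and the tool list are illustrative.

```python
import shutil

# Assumption: these are the tools that matter for the build.
# gcc, g++, and make come with build-essential on Debian/Ubuntu.
REQUIRED_TOOLS = ("cmake", "gcc", "g++", "make")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All build tools found.")
```

If the list is empty and the install still fails, the cause is likely something other than the build toolchain.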

The following robots are available in simulation. Using them requires an additional install:

pip install -e ".[aloha, pusht, xarm]"

If you use wandb, log in:

wandb login

Walk through

The repository is organized as follows:

.
├── examples             # various example programs
|   └── advanced         # more advanced examples
├── lerobot
|   ├── configs          # configuration files; can be overridden from the command line
|   |   ├── default.yaml   # default: diffusion policy on the pusht environment
|   |   ├── env            # simulator environments and datasets: aloha.yaml, pusht.yaml, xarm.yaml
|   |   └── policy         # policy models: act.yaml, diffusion.yaml, tdmpc.yaml
|   ├── common           # contains classes and utilities
|   |   ├── datasets       # datasets collected by human operation: aloha, pusht, xarm
|   |   ├── envs           # simulator environments: aloha, pusht, xarm
|   |   ├── policies       # policies: act, diffusion, tdmpc
|   |   ├── robot_devices  # real devices: dynamixel motors, opencv cameras, koch robots
|   |   └── utils          # various utilities
|   └── scripts          # programs run from the command line
|       ├── eval.py                 # load a policy and evaluate it
|       ├── train.py                # training
|       ├── control_robot.py        # code for teleoperation
|       ├── push_dataset_to_hub.py  # convert a dataset to the LeRobot data format
|       └── visualize_dataset.py    # load a dataset and visualize it
├── outputs               # folder where results are written
└── tests                 # tests

Visualize datasets

You can visualize a dataset to check what data it contains.
For example, the following loads a dataset and visualizes one specific episode:

python lerobot/scripts/visualize_dataset.py --repo-id lerobot/aloha_static_coffee --episode-index 0

A UI opens and the data is visualized.

A dataset that has been downloaded locally can also be loaded. For example:

python lerobot/scripts/visualize_dataset.py \
    --repo-id lerobot/pusht \
    --root ./my_local_data_dir \
    --local-files-only 1 \
    --episode-index 0
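To look at several episodes in a row, the command line can be assembled programmatically. This is a minimal sketch using only the flags shown above; the helper name `visualize_command` is hypothetical, and the script itself is not invoked here, only the command string is printed.

```python
import shlex


def visualize_command(repo_id, episode_index, root=None):
    """Build the visualize_dataset.py command line as a list of arguments."""
    cmd = [
        "python", "lerobot/scripts/visualize_dataset.py",
        "--repo-id", repo_id,
        "--episode-index", str(episode_index),
    ]
    if root is not None:
        # Same flags as in the local-dataset command above.
        cmd += ["--root", root, "--local-files-only", "1"]
    return cmd


if __name__ == "__main__":
    # Print one command per episode; run them by hand or via subprocess.run.
    for episode in range(3):
        print(shlex.join(visualize_command("lerobot/pusht", episode)))
```

Keeping the arguments as a list means the same value can be passed to `subprocess.run` without shell quoting issues.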

`LeRobotDataset` format

It wraps a Hugging Face dataset and adds further information on top of it.

dataset attributes:
  ├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
  │  ├ observation.images.cam_high (VideoFrame):
  │  │   VideoFrame = {'path': path to a mp4 video, 'timestamp' (float32): timestamp in the video}
  │  ├ observation.state (list of float32): positions of the arm joints (for instance)
  │  ... (more observations)
  │  ├ action (list of float32): goal positions of the arm joints (for instance)
  │  ├ episode_index (int64): index of the episode for this sample
  │  ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
  │  ├ timestamp (float32): timestamp in the episode
  │  ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
  │  └ index (int64): general index in the whole dataset
  ├ episode_data_index: contains 2 tensors with the start and end indices of each episode
  │  ├ from (1D int64 tensor): first frame index for each episode — shape (num episodes,) starts with 0
  │  └ to (1D int64 tensor): last frame index for each episode — shape (num episodes,)
  ├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
  │  ├ observation.images.cam_high: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
  │  ...
  ├ info: a dictionary of metadata on the dataset
  │  ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
  │  ├ fps (float): frames per second the dataset is recorded/synchronized to
  │  ├ video (bool): indicates if frames are encoded in mp4 video files to save space or stored as png files
  │  └ encoding (dict): if video, this documents the main options that were used with ffmpeg to encode the videos
  ├ videos_dir (Path): where the mp4 videos or png images are stored/accessed
  └ camera_keys (list of string): the keys to access camera features in the item returned by the dataset (e.g. `["observation.images.cam_high", ...]`)
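To make the relation between the per-frame `episode_index` column and `episode_data_index` concrete, here is a minimal sketch on toy data. It uses plain Python lists instead of tensors and does not call LeRobot. One assumption to note: this sketch treats `to` as an exclusive end (one past the episode's last frame), whereas the description above says "last frame index"; check which convention your installed version uses.

```python
def episode_data_index(episode_indices):
    """Compute per-episode frame ranges from a per-frame episode_index column.

    Assumptions: the frames of each episode are contiguous, and `to` is an
    exclusive end (one past the episode's last frame).
    """
    starts, ends = [], []
    current = None
    for idx, ep in enumerate(episode_indices):
        if ep != current:
            if current is not None:
                ends.append(idx)  # close the previous episode
            starts.append(idx)    # open the new episode
            current = ep
    if starts:
        ends.append(len(episode_indices))  # close the final episode
    return {"from": starts, "to": ends}


# Toy column: episode 0 has 3 frames, episode 1 has 2 frames.
toy_episode_index = [0, 0, 0, 1, 1]
index = episode_data_index(toy_episode_index)
print(index)  # {'from': [0, 3], 'to': [3, 5]}

# Global frame indices belonging to episode 1:
ep = 1
frames = list(range(index["from"][ep], index["to"][ep]))
print(frames)  # [3, 4]
```

With this layout, slicing the flat dataset by `from[ep]:to[ep]` yields exactly the frames of one episode.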

How to use datasets will be covered in detail in Part 2.

Evaluate a pretrained policy

The eval script evaluates a pretrained model. For example:

python lerobot/scripts/eval.py -p lerobot/act_aloha_sim_transfer_cube_human eval.n_episodes=20 eval.batch_size=10

A model stored locally can also be specified with -p:

python lerobot/scripts/eval.py -p {OUTPUT_DIR}/checkpoints/last/pretrained_model
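Since a mistyped output directory only shows up once eval.py starts, it can be worth resolving the checkpoint path first. The sketch below builds the path pattern shown above and fails early if it does not exist. The helper name and the `checkpoint` parameter are hypothetical; only the `checkpoints/last/pretrained_model` layout comes from the command above.

```python
import tempfile
from pathlib import Path


def pretrained_model_dir(output_dir, checkpoint="last"):
    """Return <output_dir>/checkpoints/<checkpoint>/pretrained_model.

    Raises FileNotFoundError if that directory does not exist.
    """
    path = Path(output_dir) / "checkpoints" / checkpoint / "pretrained_model"
    if not path.is_dir():
        raise FileNotFoundError(f"No pretrained model at {path}")
    return path


if __name__ == "__main__":
    # Demonstrate on a throwaway directory that mimics the layout.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "checkpoints" / "last" / "pretrained_model").mkdir(parents=True)
        print(pretrained_model_dir(tmp))
```

The returned path is what would be passed to `-p`.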

Train your own policy

The train script trains a policy on your own environment and dataset. For example:

python lerobot/scripts/train.py policy=act env=aloha env.task=AlohaTransferCube-v0 dataset_repo_id=lerobot/aloha_sim_transfer_cube_human

To specify where the output is written, add

hydra.run.dir=path/to/output/dir

to the command.
To use wandb, add

wandb.enable=true

as well.
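The `key.subkey=value` arguments used above override entries in a nested configuration. The following is a simplified sketch of that idea on a plain dict. It is not Hydra's implementation: it ignores config groups (such as `policy=act` selecting a YAML file), and the toy config contents are made up for illustration.

```python
def parse_value(raw):
    """Convert a raw string to bool, int, or float when it looks like one."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_overrides(config, overrides):
    """Apply 'dotted.key=value' overrides to a nested dict, in place."""
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            # Create intermediate levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return config


# Toy config (hypothetical contents, for illustration only).
config = {"wandb": {"enable": False}, "eval": {"n_episodes": 50}}
apply_overrides(
    config,
    ["wandb.enable=true", "eval.n_episodes=20", "hydra.run.dir=path/to/output/dir"],
)
print(config)
```

Each dotted key walks one level deeper into the config, which is why `eval.n_episodes=20` changes only that single entry and leaves the rest of the defaults untouched.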

That covers the basic usage. From the next part on, we will go through the examples.
