About the Scheduler in the transformers Trainer

Posted at 2024-09-14

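Purpose of this article

I have been using Hugging Face's Trainer recently, but there are few articles explaining its learning rate schedulers, so I am writing this down as a memo.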

Version

transformers==4.39.3
torch==2.3.1

How to configure

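from transformers import TrainingArguments

TrainingArguments(
    output_dir="output",  # required in this version; the path is only a placeholder
    optim="adamw_torch",  # e.g. adamw_torch, adafactor, sgd, ...
    lr_scheduler_type="cosine",  # set one of the string values listed below
)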

The following values can be set for lr_scheduler_type:

linear
cosine
cosine_with_restarts
polynomial
constant
constant_with_warmup
inverse_sqrt
reduce_lr_on_plateau

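Scheduler descriptions

The names alone make it hard to tell which scheduler function each one corresponds to, so the mapping is noted here. Each name is followed by the function in transformers that builds the scheduler.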

linear

get_linear_schedule_with_warmup

cosine

get_cosine_schedule_with_warmup

cosine_with_restarts

get_cosine_with_hard_restarts_schedule_with_warmup

polynomial

get_polynomial_decay_schedule_with_warmup

constant

get_constant_schedule

constant_with_warmup

get_constant_schedule_with_warmup

inverse_sqrt

get_inverse_sqrt_schedule

reduce_lr_on_plateau

get_reduce_on_plateau_schedule
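
To make the shapes behind two of these names more concrete, here is a minimal pure-Python sketch of the learning rate multiplier as I understand `linear` and `constant_with_warmup` to behave. The function names `linear_multiplier` and `constant_with_warmup_multiplier` are hypothetical and this is not the library source; the effective learning rate would be the base learning rate times the returned value.

```python
def linear_multiplier(step, num_warmup_steps, num_training_steps):
    """Assumed shape of `linear`: ramp 0 -> 1 during warmup, then decay 1 -> 0."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


def constant_with_warmup_multiplier(step, num_warmup_steps):
    """Assumed shape of `constant_with_warmup`: ramp 0 -> 1, then stay at 1."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    return 1.0


if __name__ == "__main__":
    # Print both multipliers at a few steps of a 100-step run with 10 warmup steps.
    for step in (0, 5, 10, 55, 100):
        print(
            step,
            linear_multiplier(step, 10, 100),
            constant_with_warmup_multiplier(step, 10),
        )
```

Both schedules share the same warmup ramp; they differ only in what happens after the warmup ends.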

About the arguments

num_training_steps is supplied automatically inside Trainer, so you do not need to pass it yourself.

num_warmup_steps can be set through warmup_ratio in TrainingArguments, as shown below.

TrainingArguments(
    warmup_ratio=0.1
)
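
As a rough sketch of how the ratio turns into a step count: to my understanding, Trainer uses an explicit `warmup_steps` when it is greater than 0, and otherwise rounds `num_training_steps * warmup_ratio` up to the next integer. The helper `resolve_warmup_steps` below is a hypothetical name used only for illustration, not a function from transformers.

```python
import math


def resolve_warmup_steps(num_training_steps, warmup_ratio=0.0, warmup_steps=0):
    """Assumed rule: an explicit warmup_steps wins, otherwise ceil(total * ratio)."""
    if warmup_steps > 0:
        return warmup_steps
    return math.ceil(num_training_steps * warmup_ratio)


if __name__ == "__main__":
    # 1000 training steps with warmup_ratio=0.1
    print(resolve_warmup_steps(1000, warmup_ratio=0.1))
    # A fractional product is rounded up
    print(resolve_warmup_steps(95, warmup_ratio=0.1))
    # An explicit warmup_steps takes precedence over the ratio
    print(resolve_warmup_steps(1000, warmup_ratio=0.1, warmup_steps=50))
```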

Any other scheduler-specific arguments (for example, num_cycles for the cosine schedulers) are set through lr_scheduler_kwargs in TrainingArguments.

TrainingArguments(
    lr_scheduler_kwargs={"num_cycles": 4.5},
)
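
The snippet above does not say which scheduler `num_cycles` is paired with. The sketch below assumes `lr_scheduler_type="cosine"` and shows, in pure Python, how I understand `num_cycles` to change the multiplier: the default of 0.5 gives a single half-wave from 1 to 0, while 4.5 oscillates several times and still ends at 0. `cosine_multiplier` is a hypothetical name and this is not the library source.

```python
import math


def cosine_multiplier(step, num_warmup_steps, num_training_steps, num_cycles=0.5):
    """Assumed shape of `cosine`: linear warmup, then num_cycles cosine waves."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))


if __name__ == "__main__":
    # Compare the default half-wave with num_cycles=4.5 over a 100-step run.
    for step in range(10, 101, 10):
        default = cosine_multiplier(step, 10, 100)
        many = cosine_multiplier(step, 10, 100, num_cycles=4.5)
        print(f"step={step:3d} default={default:.3f} num_cycles=4.5 -> {many:.3f}")
```

With a half-integer value such as 4.5 the multiplier finishes at its minimum, whereas an integer value would finish back at the peak.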
