1
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

Video Representation Learning by Dense Predictive Coding

Posted at

  • 自己教師あり学習でビデオから時空間埋め込みを学習。人間の動作認識
  • Dense Predictive Coding という手法を提案
    recurrently predicting future representation
  • カリキュラムトレーニング
  • Kinetics 400 で事前学習してaction recognitionで評価。HMDB51で35.7%

ネットワーク

image.png

  • 2.5秒分学習して1.5秒分の埋め込みを予測。
  • fでxを隠れ変数z_t に変換、その後集計関数gでc_tに変換。
  • z_t = f(x_t), c_t = g(z_1, z_2, .., z_t)
  • 予測関数Φでc_tからz^_t+1を予測

Contrastive loss

  • NCE Noise Contrastive Estimation
  • z^とzをdot product で比較。コサイン距離?
  • 3つのnegative pairs
  • Easy Negative - 別のビデオから
  • Spatial Negative - 同じビデオの空間的に異なる場所
  • Temporal Negative - 同じビデオの同じ場所の、時間的に異なる場所

カリキュラムトレーニング

予測するフレーム数をだんだん増やして遠くまで予測するようにする。

1
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?