
A Guide to BodyPix and PoseNet

Last updated at 2020-12-28, posted at 2020-06-22

What are BodyPix and PoseNet?

BodyPix is open-source software that runs on TensorFlow.js and segments the human body into 24 parts.
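To make the idea of a 24-part segmentation concrete, here is a minimal sketch in plain Python. It assumes the part segmentation is available as a flat, row-major list of per-pixel part ids, with 0 to 23 for body parts and -1 for background; the helper names `person_mask` and `pixels_per_part` and the toy data are hypothetical and not part of BodyPix.

```python
from collections import Counter
from typing import List

BACKGROUND = -1  # assumed id for pixels that belong to no person
NUM_PARTS = 24   # part ids are assumed to run from 0 to 23


def person_mask(part_ids: List[int]) -> List[int]:
    """Collapse a per-pixel part map into a binary person/background mask."""
    return [0 if p == BACKGROUND else 1 for p in part_ids]


def pixels_per_part(part_ids: List[int]) -> Counter:
    """Count how many pixels were assigned to each body part."""
    return Counter(p for p in part_ids if p != BACKGROUND)


# Toy 2x4 part map (hypothetical values), stored row by row.
part_ids = [-1, 0, 0, 1,
            -1, 12, 12, 12]

mask = person_mask(part_ids)
counts = pixels_per_part(part_ids)
print(mask)
print(dict(counts))
```

A binary mask like this is what a background-replacement effect would use, while the per-part counts show how the same output also carries part-level information.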

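In addition to segmentation, it can perform pose estimation (PoseNet) at the same time. It is developed by the same Google group that developed PoseNet, the two share parts of their source code, and merging the shared code appears to be under consideration: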
Add BodyPix ResNet model #280

Introductory articles

[Updated] BodyPix: Real-time Person Segmentation in the Browser with TensorFlow.js(Nov. 18, 2019)
Introducing BodyPix: Real-time Person Segmentation in the Browser with TensorFlow.js(Feb 16, 2019)
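[Overview of BodyPix: real-time person segmentation with the browser and TensorFlow.js](https://developers-jp.googleblog.com/2019/04/bodypix-tensorflowjs.html) (Japanese translation of the article one line above)
"BodyPix" extracts people from live video in real time and recognizes them part by part (Gigazine)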

How do you use BodyPix and PoseNet?

Trying it out quickly

You can try it on a PC or smartphone with a camera. Live demo

Application examples (BodyPix, PoseNet)

How does it work?

References

[1] G. Papandreou, T. Zhu, L.-C. Chen, S. Gidaris, J. Tompson, and K. Murphy, “PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model,” Lecture Notes in Computer Science, vol. 11218, pp. 282–299, Mar. 2018.

[2] G. Papandreou et al., “Towards accurate multi-person pose estimation in the wild,” in Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, 2017.

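Notes on the references

Source: Fig. 1 of reference [1]

For a single input image ('Input image'), a convolutional neural network (CNN) simultaneously outputs the data needed for both the pose estimation result ('Detected Human Poses') and the segmentation result ('Instance segmentation'). The distinguishing feature of this system is that two different tasks, segmentation and pose estimation, are realized with a shared network module, the CNN. Perhaps this counts as multi-task learning?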

(Pose Estimation Module)
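The CNN is MobileNet or ResNet, both of which are used for image classification, with the final classification layer removed; its output is passed to five networks. Networks (1) 'Heatmaps', (2) 'Short-range offsets', and (3) 'Mid-range offsets' are used for pose estimation. 'Heatmaps' estimates each body part (keypoint), 'Short-range offsets' refines the position of each part within the image, and 'Mid-range offsets' estimates the location between parts (for example, the right upper arm) and is used to connect the joints.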
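The interplay of 'Heatmaps' and 'Short-range offsets' can be sketched as follows. This is a minimal illustration in plain Python, not the BodyPix or PoseNet source: the function name `decode_keypoint`, the `output_stride` of 16, and the toy grids are assumptions made for the example. The sketch picks the heatmap cell with the highest score and refines that coarse position with the cell's short-range offset.

```python
from typing import List, Tuple

Grid = List[List[float]]


def decode_keypoint(
    heatmap: Grid,
    offset_y: Grid,
    offset_x: Grid,
    output_stride: int = 16,
) -> Tuple[float, float, float]:
    """Return (y, x, score) of one keypoint in input-image pixels.

    heatmap[r][c] is the score that the keypoint lies in grid cell (r, c).
    offset_y / offset_x hold the short-range offset, in pixels, from the
    cell's anchor position to the refined keypoint position.
    """
    # 1. Coarse location: the cell with the highest heatmap score.
    best_r, best_c, best_score = 0, 0, float("-inf")
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_r, best_c, best_score = r, c, score

    # 2. Refinement: cell anchor in image pixels plus the short-range offset.
    y = best_r * output_stride + offset_y[best_r][best_c]
    x = best_c * output_stride + offset_x[best_r][best_c]
    return y, x, best_score


# Toy 2x3 output grid for a single keypoint type (hypothetical values).
heatmap = [[0.1, 0.2, 0.1],
           [0.0, 0.9, 0.3]]
offset_y = [[0.0, 0.0, 0.0],
            [0.0, 3.5, 0.0]]
offset_x = [[0.0, 0.0, 0.0],
            [0.0, -2.0, 0.0]]

keypoint = decode_keypoint(heatmap, offset_y, offset_x)
print(keypoint)
```

The heatmap alone can only locate a keypoint to within one grid cell (16 pixels here); the offset is what recovers sub-cell precision.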

(Instance Segmentation Module)
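Meanwhile, (4) 'Person segmentation mask' estimates a pixel mask of the human region. That alone cannot distinguish individuals when several people are present, so person-by-person segmentation is achieved by combining the estimates of (5) 'Long-range offsets' with the 'Detected Human Poses' result described above.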
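How 'Long-range offsets' can separate people is sketched below. This is a simplified illustration, not the PersonLab implementation: it uses a single keypoint per person, whereas the paper combines several, and the names `assign_pixels` and `person_keypoints` as well as the toy data are assumptions. Each foreground pixel adds its long-range offset to its own position to predict where its person's keypoint lies, and is then assigned to the detected person whose keypoint is closest to that prediction.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (y, x)


def assign_pixels(
    mask: List[List[int]],
    long_offsets: List[List[Point]],
    person_keypoints: List[Point],
) -> Dict[Tuple[int, int], int]:
    """Map each foreground pixel (y, x) to the index of a detected person."""
    assignment: Dict[Tuple[int, int], int] = {}
    for y, row in enumerate(mask):
        for x, is_person in enumerate(row):
            if not is_person:
                continue  # background pixels are left unassigned
            dy, dx = long_offsets[y][x]
            # Where this pixel "thinks" its person's keypoint is.
            pred_y, pred_x = y + dy, x + dx
            distances = [
                math.hypot(pred_y - ky, pred_x - kx)
                for ky, kx in person_keypoints
            ]
            assignment[(y, x)] = distances.index(min(distances))
    return assignment


# Toy 2x4 scene with two detected people (hypothetical values).
mask = [[1, 1, 0, 1],
        [1, 0, 0, 1]]
long_offsets = [
    [(0.0, 0.0), (0.0, -1.0), (0.0, 0.0), (0.0, 0.0)],
    [(-1.0, 0.0), (0.0, 0.0), (0.0, 0.0), (-1.0, 0.0)],
]
person_keypoints = [(0.0, 0.0), (0.0, 3.0)]  # one keypoint per person

assignment = assign_pixels(mask, long_offsets, person_keypoints)
print(assignment)
```

In this toy scene the three pixels on the left are assigned to person 0 and the two on the right to person 1, even though the binary mask by itself does not say which person a pixel belongs to.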

How do you develop with it?

Source code on GitHub (it also explains how to call the API)
Q&A on GitHub (issues)
- A question about whether transfer learning or fine-tuning of PoseNet is possible
  - https://github.com/tensorflow/tfjs/issues/1388
  - https://github.com/tensorflow/tfjs/issues/1388#issuecomment-520871523
    - Unfortunately, the training script for PoseNet is not open-sourced. @tylerzhu-github for more details on that front.

From the above, we can see that the trained models are public, but the training program is not.
https://github.com/tensorflow/tfjs/issues/1581#issuecomment-494636572
Several implementations based on the papers are available:

https://github.com/scnuhealthy/Tensorflow_PersonLab (Tensorflow ver. 1.80)
https://github.com/octiapp/KerasPersonLab
https://github.com/Naykira/personlab

Converting a model to TensorFlow.js:

To use a model trained in TensorFlow with TensorFlow.js, it must be converted with tensorflowjs_converter.
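As a rough sketch of what such a conversion looks like, the snippet below builds a `tensorflowjs_converter` command line for a SavedModel and runs it only if the tool is installed. The directory names `./saved_model` and `./web_model` and the helper `build_converter_command` are hypothetical; the right `--input_format` depends on how your model was saved, so treat the flag values as an assumption to check against the converter's own help output.

```python
import os
import shutil
import subprocess
from typing import List


def build_converter_command(saved_model_dir: str, output_dir: str) -> List[str]:
    """Build a tensorflowjs_converter call for a TensorFlow SavedModel."""
    return [
        "tensorflowjs_converter",
        "--input_format=tf_saved_model",     # assumes the model is a SavedModel
        "--output_format=tfjs_graph_model",  # graph model loadable from TensorFlow.js
        saved_model_dir,
        output_dir,
    ]


# Hypothetical paths; replace them with your own model and output directory.
command = build_converter_command("./saved_model", "./web_model")
print(" ".join(command))

# Run the conversion only when the converter is on PATH
# (it ships with `pip install tensorflowjs`) and the model directory exists.
if shutil.which("tensorflowjs_converter") and os.path.isdir("./saved_model"):
    subprocess.run(command, check=True)
```

The output directory would then contain a `model.json` file plus binary weight shards, which is the layout TensorFlow.js loads in the browser.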

https://techblog.exawizards.com/entry/2018/11/01/174122

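According to recent news, SavedModels can now be loaded as they are, but how well does that work in practice? It seems to be intended for use with @tensorflow/tfjs-node, though...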

Run a TensorFlow SavedModel in Node.js directly without conversion (2020/01/14)

You can now bring a pre-trained TensorFlow model in SavedModel format, load it in Node.js through the @tensorflow/tfjs-node (or tfjs-node-gpu) package, and execute the model for inference without using tfjs-converter.

An easy way to use BodyPix from Python:
