
Investigating hand detection

Last updated at 2022-01-13. Posted at 2018-11-24.

I am currently investigating hand detection.

Addendum (2022.01.13): Pretrained models for hand pose detection are now provided on the major platforms. Search for one on the platform you intend to use.

MediaPipe is one library that includes hand pose detection. It can be used from Android, iOS, C++, Python, and JS.
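As a rough sketch of what using such a pretrained model looks like from Python, the following assumes that the `mediapipe` and `opencv-python` packages are installed and that the installed MediaPipe version still ships the legacy `mp.solutions.hands` interface. The function names `detect_hand_landmarks` and `to_pixel_coords` and the parameter values are my own illustrative choices.

```python
def to_pixel_coords(normalized_points, width, height):
    """Convert (x, y) points normalized to [0, 1] into integer pixel coordinates."""
    return [(int(round(x * width)), int(round(y * height)))
            for x, y in normalized_points]


def detect_hand_landmarks(image_path):
    """Return one list of (x, y) pixel points per hand found in the image."""
    # Imported here so the helper above can be used without these packages.
    import cv2
    import mediapipe as mp

    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        raise FileNotFoundError(image_path)
    height, width = image_bgr.shape[:2]

    # MediaPipe expects RGB input, while OpenCV loads images as BGR.
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    with mp.solutions.hands.Hands(
        static_image_mode=True,        # treat the input as a single photo
        max_num_hands=2,               # assumed value
        min_detection_confidence=0.5,  # assumed value
    ) as hands:
        results = hands.process(image_rgb)

    hands_px = []
    # multi_hand_landmarks is None when no hand is detected.
    for hand in results.multi_hand_landmarks or []:
        points = [(lm.x, lm.y) for lm in hand.landmark]
        hands_px.append(to_pixel_coords(points, width, height))
    return hands_px
```

Calling `detect_hand_landmarks("hand.jpg")` (a hypothetical file name) returns an empty list when no hand is found.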

Addendum (2019.6.25): At this point, the implementations that look usable to me are the TensorFlow ones.
I have therefore reordered the entries.

Addendum (2019.10.4): GitHub has a topic page that lists hand-detection repositories.

https://github.com/topics/hand-detection

How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow

It uses the EgoHands dataset.

github https://github.com/victordibia/handtracking


Clone the GitHub repository and run it.

$ pip install tensorflow

# With no arguments, it processes video input.
$ python detect_single_threaded.py

Youtube Hand recognition full source code OpenCV

The YouTube page links to the source code.

It first detects the face and uses that to find skin-colored regions.
It then appears to judge the hand gesture from the convex hull of the binarized region.
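To make the convex hull step concrete, here is a minimal pure-Python sketch using Andrew's monotone chain algorithm. It is not the code from the video. In an OpenCV program this would normally be a call to `cv2.convexHull` on a contour, and the function names here are only illustrative.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices of a set of (x, y) points, without repeats."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower chain from left to right, dropping non-left turns.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper chain from right to left in the same way.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# A square outline plus one interior point and one point on an edge.
outline = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (2, 0)]
hull = convex_hull(outline)  # -> [(0, 0), (4, 0), (4, 4), (0, 4)]
```

Points that lie inside the shape or on a straight edge are dropped, which is why the gaps between spread fingers show up as dents relative to the hull.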

Youtube Hand detection using opencv

This one also uses an algorithm that finds skin-colored regions and decides from them.
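A skin-color test of this kind can be sketched as follows. The thresholds are a commonly cited RGB heuristic for daylight images and are an assumption on my part; I do not know which rule or color space either video actually uses. The names `is_skin_rgb` and `skin_mask` are illustrative.

```python
def is_skin_rgb(r, g, b):
    """Heuristic skin test on one 8-bit RGB pixel (assumed thresholds)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # not too gray
        and abs(r - g) > 15                   # red clearly differs from green
        and r > g and r > b                   # red is the dominant channel
    )


def skin_mask(image_rgb):
    """image_rgb: rows of (r, g, b) tuples. Returns rows of 0/1 values."""
    return [[1 if is_skin_rgb(*pixel) else 0 for pixel in row]
            for row in image_rgb]


tiny_image = [[(220, 170, 140), (0, 0, 255)],
              [(30, 30, 30), (200, 150, 120)]]
mask = skin_mask(tiny_image)  # -> [[1, 0], [0, 1]]
```

A rule like this cannot tell a hand from a face or from any other object with a similar color, and it depends heavily on lighting.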

Youtube Basic Hand Detection Finger Counter with C++ and OpenCV (source code included)

Judging from the video, it uses a fixed camera and processes only a fixed region of the image, counting the fingers through contour processing.

github https://github.com/redbeardanil/Estimation-of-the-number-of-real-time-hand-fingers-with-image-processing

This is the GitHub repository that corresponds to the YouTube video above.

It uses BackgroundSubtractorMOG2() to find the hand region, so a fixed camera appears to be a hard requirement.
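Why background subtraction needs a fixed camera can be seen with a toy model. The sketch below uses a simple running-average background, which is a much simpler technique than the Gaussian mixture model behind MOG2, and works on tiny grayscale frames given as nested lists. The class name and parameter values are assumptions for illustration only.

```python
class RunningAverageBackground:
    """Toy background model: per-pixel running average plus a threshold."""

    def __init__(self, learning_rate=0.05, threshold=25.0):
        self.learning_rate = learning_rate
        self.threshold = threshold
        self.background = None

    def apply(self, frame):
        """frame: rows of grayscale values. Returns a 0/1 foreground mask."""
        if self.background is None:
            # The first frame is taken as the initial background.
            self.background = [[float(v) for v in row] for row in frame]
            return [[0] * len(row) for row in frame]

        mask = []
        for bg_row, row in zip(self.background, frame):
            mask_row = []
            for x, value in enumerate(row):
                is_foreground = abs(value - bg_row[x]) > self.threshold
                mask_row.append(1 if is_foreground else 0)
                if not is_foreground:
                    # Adapt slowly, and only where the scene looks static.
                    bg_row[x] += self.learning_rate * (value - bg_row[x])
            mask.append(mask_row)
        return mask


static_scene = [[10, 10, 200, 200], [10, 10, 200, 200]]
with_hand = [[10, 120, 200, 200], [10, 120, 200, 200]]
shifted_scene = [[10, 10, 10, 200], [10, 10, 10, 200]]  # camera moved 1 px

model = RunningAverageBackground()
model.apply(static_scene)
hand_mask = model.apply(with_hand)       # -> [[0, 1, 0, 0], [0, 1, 0, 0]]

moving_camera = RunningAverageBackground()
moving_camera.apply(static_scene)
false_mask = moving_camera.apply(shifted_scene)  # -> [[0, 0, 1, 0], [0, 0, 1, 0]]
```

In the second case nothing in the scene moved, yet the edge of the background is reported as foreground simply because the camera shifted.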

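It builds the convex hull of the contour's point set with convexHull and finds the contour's convexity defects with convexityDefects. The number of defects found is then used to decide how many fingers are extended.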

This does not help with detecting the hand itself, but once a hand has been found, the contour information can be used to count the extended fingers.
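The counting step can be sketched as below. It assumes the defects have already been converted to points and pixel depths (OpenCV's `cv2.convexityDefects` returns contour indices and a fixed-point depth that has to be divided by 256). The angle and depth thresholds are assumptions on my part, and the repository may use different values or a different rule.

```python
import math


def angle_at(far, start, end):
    """Angle in degrees at `far` in the triangle (start, far, end)."""
    a = math.dist(start, end)
    b = math.dist(far, start)
    c = math.dist(far, end)
    if b == 0 or c == 0:
        return 180.0
    # Law of cosines, clamped against rounding error.
    cos_angle = (b * b + c * c - a * a) / (2 * b * c)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def count_fingers(defects, max_angle_deg=90.0, min_depth=20.0):
    """defects: iterable of (start, end, far, depth), points as (x, y) pixels.

    A deep defect with a sharp angle is treated as the valley between two
    extended fingers, so n valleys are read as n + 1 fingers.
    """
    valleys = sum(
        1
        for start, end, far, depth in defects
        if depth >= min_depth and angle_at(far, start, end) < max_angle_deg
    )
    return valleys + 1 if valleys > 0 else 0


sharp_valley = ((0, 0), (40, 0), (20, 60), 60.0)     # narrow and deep
shallow_dent = ((0, 0), (100, 0), (50, 5), 5.0)      # wide and shallow
fingers = count_fingers([sharp_valley, sharp_valley, shallow_dent])  # -> 3
```

With this rule a closed fist and a single raised finger both give zero valleys, so the two cases cannot be told apart without an extra check.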

Youtube Hand detection + Gesture recognition by YOLOv2

The video is described as "Hand detection + Gesture recognition by YOLOv2".
I have not been able to find the corresponding source code or any information on what was used for training.

Youtube Real-time hand detection, tracking and segmentation

Published on 2012/11/27.

Hand detection using random forrests, handtracking using a particle filter, hand segmentation using active contours. The background in this demo video is rather simple; I'm mainly testing the tracking and segmentation behaviour in bad illumination conditions (where for instance pure skin color based tracking or detection would fail)

I have not been able to find the code.

Youtube Joint Hand Detection and Rotation Estimation Using CNN

The YouTube page leads to the project webpage.

http://www.idengxm.com/handdetection/index.html

The rotation annotation of hand detection dataset will be available.

Since the page says this, the release of the dataset is something to look forward to.

paper http://www.idengxm.com/handdetection/TIP2018_handdeteciton_cameraready.pdf

www Hand detection using multiple proposals

The data is available from:
Visual Geometry Group Hand Dataset

The source code can also be downloaded from:
Visual Geometry Group A reference implementation of the hand detection
This is an implementation from 2012.

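For contours it uses HOG features. This is an implementation from before deep learning.
Deep learning provides a framework that can handle edge features and color features at the same time.
It is worth noting that hand detection is now being implemented with deep learning.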
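For readers who have not met HOG before, the core idea is a histogram of gradient orientations weighted by gradient magnitude. The following is a heavily simplified sketch for a single cell. It is not the Visual Geometry Group implementation, and it leaves out the bin interpolation and block normalization that a full HOG descriptor uses. The function name and the choice of 9 bins are assumptions.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """cell: rows of grayscale values.

    Returns a magnitude-weighted histogram over unsigned gradient
    orientations in [0, 180) degrees, computed on interior pixels only.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences in x and y.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold opposite directions together (unsigned orientation).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A vertical edge: intensity changes only along x.
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
# A horizontal edge: intensity changes only along y.
horizontal_edge = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

hist_v = cell_orientation_histogram(vertical_edge)
hist_h = cell_orientation_histogram(horizontal_edge)
```

For the vertical edge all of the weight lands in the first bin (around 0 degrees), and for the horizontal edge it lands in the middle bin (around 90 degrees). Color plays no part in this feature, which is the contrast with deep learning drawn above.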

Github Real-time Hand-Detection using Neural Networks (SSD) on Tensorflow

This is a DNN-based implementation.

This repo documents steps and scripts used to train a hand detector using Tensorflow (Object Detection API). As with any DNN based task, the most expensive (and riskiest) part of the process has to do with finding or creating the right (annotated) dataset.

As this says, here too the key is finding a suitable dataset or building one yourself.
This page uses the following two datasets.

dataset http://www.robots.ox.ac.uk/~vgg/data/hands/

dataset http://vision.soic.indiana.edu/projects/egohands/

WWW Welcome to the VIVA Hand Detection Challenge!

Welcome to the VIVA hand detection benchmark! The dataset consists of 2D bounding boxes around driver and passenger hands from 54 videos collected in naturalistic driving settings of illumination variation, large hand movements, and common occlusion. There are 7 possible viewpoints, including first person view. Some of the data has been captured in our testbeds, while some was kindly provided by YouTube.

As described, this is a dataset of images of the hands of drivers and passengers inside cars.

Papers by people who entered this challenge can be reached from the home page above.

paper http://adas.cvc.uab.es/cvvt2017/wp-content/uploads/sites/14/2014/03/5.pdf
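When detections are 2D bounding boxes, as in the benchmark above, a predicted box is usually compared with an annotated box by intersection over union (IoU). The sketch below shows that measure. The exact evaluation protocol and overlap threshold of the VIVA challenge are not stated in this article, so the 0.5 default is only an assumed, commonly used value, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, annotated, threshold=0.5):
    """True if the predicted box overlaps the annotation well enough."""
    return iou(predicted, annotated) >= threshold


annotated_hand = (0, 0, 10, 10)
half_shifted = (5, 0, 15, 10)
overlap = iou(annotated_hand, half_shifted)  # 50 / 150, one third
```

A box shifted by half its width therefore fails a 0.5 threshold even though it still covers half of the hand.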

Github Hand Tracking : Tracking hands using SSD with MobilenetV1

This is hand detection built with the Tensorflow Object Detection API (see also the Tensorflow Object Detection API Tutorial).

The author of this GitHub repository prepared their own hand images and annotations and publishes the data in the repository.
They are also working on hand sign recognition.

Github https://github.com/EvilPort2/Sign-Language

The README.md on GitHub links to a YouTube video.

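From the video, it appears that the location of the hand region is given in advance and that the work concentrates on deciding which sign-language expression the hand image shows. The background is a plain wall that is easy to separate from the hand.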

Extraction of hand-shape regions using a Deep Convolutional Neural Network

paper Deep Convolutional Neural Network による手形状領域の抽出 (Extraction of hand-shape regions using a Deep Convolutional Neural Network)

Github https://github.com/PierfrancescoSoffritti/Handy

A few assumptions have been made:

The camera is supposed to be static.
The camera has no automatic regulations, such as auto-focus etc.
The user is not moving in the frame (eg: he sits at his desk in front of the camera).
There are no particular constraints on the color of the background, but it should be approximately static (no moving objects/strong changes of illumination in the background).

Given these assumptions, and the static camera in particular, its range of use is limited.