Rock-Paper-Scissors Detection with MediaPipe

Posted at 2024-12-21

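Introduction

I'm kag1022, and this is the December 22 entry of the Advent Calendar.
This article introduces a program that uses MediaPipe and OpenCV to detect rock-paper-scissors (janken) hand shapes from a camera in real time. MediaPipe provides machine-learning-based hand landmark detection, which makes accurate hand tracking possible.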

Full program code

import cv2
import mediapipe as mp
import numpy as np


class JankenDetector:
   def __init__(self):
       self.mp_hands = mp.solutions.hands
       self.hands = self.mp_hands.Hands(
           static_image_mode=False,
           max_num_hands=2, 
           min_detection_confidence=0.7,
           min_tracking_confidence=0.7
       )
       self.mp_draw = mp.solutions.drawing_utils

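   def detect_gesture(self, landmarks):
       """Classify a hand as Rock, Paper, Scissors, or unclear."""
       fingers = []

       # Fingertips: 8 (index), 12 (middle), 16 (ring), 20 (little).
       # i - 2 is the PIP joint of the same finger. The thumb is not checked.
       for i in range(8, 21, 4):
           # Image y grows downward, so a tip above its joint has a smaller y.
           if landmarks[i].y < landmarks[i - 2].y:
               fingers.append(1)  # finger is extended
           else:
               fingers.append(0)  # finger is bent

       total_fingers = sum(fingers)

       if total_fingers == 0:
           return "Rock"
       elif total_fingers == 4:
           return "Paper"
       # Scissors requires exactly the index and middle fingers.
       elif total_fingers == 2 and fingers[0] == 1 and fingers[1] == 1:
           return "Scissors"
       else:
           return "unclear"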

   def process_frame(self, frame):
       rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
       results = self.hands.process(rgb_frame)

       if results.multi_hand_landmarks:
           for hand_landmarks in results.multi_hand_landmarks:
               self.mp_draw.draw_landmarks(
                   frame, 
                   hand_landmarks, 
                   self.mp_hands.HAND_CONNECTIONS
               )
               gesture = self.detect_gesture(hand_landmarks.landmark)

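               # Landmark coordinates are normalized, so scale them to pixels.
               h, w, c = frame.shape
               wrist_x = int(hand_landmarks.landmark[0].x * w)
               wrist_y = int(hand_landmarks.landmark[0].y * h)

               # Center the label horizontally on the wrist, 20 px above it.
               text_size = cv2.getTextSize(gesture, cv2.FONT_HERSHEY_SIMPLEX, 1, 2)[0]
               text_x = wrist_x - text_size[0] // 2
               text_y = wrist_y - 20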

               cv2.rectangle(
                   frame,
                   (text_x - 10, text_y - text_size[1] - 10),
                   (text_x + text_size[0] + 10, text_y + 10),
                   (0, 0, 0),
                   -1
               )

               cv2.putText(
                   frame,
                   gesture,
                   (text_x, text_y),
                   cv2.FONT_HERSHEY_SIMPLEX,
                   1,
                   (255, 255, 255),
                   2,
                   cv2.LINE_AA
               )
       return frame

   def run(self):
       cap = cv2.VideoCapture(0)

       while cap.isOpened():
           ret, frame = cap.read()
           if not ret:
               break

           processed_frame = self.process_frame(frame)
           cv2.imshow("Janken Detection", processed_frame)

           if cv2.waitKey(1) & 0xFF == ord("q"):
               break

       cap.release()
       cv2.destroyAllWindows()
       self.hands.close()


if __name__ == "__main__":
   detector = JankenDetector()
   detector.run()

Required libraries

The program needs the following libraries:

import cv2
import mediapipe as mp
import numpy as np

Program walkthrough

1. Hand detection settings

The initializer uses the following parameters:

def __init__(self):
    self.mp_hands = mp.solutions.hands
    self.hands = self.mp_hands.Hands(
        static_image_mode=False,
        max_num_hands=2,
        min_detection_confidence=0.7,
        min_tracking_confidence=0.7
    )

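static_image_mode=False: treats the input as a video stream, so a hand is tracked across frames once it has been detected instead of being re-detected on every frame
min_detection_confidence=0.7: sets the detection threshold above the default of 0.5 to reduce false detections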

2. Rock-paper-scissors classification logic

def detect_gesture(self, landmarks):
    fingers = []
    
    # Check each finger from the index finger to the little finger
    for i in range(8, 21, 4):
        if landmarks[i].y < landmarks[i-2].y:
            fingers.append(1)  # finger is extended
        else:
            fingers.append(0)  # finger is bent

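Classification criteria (only the four fingers from index to little finger are counted; the thumb is not checked):

Rock: all fingers are bent (total_fingers = 0)
Paper: all fingers are extended (total_fingers = 4)
Scissors: only the index and middle fingers are extended (total_fingers = 2, with fingers[0] and fingers[1] both 1)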
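
The criteria above can be tried without a camera. The following is a minimal sketch that re-implements the same rule as a standalone function and feeds it dummy landmarks. `Point`, `classify`, and `fake_hand` are hypothetical names introduced only for this illustration; real MediaPipe landmarks also expose a `.y` attribute, which is all the rule relies on.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark; only y is used by the rule.
Point = namedtuple("Point", ["x", "y"])


def classify(landmarks):
    """Return "Rock", "Paper", "Scissors", or "unclear" from 21 landmarks."""
    # Tips: 8 (index), 12 (middle), 16 (ring), 20 (little); i - 2 is the PIP joint.
    # Image y grows downward, so a tip above its PIP joint has a smaller y.
    fingers = [1 if landmarks[i].y < landmarks[i - 2].y else 0
               for i in range(8, 21, 4)]
    total = sum(fingers)
    if total == 0:
        return "Rock"
    if total == 4:
        return "Paper"
    if total == 2 and fingers[0] == 1 and fingers[1] == 1:
        return "Scissors"
    return "unclear"


def fake_hand(extended):
    """Build 21 dummy landmarks; `extended` flags the index..little fingers."""
    points = [Point(0.5, 0.5) for _ in range(21)]
    for tip, is_up in zip(range(8, 21, 4), extended):
        # Place the tip above (0.3) or below (0.7) its PIP joint, which stays at 0.5.
        points[tip] = Point(0.5, 0.3 if is_up else 0.7)
    return points


print(classify(fake_hand([0, 0, 0, 0])))  # Rock
print(classify(fake_hand([1, 1, 1, 1])))  # Paper
print(classify(fake_hand([1, 1, 0, 0])))  # Scissors
print(classify(fake_hand([1, 0, 1, 0])))  # unclear
```

The last case shows why the extra `fingers[0]` and `fingers[1]` check matters: two extended fingers alone are not enough to count as Scissors.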

3. Displaying the result

def process_frame(self, frame):
    # Convert from BGR to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    
    # Detect hands
    results = self.hands.process(rgb_frame)
    
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            # Draw the landmarks
            self.mp_draw.draw_landmarks(frame, hand_landmarks, 
                                      self.mp_hands.HAND_CONNECTIONS)
            
            # Classify the gesture for the text label
            gesture = self.detect_gesture(hand_landmarks.landmark)
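
The excerpt above stops before the label is drawn. In the full program, the label is centered on the wrist landmark and backed by a filled black rectangle. The arithmetic behind that placement can be sketched on its own, without OpenCV. `label_layout` and its `pad` and `lift` parameters are hypothetical names for this illustration; the values 10 and 20 are the ones used in the full code, and the text size would normally come from `cv2.getTextSize`.

```python
def label_layout(wrist_x, wrist_y, frame_w, frame_h, text_w, text_h,
                 pad=10, lift=20):
    """Compute label coordinates from a normalized wrist position.

    wrist_x and wrist_y are normalized coordinates (0.0 to 1.0).
    Returns (text_origin, box_top_left, box_bottom_right) in pixels.
    """
    # Scale the normalized landmark to pixel coordinates.
    px = int(wrist_x * frame_w)
    py = int(wrist_y * frame_h)

    # Center the text horizontally on the wrist and lift it above the wrist.
    # cv2.putText treats this point as the bottom-left corner of the text.
    text_x = px - text_w // 2
    text_y = py - lift

    # Background rectangle: the text extent plus padding on every side.
    top_left = (text_x - pad, text_y - text_h - pad)
    bottom_right = (text_x + text_w + pad, text_y + pad)
    return (text_x, text_y), top_left, bottom_right


# A wrist at the center of a 640x480 frame with a 100x22 px label.
origin, top_left, bottom_right = label_layout(0.5, 0.5, 640, 480, 100, 22)
print(origin, top_left, bottom_right)  # (270, 220) (260, 188) (380, 230)
```

Note that nothing here clamps the result to the frame, so a wrist near the edge of the image can push part of the label off-screen.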

Usage

Install the required libraries
```
pip install opencv-python mediapipe numpy
```
Run the program
Hold your hand in front of the camera
Show rock, scissors, or paper
The result is displayed on the screen
Press the Q key to quit

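Possible improvements

cv2.putText cannot render Japanese characters with its built-in fonts, so showing the labels in Japanese requires a separate implementation
Detection accuracy for Scissors is not very good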
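
One direction that might help, although it has not been verified against this program, concerns the y-coordinate comparison in detect_gesture: it assumes the hand is held upright, so a tilted or sideways hand can be misjudged. The sketch below swaps in a different technique, comparing distances from the wrist instead of y-coordinates, so the check does not depend on hand orientation. `Point`, `distance`, and `is_extended` are hypothetical names for this illustration, and the assumption that a bent fingertip lies closer to the wrist than its PIP joint is a heuristic, not a guarantee.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark (normalized x and y).
Point = namedtuple("Point", ["x", "y"])


def distance(a, b):
    """Euclidean distance between two landmarks in the image plane."""
    return math.hypot(a.x - b.x, a.y - b.y)


def is_extended(landmarks, tip, wrist=0):
    """Heuristic: a finger is extended if its tip is farther from the
    wrist than its PIP joint (index tip - 2)."""
    pip = tip - 2
    return distance(landmarks[tip], landmarks[wrist]) > distance(
        landmarks[pip], landmarks[wrist])


# A hand pointing sideways: wrist on the left, index finger pointing right.
hand = [Point(0.5, 0.5) for _ in range(21)]
hand[0] = Point(0.2, 0.5)   # wrist
hand[6] = Point(0.5, 0.45)  # index PIP joint
hand[8] = Point(0.7, 0.45)  # index tip, extended to the right

# The y comparison sees tip and joint at the same height and reports "bent".
print(hand[8].y < hand[6].y)  # False
# The distance comparison still reports "extended".
print(is_extended(hand, 8))   # True

# Curl the finger so the tip comes back toward the palm.
hand[8] = Point(0.4, 0.5)
print(is_extended(hand, 8))   # False
```

Because the function only reads `.x` and `.y`, it could be called on `hand_landmarks.landmark` directly for tips 8, 12, 16, and 20 in place of the y comparison.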

Summary

With this program, rock-paper-scissors hand shapes can be recognized in real time.
The program was written with reference to similar articles on Qiita and to Claude AI.
