OpenCV-Hand-Gesture-Recognition

The aim of this project is to identify single and multiple hands through skeletonization.

Methodology

Single hand gesture recognition

Image thresholding

The same global threshold value is applied to every pixel. If the pixel value does not exceed the threshold, it is set to 0; otherwise it is set to a maximum value.

Convexity Defects

The function cv2.convexityDefects() finds the convexity defects of a contour. It takes the contour and its corresponding hull indices as input and returns an array of convexity defects. Each row of the output contains 4 values: [start point, end point, farthest point, approximate distance to the farthest point]. The first three are indices into the contour, and the distance is a fixed-point value that is divided by 256 to obtain pixels. First, we find the contours for the hand using the concept of skeletonization, then we find the convex hull for the contour. Finally, we find the convexity defects and use them to identify the number of fingers.

Multiple hand recognition

MediaPipe framework

The second part of the project is implemented with MediaPipe. MediaPipe Hands is a high-fidelity hand and finger tracking solution. It employs machine learning (ML) to infer 21 3D landmarks of a hand from just a single frame. Whereas other state-of-the-art approaches rely primarily on powerful desktop environments for inference, MediaPipe Hands achieves good results in real time.

Hand Landmark Model

The hand landmark model performs precise keypoint localization of 21 3D hand-knuckle coordinates inside the detected hand regions via regression, that is, direct coordinate prediction. The model learns a consistent internal hand pose representation and is robust even to partially visible hands and self-occlusions.
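
To make the 21-landmark output concrete, here is a small sketch that works on plain `(x, y, z)` tuples. It assumes the usual MediaPipe convention that `x` and `y` are normalized to the image width and height. The index constants follow the MediaPipe Hands landmark layout. The helpers `to_pixels` and `extended_fingers`, and the tip-above-PIP heuristic for an upright hand, are illustrative assumptions and not part of MediaPipe.

```python
# Landmark indices from the MediaPipe Hands 21-point layout.
WRIST = 0
FINGER_TIPS = {"index": 8, "middle": 12, "ring": 16, "pinky": 20}
FINGER_PIPS = {"index": 6, "middle": 10, "ring": 14, "pinky": 18}


def to_pixels(landmarks, width, height):
    """Convert normalized (x, y, z) landmarks to integer pixel (x, y) pairs."""
    return [(int(x * width), int(y * height)) for x, y, _z in landmarks]


def extended_fingers(landmarks):
    """Names of fingers whose tip lies above its PIP joint (smaller y is higher)."""
    return [name for name, tip in FINGER_TIPS.items()
            if landmarks[tip][1] < landmarks[FINGER_PIPS[name]][1]]


# Hypothetical landmark list: everything at y=0.5 except a raised index fingertip.
demo = [(0.5, 0.5, 0.0)] * 21
demo[FINGER_TIPS["index"]] = (0.5, 0.2, 0.0)

print(to_pixels(demo, 640, 480)[FINGER_TIPS["index"]])
print(extended_fingers(demo))
```

The thumb is left out because it bends sideways, so a vertical comparison is a poor test for it.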

Applying MediaPipe hand pose

The first two important parameters are min_detection_confidence and min_tracking_confidence. When we first use the MediaPipe hand model, it detects the hand and then tracks it. We set min_detection_confidence to 0.8 (80% confidence) for the first detection and min_tracking_confidence to 0.5 (50% confidence) for tracking after the first detection. Setting min_tracking_confidence to a higher value can increase the robustness of the solution, at the expense of higher latency. The number of hands to be recognized is set with max_num_hands.
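
A minimal sketch of these settings, assuming the legacy `mp.solutions.hands` Python API. The confidence values are the ones quoted above. `max_num_hands=2`, the `HANDS_CONFIG` dictionary, and the `track_hands` helper are illustrative assumptions.

```python
# Confidence values quoted in the text; max_num_hands=2 is an illustrative choice.
HANDS_CONFIG = {
    "max_num_hands": 2,
    "min_detection_confidence": 0.8,
    "min_tracking_confidence": 0.5,
}


def track_hands(frames_rgb, config=HANDS_CONFIG):
    """Yield the number of hands found in each RGB frame (NumPy array)."""
    # Imported lazily so the configuration can be inspected without MediaPipe.
    import mediapipe as mp

    with mp.solutions.hands.Hands(**config) as hands:
        for frame in frames_rgb:
            results = hands.process(frame)
            # multi_hand_landmarks is None when no hand is detected.
            found = results.multi_hand_landmarks or []
            yield len(found)
```

Frames read with OpenCV are in BGR order, so they would need `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` before being passed in.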