
How to design a Gesture Controlled Virtual Mouse with ESP32-CAM & OpenCV

IoT, or the Internet of Things, is an emerging technology that lets us control hardware devices over the Internet. Next-generation homes will become increasingly self-controlled and automated because of the comfort this brings, particularly in a private home.

Touchless gestures are the new frontier in the world of human-machine interfaces. You can control a computer, microcontroller, robot, or another device by swiping your palm over a sensor. Most phones now include gesture controls to open or close apps, start music, answer calls, and so on. This is a very useful time-saving feature, and controlling a gadget with gestures also looks impressive.

In this blog, we will create a Gesture Controlled Virtual Mouse with ESP32-CAM and OpenCV.  The mouse tracking and clicking activities can be controlled wirelessly using the ESP32 Camera Module and a Python application.

To get started, you’ll need a strong understanding of Python, image processing, Embedded Systems, and the Internet of Things. First, we’ll learn how to control mouse tracking and clicking, as well as the requirements for running a Python program. We’ll start by testing the entire Python script with a webcam or a laptop’s inbuilt camera.

In the second section, the ESP32-CAM will replace the PC camera as the input device: the same Python script will process video frames streamed from the ESP32-CAM instead of from a local webcam.

What are Gestures?

A Gesture is a movement made with a part of your body, most often your hands, to convey emotion or information. A gesture is a type of nonverbal communication in which a person’s visible bodily gestures can convey a message. It is possible to control activities without contacting the real equipment by detecting these movements.

Left, right, up, down, forward, backward, clockwise, anticlockwise, and waving are the movements a system can recognize in this context. You can also combine them: right-left, left-right, up-down, down-up, forward-backward, and backward-forward.

Hand gestures are a widely recognized language and one of the most powerful and expressive forms of human communication. They are expressive enough to serve as a complete language for people who are deaf or hard of hearing.

Hardware Required:

  • ESP32-CAM Board (AI-Thinker ESP32 Camera Module)
  • FTDI Module (USB-to-TTL Converter Module)
  • USB Cable (5V Mini-USB Data Cable)
  • Jumper Wires (Female-to-Female Connectors)

Controlling Mouse Tracking & Clicks with PC Camera

Before moving on to the ESP32-CAM part of the project, let's develop the Gesture Controlled Virtual Mouse using the PC's camera and image-recognition libraries.

Installing Python & Required Libraries

In order for the live video stream to appear on our computer, we must develop a Python script that retrieves the video frames. The first step is to install Python. Download Python version 3.7.8 from python.org; a few of the libraries used below will not work unless you install this specific version (or downgrade to it).

  • Once downloaded and installed, open a command prompt and type the following commands:
python --version
  • The output should report version 3.7.8.
  • Now install the required libraries. Run the following commands one after another until all the libraries are installed.

pip install numpy
pip install opencv-python
pip install autopy
pip install mediapipe
  • If the Python version you installed is correct, installing these libraries should not be an issue.
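As a quick sanity check, you can also confirm the interpreter version from Python itself. This helper is not part of the project files, just an illustrative sketch:

```python
import sys

def version_matches(target=(3, 7)):
    """Return True when the running interpreter matches the tutorial's target minor version."""
    return sys.version_info[:2] == target

print("Running Python", sys.version.split()[0])
if not version_matches():
    print("Note: this tutorial targets Python 3.7.8; "
          "autopy and mediapipe may fail to install on other versions.")
```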

Source Code/Program

  • Create a folder, and inside that folder, create a new Python file called track_hand.py.
  • Copy the code below and save it in that file.
import cv2
import mediapipe as mp
import time
import math
import numpy as np


class handDetector():
    def __init__(self, mode=False, maxHands=1, modelComplexity=1, detectionCon=0.5, trackCon=0.5):
        self.mode = mode
        self.maxHands = maxHands
        self.modelComplex = modelComplexity
        self.detectionCon = detectionCon
        self.trackCon = trackCon
        self.mpHands = mp.solutions.hands
        self.hands = self.mpHands.Hands(self.mode, self.maxHands, self.modelComplex,
                                        self.detectionCon, self.trackCon)
        self.mpDraw = mp.solutions.drawing_utils
        self.tipIds = [4, 8, 12, 16, 20]

    def findHands(self, img, draw=True):
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        self.results = self.hands.process(imgRGB)
        if self.results.multi_hand_landmarks:
            for handLms in self.results.multi_hand_landmarks:
                if draw:
                    self.mpDraw.draw_landmarks(img, handLms,
                                               self.mpHands.HAND_CONNECTIONS)
        return img

    def findPosition(self, img, handNo=0, draw=True):
        xList = []
        yList = []
        bbox = []
        self.lmList = []
        if self.results.multi_hand_landmarks:
            myHand = self.results.multi_hand_landmarks[handNo]
            for id, lm in enumerate(myHand.landmark):
                h, w, c = img.shape
                cx, cy = int(lm.x * w), int(lm.y * h)
                xList.append(cx)
                yList.append(cy)
                self.lmList.append([id, cx, cy])
                if draw:
                    cv2.circle(img, (cx, cy), 5, (255, 0, 255), cv2.FILLED)
            xmin, xmax = min(xList), max(xList)
            ymin, ymax = min(yList), max(yList)
            bbox = xmin, ymin, xmax, ymax
            if draw:
                cv2.rectangle(img, (xmin - 20, ymin - 20), (xmax + 20, ymax + 20),
                              (0, 255, 0), 2)
        return self.lmList, bbox

    def fingersUp(self):
        fingers = []
        # Thumb: compare the tip's x with the joint just below it
        if self.lmList[self.tipIds[0]][1] > self.lmList[self.tipIds[0] - 1][1]:
            fingers.append(1)
        else:
            fingers.append(0)
        # Other four fingers: a finger is up when its tip is above (smaller y) the pip joint
        for id in range(1, 5):
            if self.lmList[self.tipIds[id]][2] < self.lmList[self.tipIds[id] - 2][2]:
                fingers.append(1)
            else:
                fingers.append(0)
        return fingers

    def findDistance(self, p1, p2, img, draw=True, r=15, t=3):
        x1, y1 = self.lmList[p1][1:]
        x2, y2 = self.lmList[p2][1:]
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        if draw:
            cv2.line(img, (x1, y1), (x2, y2), (255, 0, 255), t)
            cv2.circle(img, (x1, y1), r, (255, 0, 255), cv2.FILLED)
            cv2.circle(img, (x2, y2), r, (255, 0, 255), cv2.FILLED)
            cv2.circle(img, (cx, cy), r, (0, 0, 255), cv2.FILLED)
        length = math.hypot(x2 - x1, y2 - y1)
        return length, img, [x1, y1, x2, y2, cx, cy]


def main():
    pTime = 0
    cap = cv2.VideoCapture(0)
    detector = handDetector()
    while True:
        success, img = cap.read()
        img = detector.findHands(img)
        lmList, bbox = detector.findPosition(img)
        if len(lmList) != 0:
            print(lmList[4])
            # Only query finger states when a hand was actually detected
            fingers = detector.fingersUp()
        # Frame rate
        cTime = time.time()
        fps = 1 / (cTime - pTime)
        pTime = cTime
        cv2.putText(img, str(int(fps)), (10, 70), cv2.FONT_HERSHEY_PLAIN, 3,
                    (255, 0, 255), 3)
        cv2.imshow("Image", img)
        cv2.waitKey(1)


if __name__ == "__main__":
    main()
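To see how fingersUp() works without a camera, here is a standalone sketch of the same heuristic applied to hypothetical landmark data, in the same [id, x, y] format that findPosition() returns (the sample coordinates are made up for illustration):

```python
TIP_IDS = [4, 8, 12, 16, 20]  # MediaPipe landmark ids of the five fingertips

def fingers_up(lm_list):
    """Same heuristic as handDetector.fingersUp(), on a list of [id, x, y] landmarks."""
    fingers = []
    # Thumb: compare the tip's x with the joint just below it
    fingers.append(1 if lm_list[TIP_IDS[0]][1] > lm_list[TIP_IDS[0] - 1][1] else 0)
    # Other fingers: up when the tip is higher in the image (smaller y) than the pip joint
    for i in range(1, 5):
        fingers.append(1 if lm_list[TIP_IDS[i]][2] < lm_list[TIP_IDS[i] - 2][2] else 0)
    return fingers

# Hypothetical frame: every joint at y=400, then raise the index fingertip
landmarks = [[i, i * 10, 400] for i in range(21)]
landmarks[8][2] = 100  # index fingertip well above its pip joint
print(fingers_up(landmarks))  # → [1, 1, 0, 0, 0]
```

Note that the thumb test compares x coordinates, so its result depends on which hand is shown and whether the frame is mirrored.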
  • Make a new Python file called final.py inside the same folder.
  • Copy the code below into it. Before saving, make the following change:
  • Adjust the wCam and hCam parameters to match the width and height of your webcam.
import numpy as np
import track_hand as htm
import time
import autopy
import cv2

wCam, hCam = 1280, 720
frameR = 100  # frame reduction: margin of the active region
smoothening = 7
pTime = 0
plocX, plocY = 0, 0
clocX, clocY = 0, 0

cap = cv2.VideoCapture(0)
cap.set(3, wCam)
cap.set(4, hCam)
detector = htm.handDetector(maxHands=1)
wScr, hScr = autopy.screen.size()

while True:
    # 1. Find hand Landmarks
    fingers = [0, 0, 0, 0, 0]
    success, img = cap.read()
    img = detector.findHands(img)
    lmList, bbox = detector.findPosition(img)

    # 2. Get the tip of the index and middle fingers
    if len(lmList) != 0:
        x1, y1 = lmList[8][1:]
        x2, y2 = lmList[12][1:]
        # 3. Check which fingers are up
        fingers = detector.fingersUp()

    cv2.rectangle(img, (frameR, frameR), (wCam - frameR, hCam - frameR),
                  (255, 0, 255), 2)

    # 4. Only Index Finger : Moving Mode
    if fingers[1] == 1 and fingers[2] == 0:
        # 5. Convert Coordinates
        x3 = np.interp(x1, (frameR, wCam - frameR), (0, wScr))
        y3 = np.interp(y1, (frameR, hCam - frameR), (0, hScr))
        # 6. Smoothen Values
        clocX = plocX + (x3 - plocX) / smoothening
        clocY = plocY + (y3 - plocY) / smoothening
        # 7. Move Mouse
        autopy.mouse.move(wScr - clocX, clocY)
        cv2.circle(img, (x1, y1), 15, (255, 0, 255), cv2.FILLED)
        plocX, plocY = clocX, clocY

    # 8. Both Index and middle fingers are up : Clicking Mode
    if fingers[1] == 1 and fingers[2] == 1:
        # 9. Find distance between fingers
        length, img, lineInfo = detector.findDistance(8, 12, img)
        print(length)
        # 10. Click mouse if distance short
        if length < 40:
            cv2.circle(img, (lineInfo[4], lineInfo[5]),
                       15, (0, 255, 0), cv2.FILLED)
            autopy.mouse.click()

    # 11. Frame Rate
    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    cv2.putText(img, str(int(fps)), (20, 50), cv2.FONT_HERSHEY_PLAIN, 3,
                (255, 0, 0), 3)
    # 12. Display
    cv2.imshow("Image", img)
    cv2.waitKey(1)
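Steps 5 and 6 above do the heavy lifting: np.interp maps the fingertip coordinate from the pink active region into screen coordinates, and dividing by smoothening applies exponential smoothing so the cursor does not jitter. The idea can be seen in isolation in this sketch (parameter defaults mirror the script; the function name is illustrative, not part of the project code):

```python
import numpy as np

def map_and_smooth(x_cam, prev_x, frame_r=100, w_cam=1280, w_scr=1920, smoothening=7):
    """Map a camera-space x into screen space, then move 1/smoothening of the way there."""
    x_scr = np.interp(x_cam, (frame_r, w_cam - frame_r), (0, w_scr))
    return prev_x + (x_scr - prev_x) / smoothening

# A fingertip at the left edge of the active region maps to screen x = 0
print(map_and_smooth(100, 0))   # → 0.0
# One at the center maps toward screen x = 960, moved 1/7 of the way from prev_x
print(map_and_smooth(640, 0))   # ≈ 137.14
```

A larger smoothening value gives a steadier but laggier cursor; 1 would disable smoothing entirely.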

Testing

Now run the code above; you should see something similar to the image below.

Gesture Controlled Virtual Mouse

The image should follow the entire hand, including the fingers.

  • The pointer now moves when you move your hand inside the pink bounding area. To click, raise your middle finger alongside the index finger and bring the two fingertips together over the spot where the cursor is.
  • You've now completed half of the task. Let's move on to the other half: the device, or embedded, part.
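The click gesture described above reduces to a simple distance test between the index and middle fingertips. In isolation it looks like this (the 40-pixel threshold is the one used in final.py; the helper names are illustrative):

```python
import math

CLICK_THRESHOLD = 40  # pixels; the same value final.py checks against

def is_click(index_tip, middle_tip, threshold=CLICK_THRESHOLD):
    """Register a click when the two fingertip (x, y) points are close together."""
    return math.hypot(middle_tip[0] - index_tip[0],
                      middle_tip[1] - index_tip[1]) < threshold

print(is_click((300, 200), (320, 210)))  # fingertips ~22 px apart → True
print(is_click((300, 200), (400, 200)))  # 100 px apart → False
```

Because the threshold is in pixels, a hand closer to the camera effectively needs a wider pinch; tune the value for your camera distance.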

ESP32 CAM Module

The ESP32 Based Camera Module was developed by AI-Thinker. The controller contains a Wi-Fi + Bluetooth/BLE chip and is powered by a 32-bit CPU. It has a 520 KB internal SRAM and an external 4M PSRAM. UART, SPI, I2C, PWM, ADC, and DAC are all supported by its GPIO Pins.

The module is compatible with the OV2640 Camera Module, which has a camera resolution of 1600 x 1200 pixels. A 24-pin gold plated connector links the camera to the ESP32 CAM Board. A 4GB SD Card can be used on the board. The photographs captured are saved on the SD Card.

ESP32-CAM Features :

  • The smallest 802.11b/g/n Wi-Fi + BT SoC module.
  • Low-power 32-bit CPU that can also serve as an application processor.
  • Up to 160 MHz clock speed, with total computing power up to 600 DMIPS.
  • Built-in 520 KB SRAM, external 4 MB PSRAM.
  • Supports UART/SPI/I2C/PWM/ADC/DAC.
  • Supports OV2640 and OV7670 cameras; built-in flash lamp.
  • Supports image upload over Wi-Fi.
  • Supports TF card.
  • Supports multiple sleep modes.
  • Embedded LwIP and FreeRTOS.
  • Supports STA/AP/STA+AP operation modes.
  • Supports Smart Config/AirKiss technology.
  • Supports local and remote serial-port firmware upgrades (FOTA).

ESP32-CAM FTDI Connection

  • There is no programmer chip on the PCB, so any USB-to-TTL module can be used to program this board. FTDI modules based on the CP2102 or CP2104 chip (or similar) are widely available.
  • Connect the FTDI Module to the ESP32 CAM Module as shown below.
ESP32-CAM to FTDI Module Connections

ESP32-CAM    FTDI Programmer
GND          GND
5V           VCC
U0R          TX
U0T          RX
GPIO0        GND

Connect the ESP32's 5V and GND pins to the FTDI module's VCC and GND. Likewise, connect the FTDI Rx to U0T and the FTDI Tx to U0R. Most crucially, you must connect GPIO0 to GND: this puts the device into programming mode. You can remove that connection once programming is complete.

Project PCB Gerber File & PCB Ordering Online

If you don't want to put the circuit together on a breadboard and would prefer a PCB, one is available. EasyEDA's online circuit schematics & PCB design tool was used to create the PCB board for the ESP32-CAM. The PCB appears as shown below.

The Gerber File for the PCB is given below. You can simply download the Gerber File and order the PCB from https://www.nextpcb.com/

Download Gerber File: ESP32-CAM Multipurpose PCB

Now you can visit the NextPCB official website: https://www.nextpcb.com/.

  • You can now upload the Gerber File to the Website and place an order. The PCB quality is excellent. That is why the majority of people entrust NextPCB with their PCB and PCBA needs.
  • The components can be assembled on the PCB Board.

Installing ESP32CAM Library

A different streaming approach will be used instead of the general ESP web server example, so a separate library is required. The esp32cam library provides an object-oriented API for using the OV2640 camera on the ESP32 microcontroller; it is a wrapper around the esp32-camera library.

Download the ZIP library from the following GitHub link, as shown in the image.

After downloading, unzip the library and place it in the Arduino Library folder. To do so, follow the instructions below:

Open Arduino -> Sketch -> Include Library -> Add .ZIP Library… -> Navigate to downloaded zip file -> add

Source Code/Program for ESP32 CAM Module

The source code/program for the ESP32-CAM Gesture Controlled Mouse can be found in the library examples. Go to File -> Examples -> esp32cam -> WifiCam.

You must make a little adjustment to the code before uploading it. Change the SSID and password variables to match the WiFi network you’re using.

Compile the code and upload it to the ESP32-CAM board. However, you must follow a few steps each time you upload.

  • When you push the upload button, make sure the IO0 pin is shorted to ground.
  • If you notice dots and dashes during uploading, press the reset button immediately.
  • Once the code has uploaded, remove the jumper shorting IO0 to ground and press the reset button once more.
  • If output still does not appear on the Serial Monitor, press the reset button again.

Now you can see a similar output as in the image below.

So that's it for the ESP32-CAM section. The ESP32-CAM is now broadcasting live video, so make a note of the IP address displayed on the Serial Monitor.
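Before wiring the stream into the mouse script, you can verify that the snapshot URL actually serves JPEG frames. This is an optional helper, not part of the project code; the IP below is just this article's example, so substitute the one from your Serial Monitor:

```python
import urllib.request

CAM_URL = "http://192.168.1.61/cam-hi.jpg"  # replace with your ESP32-CAM's IP

def looks_like_jpeg(data: bytes) -> bool:
    """JPEG data always begins with the SOI marker bytes FF D8."""
    return data[:2] == b"\xff\xd8"

def check_stream(url=CAM_URL):
    """Fetch one frame and confirm it is a JPEG."""
    frame = urllib.request.urlopen(url, timeout=5).read()
    return looks_like_jpeg(frame)

# check_stream()  # returns True when the ESP32-CAM is up and streaming
```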

Python Code + Gesture Controlled Virtual Mouse with ESP32-CAM

Now we’ll finish up the Gesture Controlled Virtual Mouse with ESP32-CAM project. So, we return to our final.py code and make any necessary modifications or just paste the code provided.

import urllib.request

import numpy as np
import track_hand as htm
import time
import autopy
import cv2

url = "http://192.168.1.61/cam-hi.jpg"
wCam, hCam = 800, 600
frameR = 100
smoothening = 7
pTime = 0
plocX, plocY = 0, 0
clocX, clocY = 0, 0

# The local webcam is no longer needed:
# cap = cv2.VideoCapture(0)
# cap.set(3, wCam)
# cap.set(4, hCam)

detector = htm.handDetector(maxHands=1)
wScr, hScr = autopy.screen.size()

while True:
    # 1. Find hand Landmarks
    fingers = [0, 0, 0, 0, 0]
    # success, img = cap.read()

    # Fetch a JPEG frame from the ESP32-CAM stream instead
    img_resp = urllib.request.urlopen(url)
    imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)
    img = cv2.imdecode(imgnp, -1)

    img = detector.findHands(img)
    lmList, bbox = detector.findPosition(img)

    # 2. Get the tip of the index and middle fingers
    if len(lmList) != 0:
        x1, y1 = lmList[8][1:]
        x2, y2 = lmList[12][1:]
        # 3. Check which fingers are up
        fingers = detector.fingersUp()

    cv2.rectangle(img, (frameR, frameR), (wCam - frameR, hCam - frameR),
                  (255, 0, 255), 2)

    # 4. Only Index Finger : Moving Mode
    if fingers[1] == 1 and fingers[2] == 0:
        # 5. Convert Coordinates
        x3 = np.interp(x1, (frameR, wCam - frameR), (0, wScr))
        y3 = np.interp(y1, (frameR, hCam - frameR), (0, hScr))
        # 6. Smoothen Values
        clocX = plocX + (x3 - plocX) / smoothening
        clocY = plocY + (y3 - plocY) / smoothening
        # 7. Move Mouse
        autopy.mouse.move(wScr - clocX, clocY)
        cv2.circle(img, (x1, y1), 15, (255, 0, 255), cv2.FILLED)
        plocX, plocY = clocX, clocY

    # 8. Both Index and middle fingers are up : Clicking Mode
    if fingers[1] == 1 and fingers[2] == 1:
        # 9. Find distance between fingers
        length, img, lineInfo = detector.findDistance(8, 12, img)
        print(length)
        # 10. Click mouse if distance short
        if length < 40:
            cv2.circle(img, (lineInfo[4], lineInfo[5]),
                       15, (0, 255, 0), cv2.FILLED)
            autopy.mouse.click()

    # 11. Frame Rate
    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    cv2.putText(img, str(int(fps)), (20, 50), cv2.FONT_HERSHEY_PLAIN, 3,
                (255, 0, 0), 3)
    # 12. Display
    cv2.imshow("Image", img)
    cv2.waitKey(1)

Make sure you adjust the url variable in the preceding code to match the IP displayed on the Arduino IDE Serial Monitor. Also change the wCam and hCam variables to match the resolution being streamed.

When you run the code, the ESP32Cam’s wireless stream with mouse tracking should be visible and functional.

Gesture Controlled Mouse ESP32-CAM

As a result, we've created our wireless Gesture Controlled Virtual Mouse with ESP32-CAM and OpenCV.

Gesture Controlled Virtual Mouse ESP32-CAM

Conclusion:

I hope you all now understand how to design a Gesture Controlled Virtual Mouse with the ESP32-CAM and OpenCV. We will be back soon with more informative blogs.
