Control Web Pages with Hand Gestures Using Tampermonkey & MediaPipe

This tutorial shows how to inject JavaScript with Tampermonkey and use MediaPipe's hand‑gesture recognition to enable air‑gesture page scrolling, cursor movement, and click simulation, turning any web page into a touch‑free interface.

Rare Earth Juejin Tech Community

Introduction

Some readers asked whether Tampermonkey's ability to inject arbitrary front‑end JavaScript into a page could be combined with hand‑gesture recognition to control a page from a distance, much like touch‑free page turning on a phone. The author explored the idea and built a working implementation.

Features

Hand gestures scroll the page up and down: an open left hand scrolls down, and a closed left fist scrolls up.

A simulated cursor moves with the right hand.

A fist gesture with the right hand triggers a click.

Additional gestures include a two‑hand "peace" sign to close the current page, a left thumbs‑up together with the right hand to zoom, and many others.

Implementation Principle

The solution simply combines Tampermonkey with MediaPipe hand‑gesture recognition.

Tampermonkey

Tampermonkey is a browser extension that lets users inject custom JavaScript into pages at load time, enabling enhancement, modification, or automation of web behavior. It can be used for auto‑login, data scraping, ad blocking, and more.
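As a rough illustration of what such a script looks like, here is a minimal userscript skeleton. The metadata keys (`@name`, `@match`, `@grant`, `@run-at`) are standard Tampermonkey headers; the script name, the namespace URL, and the `main` function are hypothetical and not taken from the original project.

```javascript
// ==UserScript==
// @name         Hand Gesture Control (sketch)
// @namespace    https://example.com/
// @version      0.1
// @description  Hypothetical skeleton for a gesture-control userscript
// @match        *://*/*
// @grant        none
// @run-at       document-idle
// ==/UserScript==

// Entry point. It takes the window object as a parameter so the logic
// can also be exercised outside a browser.
function main(win) {
  // Skip iframes so the script starts only once per tab.
  if (win.top !== win.self) return false;
  console.log("[gesture-control] injected into", win.location.href);
  return true;
}

// Run only when a real browser window is available.
if (typeof window !== "undefined") {
  main(window);
}
```

Matching `*://*/*` runs the script on every page, which suits an "any web page" demo; a narrower `@match` pattern would be the safer choice for everyday use.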

MediaPipe Hand‑Gesture Recognition

MediaPipe provides a library of AI and ML tools, including hand‑gesture detection. The demo uses the @mediapipe/tasks-vision NPM package to obtain a gesture recognizer.

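// Import the task classes from the @mediapipe/tasks-vision NPM package
import { FilesetResolver, GestureRecognizer } from "@mediapipe/tasks-vision";

// Load the WASM runtime, then create the gesture recognizer task
const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
);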
const gestureRecognizer = await GestureRecognizer.createFromOptions(vision, {
  baseOptions: {
    modelAssetPath: "https://storage.googleapis.com/mediapipe-tasks/gesture_recognizer/gesture_recognizer.task"
  },
  numHands: 2
});
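Once a recognizer exists, each webcam frame has to be passed to it and the result split into left and right hands. The sketch below shows one way to do that, assuming the recognizer has been switched to video mode (for example with `runningMode: "VIDEO"` in the options) so that `recognizeForVideo` can be called. The helper names `handsBySide` and `startLoop` are hypothetical. The handedness field is read as `handedness` with a fallback to `handednesses`, since the name has differed between package versions.

```javascript
// Group the detected hands by MediaPipe's handedness label.
// Returns an object such as { left: [...21 landmarks], right: [...] }.
function handsBySide(result) {
  const sides = result.handedness ?? result.handednesses ?? [];
  const hands = {};
  (result.landmarks ?? []).forEach((landmarks, i) => {
    const label = sides[i]?.[0]?.categoryName; // "Left" or "Right"
    if (label) hands[label.toLowerCase()] = landmarks;
  });
  return hands;
}

// Run recognition on every new video frame and hand the result to a callback.
// Assumes `recognizer` is a GestureRecognizer in VIDEO running mode.
function startLoop(recognizer, video, onHands) {
  let lastTime = -1;
  function tick() {
    // Only process a frame when the video has actually advanced.
    if (video.readyState >= 2 && video.currentTime !== lastTime) {
      lastTime = video.currentTime;
      const result = recognizer.recognizeForVideo(video, performance.now());
      onHands(handsBySide(result));
    }
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```

Whether "Left" corresponds to the user's physical left hand depends on whether the camera image is mirrored, so the labels are worth checking on the target device before wiring them to actions.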

Combining Both

By injecting the MediaPipe gesture code via a Tampermonkey script, the gesture recognizer runs on any web page. The script requests camera permission; the video preview is hidden to avoid visual clutter.
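The camera setup described above might look roughly like the following. This is a sketch under assumptions rather than the author's code: the helper names `createHiddenVideo` and `startCamera`, the 640x480 resolution, and the choice to hide the preview with a 1-pixel transparent element are all illustrative. `getUserMedia` is the standard browser API and is only available on secure (HTTPS) pages.

```javascript
// Create a <video> element that stays in the page but is effectively invisible.
function createHiddenVideo(doc) {
  const video = doc.createElement("video");
  video.autoplay = true;
  video.muted = true;
  video.playsInline = true;
  Object.assign(video.style, {
    position: "fixed",
    width: "1px",
    height: "1px",
    opacity: "0",
    pointerEvents: "none",
  });
  doc.body.appendChild(video);
  return video;
}

// Ask for camera permission and attach the stream to the video element.
async function startCamera(video, mediaDevices) {
  const stream = await mediaDevices.getUserMedia({
    video: { width: 640, height: 480 },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
  return stream;
}
```

In a userscript the two helpers would be called as `startCamera(createHiddenVideo(document), navigator.mediaDevices)`. Passing `document` and `mediaDevices` in as parameters keeps the functions easy to exercise with stand-in objects.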

Gesture detection works by analyzing key point coordinates returned by MediaPipe. Simple distance checks differentiate gestures such as open hand, fist, and victory sign.

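// Helper assumed by the checks below: 2D Euclidean distance between two landmarks
function dist(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y);
}
// Determine if hand is open: all four fingertips are far from their base knuckles
function isHandOpen(hand) {
  const fingers = [[8,5],[12,9],[16,13],[20,17]]; // [tip, base] landmark indices
  return fingers.filter(([tip,base])=>dist(hand[tip],hand[base])>0.1).length>=4;
}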
// Determine if hand is a fist
function isFist(hand) {
  const fingers = [[8,5],[12,9],[16,13],[20,17]];
  return fingers.filter(([tip,base])=>dist(hand[tip],hand[base])<0.06).length>=3;
}
// Victory sign
function isVictory(hand) {
  const extended=[8,12];
  const folded=[16,20];
  return (
    extended.every(i=>dist(hand[i],hand[i-3])>0.1) &&
    folded.every(i=>dist(hand[i],hand[i-3])<0.05)
  );
}

The hand object comes from MediaPipe and contains the positions of key landmarks; custom logic maps these to desired actions.
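To make the mapping step concrete, the sketch below turns classified hands into a list of actions that follow the feature list: left open hand scrolls down, left fist scrolls up, the right hand moves the cursor, and a right fist clicks. It is an illustration under assumptions, not the original script. `SCROLL_STEP`, `decideActions`, `applyAction`, the use of landmark 9 (middle‑finger base) as the cursor anchor, and the mirrored x coordinate are all hypothetical choices. The classifiers are repeated in compact form so the block stands alone.

```javascript
const SCROLL_STEP = 100; // pixels per frame; hypothetical value
const FINGERS = [[8, 5], [12, 9], [16, 13], [20, 17]]; // [tip, base] indices

const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);
const isHandOpen = (hand) =>
  FINGERS.every(([tip, base]) => dist(hand[tip], hand[base]) > 0.1);
const isFist = (hand) =>
  FINGERS.filter(([tip, base]) => dist(hand[tip], hand[base]) < 0.06).length >= 3;

// Translate the current hands into abstract actions.
// `hands` looks like { left?: landmarks, right?: landmarks }.
function decideActions(hands, viewport) {
  const actions = [];
  if (hands.left) {
    if (isHandOpen(hands.left)) actions.push({ type: "scroll", dy: SCROLL_STEP });
    else if (isFist(hands.left)) actions.push({ type: "scroll", dy: -SCROLL_STEP });
  }
  if (hands.right) {
    // Landmark 9 stays fairly stable whether the hand is open or closed.
    const anchor = hands.right[9];
    // Mirror x so the cursor moves the same way as the hand in a selfie view.
    const x = (1 - anchor.x) * viewport.width;
    const y = anchor.y * viewport.height;
    actions.push({ type: "move", x, y });
    if (isFist(hands.right)) actions.push({ type: "click", x, y });
  }
  return actions;
}

// Apply one action to the page. "move" is left to the caller, which owns
// the simulated cursor element.
function applyAction(action, win) {
  if (action.type === "scroll") {
    win.scrollBy(0, action.dy);
  } else if (action.type === "click") {
    const target = win.document.elementFromPoint(action.x, action.y);
    if (target && typeof target.click === "function") target.click();
  }
}
```

Because recognition runs on every frame, a held fist would produce a click per frame with this sketch; a real script would likely need to debounce, for example by clicking only on the transition from open hand to fist.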

Further Learning

Explore the official MediaPipe demo and NPM package documentation for more gestures and features. For deeper Tampermonkey scripting, refer to the "Tampermonkey Script Practical Guide".
