I built a small experiment over a 3-hour vibe coding session: a real-time T9 keyboard controlled by hand gestures, running entirely in the browser.
It uses:
YOLOX for gesture detection
ONNX Runtime Web for in-browser inference (rough sketch below the list)
Plain JS for the UI
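For context, this is roughly what the inference path looks like with ONNX Runtime Web. It's a minimal sketch rather than the demo's actual code; the model filename, the 640x640 input size, and the output decoding are my assumptions.

  // Minimal sketch (not the demo's code): load a YOLOX ONNX model and run
  // one preprocessed frame through it with ONNX Runtime Web.
  // Model path, input size, and output layout are assumptions.
  import * as ort from 'onnxruntime-web';

  const session = await ort.InferenceSession.create('yolox_gestures.onnx', {
    executionProviders: ['wasm'], // or 'webgl' / 'webgpu' where available
  });

  async function detect(pixels) {
    // pixels: Float32Array in CHW order, preprocessed to the model's expected scale
    const input = new ort.Tensor('float32', pixels, [1, 3, 640, 640]);
    const out = await session.run({ [session.inputNames[0]]: input });
    return out[session.outputNames[0]]; // raw boxes + scores, decoded elsewhere
  }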
The original goal was simple: Could I make real-time gesture-based input usable inside a browser without freezing the UI?
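The "without freezing the UI" part is the interesting constraint. One common pattern (an assumption about a reasonable setup, not a description of how the demo is wired) is to keep the ONNX session in a Web Worker and have the main thread drop frames while inference is in flight:

  // Main thread: post frames to a worker and skip frames while one is pending.
  // 'detector.js' (the worker owning the ONNX session) and drawGestures()
  // are hypothetical names.
  const worker = new Worker('detector.js', { type: 'module' });
  let busy = false;

  worker.onmessage = (e) => {
    busy = false;
    drawGestures(e.data); // update the T9 UI with the detected seal
  };

  function onFrame(video) {
    if (!busy) {
      busy = true;
      createImageBitmap(video).then((bmp) =>
        worker.postMessage({ frame: bmp }, [bmp])); // transfer, no copy
    }
    requestAnimationFrame(() => onFrame(video));
  }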
A few observations:
In-browser ML performance is better than I expected on modern laptops
Subtle gesture distinctions (e.g. similar seals like Tiger vs Ram) require stronger detection than MediaPipe provided — YOLOX performed noticeably better
Lighting consistency matters more than hand size
It’s obviously not production-grade, but it was an interesting exploration of browser-based vision input.
Curious what others think about gesture interfaces as alternative input systems.
Demo: https://ketsuin.clothpath.com/