Showcase & Demos

Beyond Chatbots: Real-Time Gesture Control for IoT

November 25, 2025
5 min read
Kemu Team

AI isn't limited to generating text. It can interact with the physical world in ways that feel like science fiction.

Most low-code automation tools work like a basic conversation. You send a request, wait for an API response, and get your result. But real-world applications (robotics, smart home control, interactive displays) need continuous, real-time data streams that don't pause for breath.

In this project, we're using Kemu to build a "Minority Report" style gesture controller that bridges computer vision and IoT in real time.

The Build: Where MediaPipe Meets IoT

What happens when you mix completely different domains? A live video feed runs through an advanced machine learning model, and the model's output is converted into hardware control signals. All visual, all connected.

1. The Eyes: Visual Input

It starts simple. A webcam node feeds a live video stream into the Track Hand service. That's it. No complex setup.
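For the curious, here's roughly what that node is doing under the hood. This is a minimal OpenCV sketch of the same frame-grab loop, assuming the default camera at index 0 (inside Kemu, none of this code is needed):

```python
import cv2

# Minimal stand-in for the webcam node: grab frames from the default
# camera in a loop and hand each one to the next stage.
cap = cv2.VideoCapture(0)          # device 0: the built-in webcam
while cap.isOpened():
    ok, frame = cap.read()         # frame is a BGR image as a numpy array
    if not ok:
        break
    cv2.imshow("webcam", frame)    # live preview window
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()
```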

2. The Brain: MediaPipe Hand Landmarker

Here's where the magic happens. The Track Hand node runs a MediaPipe Hand Landmarker model that doesn't just "see" a hand: it understands its full skeletal structure in 3D space as it moves.

Check the latency. Actually, you won't find any. As your hand moves, the digital skeleton matches it instantly. This is Kemu handling high-frequency data streams without breaking a sweat.

Performance note: The hand tracker's responsiveness depends on your hardware. This demo ran in the browser on the laptop's integrated GPU; the results shown were recorded on an M1 MacBook Pro.
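To make the landmark stream concrete, here's a sketch of what the Track Hand node does internally, using MediaPipe's Python Hands solution. The demo itself runs MediaPipe's browser build, but the model family and the 21-landmark output are the same:

```python
import cv2
import mediapipe as mp

# One tracked hand, with default confidence thresholds.
hands = mp.solutions.hands.Hands(
    max_num_hands=1,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)

def landmarks_for(frame_bgr):
    """Return the 21 hand landmarks for one frame, or None if no hand."""
    # MediaPipe expects RGB input; OpenCV captures BGR.
    results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    # Each landmark carries normalized x, y in [0, 1] plus a relative depth z.
    return results.multi_hand_landmarks[0].landmark
```

Feed each frame from the capture loop above through `landmarks_for` and you get the skeletal data stream that the rest of the graph consumes.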

3. The Logic: Pinch & Validate

Landmark data alone doesn't tell you what someone wants to do. We pipe it into logic widgets that calculate specific gestures (both are sketched in code below):

  • Pinch Detection: Spots when thumb and index finger meet

  • Z-Rotation: Tracks the angle of hand rotation

Suddenly, you've got a human hand acting as a physical dial.

Dial-like gesture
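Outside the canvas, both widgets reduce to a few lines of geometry on the landmark list from `landmarks_for` above. This is a sketch rather than Kemu's actual implementation; the indices follow MediaPipe's standard landmark map, and the pinch threshold is a value you'd tune:

```python
import math

# MediaPipe landmark indices: 0 = wrist, 4 = thumb tip,
# 8 = index-finger tip, 9 = middle-finger knuckle.
PINCH_THRESHOLD = 0.05  # in normalized image coordinates; tune to taste

def is_pinching(lm):
    """Pinch Detection: true when thumb tip and index tip nearly touch."""
    thumb, index = lm[4], lm[8]
    return math.hypot(thumb.x - index.x, thumb.y - index.y) < PINCH_THRESHOLD

def z_rotation_deg(lm):
    """Z-Rotation: angle of the wrist-to-knuckle vector in the image plane."""
    wrist, knuckle = lm[0], lm[9]
    return math.degrees(math.atan2(knuckle.y - wrist.y, knuckle.x - wrist.x))
```

Gate the dial on `is_pinching` so the lamp only responds while the user is "holding" it, then feed `z_rotation_deg` downstream.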

4. The Action: IoT Control

Now we connect to the physical world. In this demo, a Lamp service responds as you rotate your hand: the value updates and the lamp dims or brightens on the spot.

Note: The "Lamp" in the video is a simulation. In production, just swap this node for a real IoT integration: a Philips Hue bulb, a media server's volume controller, or a servo motor in a robotics rig.

Virtual lamp dimming
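As a sketch of that swap: if the node drove a Philips Hue bulb, the rotation value could be pushed through the bridge's local REST API. The bridge address, API key, and light id below are placeholders, and the angle range is a tuning choice:

```python
import requests

def angle_to_brightness(angle_deg, lo=-90.0, hi=90.0):
    """Map a hand rotation angle onto the Hue brightness scale (1-254)."""
    t = max(0.0, min(1.0, (angle_deg - lo) / (hi - lo)))
    return 1 + int(t * 253)

# Placeholder bridge IP, API key, and light id -- substitute your own.
LIGHT_STATE_URL = "http://192.168.1.10/api/<api-key>/lights/1/state"

def set_lamp(angle_deg):
    """Push the mapped brightness to the bulb via the Hue REST API."""
    requests.put(
        LIGHT_STATE_URL,
        json={"on": True, "bri": angle_to_brightness(angle_deg)},
        timeout=2,
    )
```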

Why This Actually Matters

This isn't just a cool demo. It shows two things that separate Kemu from typical workflow engines:

First, domain mixing made simple. We're seamlessly connecting computer vision (hand tracking) with IoT (hardware control) in one cohesive graph. No separate stacks for AI logic and hardware drivers. It just works together.

Second, you're not limited to our toolbox. The hand tracker and lamp services are custom services. Got a proprietary model or niche hardware device? Wrap it as a Kemu service and drag it onto the canvas. That's it.

A Glimpse Into the Future

This points toward something bigger: interfaces that feel fluid and invisible. Imagine gesture controls for smart homes that anyone can use, or touchless systems for sterile medical environments where physical contact is off-limits.

The foundation for that future? It's built with graphs like these. Start building yours.

Ready to get started with Kemu?

Build your own computer vision solutions without writing code. Start creating powerful ML and machine vision pipelines today.

Get Started