CLAW MACHINE
Hand Gesture-Controlled Robot in Robosuite
Akhilesh Basetty · Jiali Chen · Omar Lejmi
Arcade Claw Machine.
Powered by Your Hand.
We built a virtual claw machine game inside Robosuite's physics simulation, controlled entirely by your hand via webcam. Open your palm to move a Franka Panda robot arm around a table — close your fist to trigger the grab sequence and pick up the cube. We compared this hand-tracking interface against a keyboard baseline to evaluate which mode of human-robot control is more effective.
Three Steps to Robot Control
Hand movements become robot actions through a real-time perception → mapping → control pipeline.
Google MediaPipe processes each webcam frame to detect 21 hand landmarks. Palm center position is extracted and the hand is classified as open (move) or closed fist (grab trigger) by comparing fingertip distances to the wrist.
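The open/fist classification can be sketched over MediaPipe's 21-landmark output. The landmark indices (0 = wrist, 8/12/16/20 = fingertips, 5/9/13/17 = MCP knuckles) are fixed by MediaPipe Hands; the extended-finger heuristic and its threshold below are illustrative assumptions, not necessarily the project's exact rule:

```python
import math

# MediaPipe Hands landmark indices (fixed by the library)
WRIST = 0
FINGERTIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tips
KNUCKLES = [5, 9, 13, 17]      # corresponding MCP joints

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def classify_hand(landmarks, ratio_threshold=1.0):
    """Return 'open' or 'fist' from 21 (x, y) landmarks.

    Heuristic (an assumption): a finger counts as extended when its tip is
    farther from the wrist than its knuckle; a fist has few extended fingers.
    """
    wrist = landmarks[WRIST]
    extended = sum(
        dist(landmarks[tip], wrist) > ratio_threshold * dist(landmarks[mcp], wrist)
        for tip, mcp in zip(FINGERTIPS, KNUCKLES)
    )
    return "open" if extended >= 3 else "fist"

def palm_center(landmarks):
    """Palm center as the mean of the wrist and the four MCP knuckles."""
    pts = [landmarks[i] for i in [WRIST] + KNUCKLES]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))
```

In a live loop these functions would run on each frame's `hand_landmarks` from MediaPipe's hand-landmark model, with `(x, y)` taken from each landmark's normalized coordinates.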
Normalized screen coordinates (u, v) ∈ [0, 1] are linearly mapped to table-space XY targets in the Robosuite environment. Moving your hand left/right/up/down translates directly to the robot's workspace over the table.
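The linear (u, v) → (x, y) mapping might look like the following minimal sketch; the workspace ranges, the clamping, and the horizontal mirroring (so the robot moves the same direction as your hand from the user's point of view) are illustrative assumptions:

```python
def screen_to_table(u, v, x_range=(-0.25, 0.25), y_range=(-0.25, 0.25)):
    """Linearly map normalized webcam coords (u, v) in [0, 1] to a
    table-space (x, y) target. Ranges here are illustrative, not the
    project's measured workspace bounds."""
    u = min(max(u, 0.0), 1.0)  # clamp so the target stays over the table
    v = min(max(v, 0.0), 1.0)
    # Mirror u: the webcam image is flipped relative to the user's view
    x = x_range[0] + (1.0 - u) * (x_range[1] - x_range[0])
    y = y_range[0] + v * (y_range[1] - y_range[0])
    return x, y
```

Since the map is affine in each axis, the center of the camera frame (0.5, 0.5) lands on the center of the workspace, and hand motion scales uniformly across the table.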
A PID controller drives the Panda arm's end-effector to the target. A state machine handles two modes: joystick (hover over table, gripper open) and pickup sequence (descend → close gripper → lift).
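A minimal sketch of the per-axis PID step and the two-mode state machine. Gains, heights, and thresholds are illustrative placeholders; the -1/+1 gripper convention follows robosuite's usual open/close action encoding, but the project's exact state names and transitions are assumptions:

```python
class PID:
    """Per-axis PID; gains and dt are illustrative, not the project's values."""
    def __init__(self, kp=8.0, ki=0.0, kd=0.5, dt=0.02):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = None

    def step(self, error):
        self.integral += error * self.dt
        deriv = 0.0 if self.prev_err is None else (error - self.prev_err) / self.dt
        self.prev_err = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv


class ClawStateMachine:
    """JOYSTICK: hover over the table, gripper open.
    On a closed fist, run the pickup sequence DESCEND -> GRASP -> LIFT."""
    def __init__(self, hover_z=0.15, grasp_z=0.02, lift_z=0.25):
        self.state = "JOYSTICK"
        self.hover_z, self.grasp_z, self.lift_z = hover_z, grasp_z, lift_z

    def update(self, fist_closed, ee_z):
        """Return (target z, gripper action): -1.0 = open, +1.0 = close."""
        if self.state == "JOYSTICK":
            if fist_closed:
                self.state = "DESCEND"
            return self.hover_z, -1.0
        if self.state == "DESCEND":
            if ee_z <= self.grasp_z + 0.005:   # reached grasp height
                self.state = "GRASP"
            return self.grasp_z, -1.0
        if self.state == "GRASP":
            self.state = "LIFT"
            return self.grasp_z, 1.0           # close the gripper
        # LIFT: raise the cube, then hand control back to the joystick mode
        if ee_z >= self.lift_z - 0.005:
            self.state = "JOYSTICK"
        return self.lift_z, 1.0
```

Each control tick, the XY target comes from the hand-tracking mapping, the Z target and gripper command come from the state machine, and one `PID` instance per axis converts the position error into the end-effector action.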
See It In Action
Successful and failed grabs, both captured from live gameplay with the hand-gesture-controlled robot arm in Robosuite.
Results
Six participants each completed 5 trials per condition. A trial counts as a success if the cube is lifted above a height threshold after the pickup sequence.
We found no statistically significant difference (p = 1.0) between the two control methods — both achieved a 30% success rate. This suggests hand-gesture control is a viable alternative to keyboard control for teleoperation, with implications for AR/VR interfaces and imitation learning data collection.
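For intuition on the p-value: with 6 participants × 5 trials = 30 trials per condition and identical 30% success rates, the two conditions yield the same 2×2 contingency table row, so any exact test returns p = 1.0. A stdlib-only sketch using Fisher's exact test (an assumed choice of test, since the write-up doesn't name one; the 9/30 counts are inferred from the reported rates):

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]]
    (successes/failures per control condition)."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def p_table(x):
        # Hypergeometric probability of a table with x in the top-left cell
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Two-sided p: total probability of tables at least as extreme as observed
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

# 9/30 successes in both conditions (inferred from the reported 30% rate)
p = fisher_exact_two_sided(9, 21, 9, 21)
print(round(p, 3))  # 1.0
```

Because the observed table is the most probable one under the fixed margins, every possible table counts as "at least as extreme," and the probabilities sum to 1.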