Guide to Air Gesture

Since I can’t seem to find a generic term for the body tracking motions of the Kinect, I wanted to come up with my own set of basic gestures and their names. For now I’m calling these air gestures. Others have offered the name motion sensing or even just computer vision. These are just some of my rough ideas.

Hand Gestures:

Hold

This is a simple gesture that involves holding one hand in a location for a specific period of time. This could be used to select an item or change a mode of the UI.
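
As a rough sketch of how a hold might be detected, the function below checks whether the most recent hand positions stayed inside a small sphere for long enough. The function name, the (t, x, y, z) sample format in seconds and metres, and the default thresholds are all assumptions made for illustration; none of them come from the Kinect SDK.

```python
import math


def detect_hold(samples, radius=0.05, duration=1.0):
    """Return True if the hand has stayed put for at least `duration` seconds.

    samples: list of (t, x, y, z) tuples, oldest first (hypothetical format).
    radius: how far (in metres) the hand may drift from its latest position.
    """
    if not samples:
        return False
    t_end, *anchor = samples[-1]
    # Walk backwards in time from the newest sample.
    for t, *pos in reversed(samples):
        if math.dist(pos, anchor) > radius:
            return False  # the hand was somewhere else too recently
        if t_end - t >= duration:
            return True   # still inside the sphere this far back
    return False          # not enough history yet


steady = [(i * 0.1, 0.0, 0.0, 2.0) for i in range(13)]
print(detect_hold(steady))
```

The radius and duration would need tuning per application: a larger radius tolerates shaky hands but makes accidental holds more likely.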

Double Hold

This gesture is pretty much the same as hold, except that it involves two hands. It is currently used in the calibration pose, but could also be used to enable a double-click-like selection or advanced interface mode switching.


Push

The push gesture is the closest one to a mouse click or touch available to a user. The push gesture involves a user moving a hand forward rapidly. The distance the hand moves, and over how much time, could be set by the application. Different kinds of pushes could result in different experiences. Let’s use Google Maps as an example: a slow forward push could be handled as a zoom out, while a slow pull back could be a zoom in. Then a quick push could select an item in the UI. A double push could back out.


Pull

The pull gesture is just like the push, only in reverse. A quick retraction of the hand could be used for “back” behavior in a UI. The challenge with push and pull is that they could easily interfere with each other, and there is also a high chance of false positives. That being said, with the proper tweaking of the gesture detection code I think this could be smoothed out.
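
To make the "distance over time" idea concrete, here is a minimal sketch that classifies a quick push or pull from the hand's depth alone. It assumes z is the hand's distance from the sensor in metres, so a push toward the screen makes z smaller. The function name, sample format, and thresholds are hypothetical.

```python
def classify_push_pull(samples, min_distance=0.15, max_duration=0.5):
    """Return 'push', 'pull', or None for the latest hand movement.

    samples: list of (t, z) tuples, oldest first (hypothetical format).
    A gesture counts only if the hand covered `min_distance` metres
    within the last `max_duration` seconds.
    """
    if len(samples) < 2:
        return None
    t_end, z_end = samples[-1]
    for t, z in reversed(samples[:-1]):
        if t_end - t > max_duration:
            break  # older samples are outside the time window
        dz = z_end - z
        if dz <= -min_distance:
            return "push"  # hand moved toward the sensor
        if dz >= min_distance:
            return "pull"  # hand moved away from the sensor
    return None


print(classify_push_pull([(0.0, 2.0), (0.1, 1.9), (0.2, 1.7)]))
```

Raising `min_distance` or adding a short cool-down after each detection are two simple ways to cut down on false positives, since the hand naturally pulls back after every push.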

Swipe (up, down, left, right)

The swipe gesture is the tracking of a hand moving quickly in one direction. Moving a hand up rapidly or down rapidly could be used to scroll a list of information on a screen up or down. The left or right swipe could be used to either rotate a 3D object or navigate through linear pages of information. The up and down gestures could also be used to zoom in and out of an interface.
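
A swipe can be sketched as "a large enough movement, fast enough, mostly along one axis". The example below assumes x grows to the right and y grows upward, with positions in metres. The function name and thresholds are made up for illustration.

```python
def classify_swipe(start, end, elapsed, min_distance=0.3, max_duration=0.5):
    """Return 'up', 'down', 'left', 'right', or None.

    start, end: (x, y) hand positions; elapsed: seconds between them.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    # Too slow or too small to count as a swipe.
    if elapsed > max_duration or max(abs(dx), abs(dy)) < min_distance:
        return None
    # The dominant axis decides the direction.
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"


print(classify_swipe((0.0, 0.0), (0.4, 0.05), elapsed=0.3))
```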

Raise Hand

This gesture is the motion of taking a hand that is down and raising it up. It could be used as a quick acknowledgment of a message, for example if the UI required the user to agree to something or accept a prompt. The raised hand has a positive feel to it. Raising a hand could also be used to pick one user out of several: the raised hand could identify which person the application should be addressing. Raising a hand could also be used to pause an active application and bring up help information.
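
One simple way to pick out who has a hand raised, assuming the tracker reports a head and a hand position per user with y pointing up, is to compare the two heights. The data layout and the function name here are hypothetical.

```python
def users_with_raised_hand(users, margin=0.1):
    """Return the ids of users whose hand is above their head.

    users: dict of user id -> {"head": (x, y, z), "hand": (x, y, z)}
    margin: how far (in metres) above the head the hand must be.
    """
    return [
        uid
        for uid, joints in users.items()
        if joints["hand"][1] > joints["head"][1] + margin
    ]


tracked = {
    1: {"head": (0.0, 1.6, 2.0), "hand": (0.2, 1.9, 2.0)},
    2: {"head": (1.0, 1.7, 2.5), "hand": (1.2, 1.0, 2.5)},
}
print(users_with_raised_hand(tracked))
```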

Squeeze and Release

This is not the gesture Darth Vader uses to crush his enemies’ throats at a distance; instead it is a two-handed gesture. It’s performed by either bringing two hands together, the squeeze, or by pulling two hands apart, the release. The squeeze and release is the air gesture equivalent of the pinch-to-zoom behavior of a multi-touch screen. The issue with this two-handed gesture is that in order to repeat it in one direction you first need to move your hands back in the other direction, which could be read as the opposite gesture. This could be controlled by having the hands located above or below a halfway point to determine which gesture to process. The other option could be to define it based on the speed of the motion: the gesture is active only if the hands move a specific distance over a specific time frame.
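
The speed-based option can be sketched as follows: measure how much the gap between the hands changed over a window, and ignore anything that took too long, so that slowly resetting the hands does not trigger the opposite gesture. The function name, sample format, and thresholds are assumptions for illustration.

```python
import math


def classify_squeeze_release(samples, min_change=0.2, max_duration=0.75):
    """Return 'squeeze', 'release', or None.

    samples: list of (t, left_hand, right_hand), oldest first, where each
    hand is an (x, y, z) position in metres (hypothetical format).
    """
    if len(samples) < 2:
        return None
    t0, left0, right0 = samples[0]
    t1, left1, right1 = samples[-1]
    if t1 - t0 > max_duration:
        return None  # too slow: treat it as the hands being repositioned
    change = math.dist(left1, right1) - math.dist(left0, right0)
    if change <= -min_change:
        return "squeeze"  # hands came together
    if change >= min_change:
        return "release"  # hands moved apart
    return None


wide = ((-0.3, 0.0, 2.0), (0.3, 0.0, 2.0))
narrow = ((-0.1, 0.0, 2.0), (0.1, 0.0, 2.0))
print(classify_squeeze_release([(0.0, *wide), (0.4, *narrow)]))
```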


Hover

Slightly different from the hold gesture, the hover gesture is generated whenever a hand is over a UI element on the screen. For a quick-action interface, a series of objects could be placed around the sides of the screen. As soon as the user’s hand enters one of those elements, an action is fired off. This would be useful for games or anything that requires a very quick interaction model.
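
Once the hand position has been mapped to screen pixels, hover boils down to a hit test against each element's rectangle. This sketch assumes that mapping has already happened; the element names and the (left, top, width, height) layout are invented for the example.

```python
def hovered_element(hand_xy, elements):
    """Return the name of the element under the hand, or None.

    hand_xy: hand position in screen pixels.
    elements: dict of name -> (left, top, width, height) in pixels.
    """
    x, y = hand_xy
    for name, (left, top, width, height) in elements.items():
        if left <= x < left + width and top <= y < top + height:
            return name
    return None


# Hypothetical buttons along the left and right edges of a 1280x720 screen.
buttons = {"jump": (0, 260, 120, 200), "fire": (1160, 260, 120, 200)}
print(hovered_element((1200, 300), buttons))
```

An application would fire the element's action when the result changes from None to a name, rather than on every frame the hand stays inside.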


Wave

Moving a hand back and forth quickly has been used to give that hand focus. It could also be used to switch between modes of an application experience. I think the wave is also an interesting gesture to use as an undo. Wave could also be used as a way to end a session. For example, begin a user session by raising your hand, proceed to interact with an application, and then wave goodbye when done.
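
A wave can be approximated by counting how many times the hand reverses its horizontal direction after travelling a minimum distance. The sketch below works on a short window of x positions in metres. The function name and thresholds are assumptions, and a real detector would also limit how much time the window covers.

```python
def detect_wave(xs, min_swing=0.1, min_reversals=3):
    """Return True if the x positions contain enough back-and-forth swings.

    xs: successive hand x positions, oldest first.
    min_swing: how far the hand must travel before a swing counts.
    """
    if not xs:
        return False
    reversals = 0
    direction = 0    # +1 moving right, -1 moving left, 0 not yet known
    extreme = xs[0]  # farthest point reached in the current direction
    for x in xs[1:]:
        delta = x - extreme
        if direction == 0:
            if abs(delta) >= min_swing:
                direction = 1 if delta > 0 else -1
                extreme = x
        elif delta * direction > 0:
            extreme = x             # still travelling the same way
        elif abs(delta) >= min_swing:
            direction = -direction  # swung back far enough: a reversal
            extreme = x
            reversals += 1
    return reversals >= min_reversals


print(detect_wave([0.0, 0.2, 0.0, 0.2, 0.0]))
```

Small jitter never counts as a reversal because the hand has to come back at least `min_swing` from its farthest point.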

Draw Shapes

Circle: Moving a hand to draw a circle in a clockwise or counterclockwise motion could be used to rotate an item on the screen, or to speed up or slow down an item.

Triangle: The points, drawn with a slight pause at each one, could be used with a video player to skip ahead or backwards 30 seconds, depending on whether the triangle is drawn clockwise or counterclockwise.

Arc: A partial circle, which is a natural shape for a moving hand to draw, could also work as a navigation metaphor within an application.
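
For the circle shape, one possible approach is to add up how far the hand has turned around the centre of its own path. The sketch assumes (x, y) points with y pointing up, so "counterclockwise" is meant in that coordinate frame and mirrored input would flip the label. The function name and the three-quarter-turn threshold are arbitrary choices.

```python
import math


def circle_direction(points, min_turn=1.5 * math.pi):
    """Return 'clockwise', 'counterclockwise', or None.

    points: list of (x, y) hand positions, oldest first, y pointing up.
    min_turn: total rotation (radians) needed to count as a circle.
    """
    if len(points) < 3:
        return None
    # Use the centroid of the path as the centre of rotation.
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    angles = [math.atan2(y - cy, x - cx) for x, y in points]
    total = 0.0
    for a, b in zip(angles, angles[1:]):
        step = b - a
        # Unwrap jumps across the +/- pi boundary.
        if step > math.pi:
            step -= 2 * math.pi
        elif step < -math.pi:
            step += 2 * math.pi
        total += step
    if total >= min_turn:
        return "counterclockwise"
    if total <= -min_turn:
        return "clockwise"
    return None


loop = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
        for k in range(13)]
print(circle_direction(loop))
```

Lowering `min_turn` would let the same function pick up an arc, at the cost of confusing it with a curved swipe.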

Air Gesture Phrases (Combinations)

I’m not going to spend a lot of time on this one right now, but the next step is to combine the gestures above into something similar to sign language. These gesture phrases could be strung together to accomplish complex user tasks. Here’s an example: raise hand + hover over a word (selects text) + swipe left or right to select a larger range of text, then squeeze to copy. Move the hand over another portion of the text, hover until selected, then use the release gesture to paste.
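
A gesture phrase could be matched by keeping a log of recognized gestures and checking whether the most recent ones spell out the phrase without long pauses in between. In this sketch the gesture names, the (t, name) event format, and the two-second gap are all assumptions for illustration.

```python
def match_phrase(events, phrase, max_gap=2.0):
    """Return True if the latest events match `phrase` in order.

    events: list of (t, gesture_name) tuples, oldest first.
    phrase: list of gesture names that make up the phrase.
    max_gap: longest pause (seconds) allowed between two gestures.
    """
    if not phrase or len(events) < len(phrase):
        return False
    recent = events[-len(phrase):]
    if [name for _, name in recent] != list(phrase):
        return False
    return all(b[0] - a[0] <= max_gap for a, b in zip(recent, recent[1:]))


COPY_PHRASE = ["raise_hand", "hover", "swipe_right", "squeeze"]
log = [(0.0, "raise_hand"), (1.0, "hover"), (2.5, "swipe_right"), (3.0, "squeeze")]
print(match_phrase(log, COPY_PHRASE))
```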


Next I’ll spend some time on whole body tracking.