The Potion for Motion: Interactive Interfaces / Apps (SXSW)

Hello there! I’m at SXSW (South By Southwest) 2012 this weekend, going to the interactive conference. Anyway, I tend to take copious notes when I go to talks. I thought they might be useful to a wider audience, so here you go. You can see other SXSW 2012 posts that I’ve made as well.

The Potion for Motion: Interactive Interfaces / Apps

2005, Hillcrest developed ‘Home’, the first TV interface for motion. Innovation is a good thing, but it can create confusion.
Products that benefit from motion:

  • smart tvs
  • mobile devices
  • game consoles
  • set top boxes
  • pc peripherals

Motion sensors are becoming more commonplace. Every smartphone has an accelerometer, for example. Marketing has caused some confusion, so I’ll clear it up.

What is motion?

(Shows a photo of a man pointing.)
Natural motion, gesture, control knobs, pointing
About as many of you said this is a gesture as said it is pointing. This is a key misperception. Motion is more than gestures.
(Shows a photo of a woman running.) Not a gesture. Natural motion.
(Shows Spock’s ‘V’ gesture.) A gesture.
Motion is more than gesture. Using “gestures” to mean all motion is an oversimplification. No more than 5-7 gestures are easy to learn in an interface. There are probably only 5-7 on your smartphone: swipe up, zoom in/out. Gestures also require a lot more processing, because the system has to account for both the gesture itself and the variation between the people performing it, which makes it harder for the system to understand.

  • natural motion
  • pointing, cursor control
  • virtual controls
  • gestures

Walking, running, hitting a ball – all natural motions. Pointing is a specific motion we identify at an early age. Before you’re 11 months old, you learn how to point to indicate simple interest or selection. Pointing is not a gesture per se, it’s a complex mode of motion that is distinct from the others.
Virtual controls are an interesting way to interact. Even in today’s more advanced interfaces, with higher processing capability, linear functions like volume and a fader are still well suited to virtual controls.
How to add motion to a product: start by defining the problem.

  • What is the primary function? What’s the priority? What comes first?
  • What are the use cases? Are you trying to do gaming? A health interface? A television UI?
  • Who is the target customer? An enterprise or business user is very different from a consumer looking for entertainment. A good example of this is the BlackBerry: it had a keyboard and was tuned to enterprises. There is a trade-off between efficiency and ease of use. The BlackBerry was tuned for efficiency, and had to be learned. The iPhone touch interface is less efficient, but it ended up being easier across a wide range of applications and for the general consumer.
  • How much should it cost? If it’s a consumer product that has to cost under $100, the motion sensor has to be a small part of the cost. If you have a larger target price, you can do more with motion.
  • Where will we sell it?
  • When do we need to ship it?

Remote control design could be done independently of the device itself. The remote control is typically designed by a different department than the one that designs the TV. This is not a good model for development of motion interaction.

Motion control is part of a system.

Humans are a system. A complex system with sensors. When you add motion control to a technology product, you necessarily make it complex.
Sensor detects human – motion sensor or optical sensor – then it has to process that motion and determine what action the UI has to take based on that. Then, the human observes the interface change on screen, and processes it. That closed loop has to happen roughly 25 times a second, or once every 40 ms or less. If you’ve used Kinect or other emerging motion interfaces, the system responds a little sluggishly. We call that control by a piece of spaghetti – if the delay is too much, the interface is sluggish and it isn’t real time.
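As a rough illustration of that timing constraint, here is a minimal sketch of a latency budget check. It assumes the loop rate means 25 updates per second (which is what matches the 40 ms figure); the stage names and their timings are hypothetical and do not come from the talk.

```python
LOOP_RATE_HZ = 25  # closed-loop updates per second (1000 ms / 25 = 40 ms)

def frame_budget_ms(rate_hz):
    """Time available for one full pass through the loop, in milliseconds."""
    return 1000.0 / rate_hz

def remaining_budget_ms(stage_ms, rate_hz=LOOP_RATE_HZ):
    """Budget left after summing every stage; a negative value means lag."""
    return frame_budget_ms(rate_hz) - sum(stage_ms.values())

# Hypothetical per-stage costs in milliseconds, for illustration only.
stages = {"sense": 8.0, "filter": 4.0, "ui_update": 12.0, "display": 10.0}

print(frame_budget_ms(LOOP_RATE_HZ))  # 40.0
print(remaining_budget_ms(stages))    # 6.0
```

If any one stage grows enough to push the remainder below zero, the whole loop misses its window, which is the "piece of spaghetti" feeling described above.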

Critical Design Factors for Motion

  1. Performance of sensors – sensitivity, drift, non-linearity, aging. (Cost is a significant driver; a 3D camera is more expensive than an inertial system.)
  2. System responsiveness – latency, wake-up time, gain settings. If you had to move your mouse the same distance the cursor moves on screen, a large screen would be very fatiguing.
  3. User control – accuracy, orientation free, tremor. Humans have tremor naturally, and we all have a different tremor signature. To create an accurate experience, you have to remove the unintended motions, and only translate the intended motion.
  4. Cost – BOM, integration, support
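To make the gain and tremor points concrete, here is a minimal sketch. It uses a plain exponential low-pass filter as a stand-in for tremor removal and a fixed gain to turn hand rotation into cursor travel. Both choices, and every name and number below, are assumptions for illustration; the talk did not describe how Hillcrest actually does this.

```python
def smooth(samples, alpha=0.2):
    """Exponential low-pass filter: damps small, fast jitter such as tremor.

    A smaller alpha smooths more but also adds lag, so it trades off
    against the latency budget.
    """
    out, prev = [], None
    for s in samples:
        prev = s if prev is None else prev + alpha * (s - prev)
        out.append(prev)
    return out

def to_cursor_delta(angle_deltas_deg, gain_px_per_deg=40.0):
    """Scale hand rotation (degrees) into cursor travel (pixels)."""
    return [gain_px_per_deg * d for d in angle_deltas_deg]

# Hypothetical per-frame hand rotation in degrees, with some jitter.
raw = [0.5, 0.7, 0.4, 0.6, 0.5]
print(to_cursor_delta(smooth(raw)))
```

With a gain above one pixel per unit of hand movement, a small wrist motion covers a large screen, which is the fatigue argument in item 2. A real system would need something smarter than a fixed filter to separate intended from unintended motion.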

History of Natural UI innovation

He founded Hillcrest Labs 10 years ago, with the goal of using new technology for the control of television UIs. From 2003 to 2008, they developed one of the first OSes built from the ground up to use natural motion and pointing interfaces to control platforms. Worked with Roku and LG on motion interfaces most recently.

More context

Televisions have traditionally used 50-button remote controls. We’re hoping that will go way down in the next few years. The remote controls have made television a complicated platform when it is intended to be extremely simple.
The evolution of TV from ‘dumb’ 1939-2009, to ‘smart’, 2009 on (introduced by Samsung). It started out with just a few channels, limited to video and audio. Bumped up number of channels with cable and satellite. Now, with the internet, we can change the nature of television with so many more content sources.
Is SmartTV really smart? Smart can mean elegant. Matthew May, “In Pursuit of Elegance: Why the Best Ideas Have Something Missing.” His premise: something is smart or elegant if it solves a problem in a particular way. A simple solution to a simple problem is not elegant. A complex solution to a complex problem is not elegant either. A simple solution to a complex problem is elegant. I found that the smart TVs are a complex solution to a complex problem: smart TV tried to be a nexus between broadcast, broadband, and personal media. Their remote controls have gotten worse than the 50-button remote controls. Google TV has a 90-button remote control: this is going the wrong way.
When Google started to launch, and Samsung had already launched, we surveyed users and the literature. Consumers complained Google TV was too complicated. Wall Street Journal called it a geek product, not mainstream. “A confusing UI nightmare,” “too complex,” on and on. We were not creating elegance.
Recommends “The Laws of Simplicity,” John Maeda. He talked about interfaces in the same way Matthew May did. “Simplicity is about subtracting the obvious and adding the meaningful.” So we asked, what was obvious about the TV remote control? It was designed to be incredibly easy for anybody to use: anybody can push a button. 4 buttons: north, south, east, west. A focused state interface – easy, but not efficient: how many control elements can you put on screen? With HD TVs with more pixels, you can fill every pixel, but with up, down, left, right, you’re going to have to hit the button an extreme number of times.
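The button-press cost of a focus-based interface is easy to estimate. This sketch assumes items are laid out in a simple grid, that each press moves focus exactly one cell, and that there is no wraparound; the grid sizes are made up for illustration.

```python
def dpad_presses(start, target):
    """Presses needed to move focus between two (row, col) grid cells."""
    return abs(target[0] - start[0]) + abs(target[1] - start[1])

def worst_case_presses(rows, cols):
    """Presses from one corner of the grid to the opposite corner."""
    return dpad_presses((0, 0), (rows - 1, cols - 1))

print(dpad_presses((0, 0), (3, 5)))  # 8
print(worst_case_presses(10, 20))    # 28
```

The worst case grows with rows plus columns, so the more items you put on a high-resolution screen, the more presses a four-way remote needs. Pointing at the target directly avoids that step-by-step traversal.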
We decided Smart TV needs a natural UI. Why can’t I point at which album I want, which movie I want? We decided the change would be dramatic if we could bring pointing to the screen. Why does it have to be a square box with 50 buttons? If up / down / left / right go away, what happens? Remote control, every time you want a new function, it’s a new button. But you’re supposed to be watching the TV, not looking down at the remote. Why do I have to look down to find pause?
We subtracted the obvious: we took all buttons off except for a few. Select, back, scroll wheel. I’ll show you videos of this in action. Our perspective is to subtract the obvious, subtract the buttons. Add meaningful: pointing and dynamic control.

The evolution of the TV Human Interface

  1. Natural – only a few buttons, you point primarily
  2. Elegant – don’t bring the pc or smart phone to the television. Make the design relevant to the experience. In the GUI, we subtracted menus and made them relevant.
  3. Fluid – seamless flow from page to page. Our brains are wired to form mental maps of our spatial environment with motion cues – so why do apps flashcut from one content type to another without a fluid dynamic motion map / a spatial navigation of our content environment?
  1. TV is about video first, but must help users find, consume, use content and apps.
  2. TV is not a PC: entertainment, not productivity
  3. TV is a “Lean Back” experience – TV should do most of the work to present things to me. Find, graze, browse. If I know exactly what I want, there should be a seamless search tool to find it. If I know generally what I want, a drama or an action flick, that should be possible too. If I want to browse, I should be able to traverse large amounts of content in a short period of time.
Use cases to take advantage of natural motion interface
  • Group of friends watching a soccer game. Beers, food. Pizza delivery guy arrives. They have to find the remote to pause it. Someone has to put his beer down if he’s picking up a keyboard that needs both hands.
  • Watching a movie on a date. Doorbell rings, have to put the movie on pause. Pick up the smart TV remote. Touchscreen, buttons: have to fumble to find pause. Prefer not to do this. One-handed operation is smart: the other arm stays free to hug your wife.

They did years of laboratory research and found that natural motion interfaces based on pointing via inertial control are an excellent choice.
Their device has accelerometers and gyroscopes. You don’t have to look at the device, just wiggle it around. Idea of an air mouse has been around, nothing new. This device… I hold my arm out and it keeps the cursor dead still – not possible with a laser pointer. It dynamically removes human tremor; lets me control a single pixel on an HD screen; can distinguish between intended and unintended motion. If he turns the device on its side and moves it up / down / right / left, up is always up: you don’t have to worry about which way is up on the device.
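One way to get the "up is always up" behavior is to estimate how far the device is rolled in the hand and rotate its motion back into the screen's frame. The sketch below is an assumption about how this could work, not a description of Hillcrest's device. It assumes the input is the direction of gravity in the device's own x/y axes, reading (0, -1) when held upright, and that roll is measured counterclockwise.

```python
import math

def roll_from_gravity(gx, gy):
    """Roll angle in radians from the gravity direction in the device frame.

    Assumed convention: gravity reads (0, -1) when the device is upright.
    """
    return math.atan2(-gx, -gy)

def to_screen_frame(dx, dy, roll_rad):
    """Rotate a device-frame motion vector into screen coordinates."""
    c, s = math.cos(roll_rad), math.sin(roll_rad)
    return (c * dx - s * dy, s * dx + c * dy)

# Device rolled a quarter turn: its own x axis now points toward screen-up.
roll = roll_from_gravity(-1.0, 0.0)
print(to_screen_frame(1.0, 0.0, roll))  # approximately (0.0, 1.0)
```

With this correction, motion along the device's sideways axis is reported as upward motion on screen whenever the device is lying on its side, so the user never has to think about how the remote is oriented.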
Camera based systems have to be line-of-sight, and they have to see you. Having a device that can be under a blanket is better. Voice is also an interesting application, but one tool is not the answer.
Maslow’s Hammer: the golden hammer. If all you have is a hammer, the whole world looks like a nail. Really important to avoid this. We come up with a new application: iPhone, Siri, Wii. Avoid the temptation to use one solution or approach for all the problems.
The integrated natural interface: think of it as a table setting. There’s generally more than one utensil: a fork, the knife, the spoon.

  • The fork – a handheld motion remote – the workhorse.
  • The knife – voice input – efficient for a particular function. Text and navigation.
  • The spoon – the camera – special applications like Skype and gaming.

When you design a new user interface, you should think about using all the utensils and use them in the right way.

Summary

Digital Media Industry Priorities
  • A natural interface that supports find, graze, and browse of content and apps.
  • Want it to be elegant
  • Want it to be fluid.

This can be applied to a wide range of applications, not just entertainment. Military, robotics, industrial, mobile – all can benefit from natural motion.

Natural motion will transform UIs:
  • Motion encompasses more than just gestures
  • It makes interactions more efficient and more natural
  • It is more fun to use than existing button-based interfaces
Adding natural motion to products is not easy
  • It is a systems problem
  • The systems issues are often not related to physics but psychology
  • Optimized use of motion requires a good device and UI.
Hillcrest Labs can help
  • We already solved the systems problems
  • We designed end-to-end systems using motion for 8+ years
  • Our team can guide you and make you a success



Questions:

When is this kind of interface wholly inappropriate?
  • Camera interfaces are generally inappropriate for lean back control. I don’t want to wave my hands around from the couch. If the number of commands exceeds 5-7, you can’t learn the interface, so stay away from gestures.
  • In the operating room, sterile environments – a challenge to have these sensors.
  • Harsh environments – sensors need to be more robust.
Is the lean back experience changing? Lean back in some contexts, lean forward social experience in others…? How do you apply some of the things you’ve talked about to multi-screen experiences?

TV is still a central experience: a group or a community experience. But you want to do some things personally. Maybe you want to look up the actress on your tablet, not up on the central screen. Or maybe you want to text someone about the show without disturbing everyone else watching. Ultimately, TV is about sitting back and watching the movie. If you want to change the volume or pause quickly, those behaviors have to be very simple. I think we’ll see other devices overlaid on top of this natural system to augment the experience and create new forms of lean forward activities during the lean back experience.
We think about the separation between personal and community. Personal may be more lean forward.
