COVER STORY
In a darkened loft in the industrial district of downtown Los Angeles, Gesture Studios CEO Kevin Parent slips on a pair of black gloves studded with iridescent white, purple, and yellow dots. Standing about 10 feet from a wall-size screen, he lifts his hands like a conductor. With a series of precise gestures, he calls up photos and videos of urban Los Angeles. Raising his thumbs and pointing his index fingers toward the screen as if miming a cowboy with two guns, he swiftly sorts the images, zooming in on certain buildings and playing snips of films depicting various street scenes. To pause the film, he extends one hand like a traffic cop. With other crisp movements, he can spin 3D objects in space or snatch a bullet point of text and drag it across the screen. "You just put on the gloves and go," Parent explains. "Think turbo PowerPoint."
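At its core, a system like the one Parent demonstrates maps recognized gestures to presentation commands. The sketch below is purely illustrative — the gesture names and deck actions are assumptions, not Gesture Studios' actual interface:

```python
# Illustrative sketch: dispatch recognized gesture names to slide-deck actions.
# Gesture labels ("point_right", "open_palm") and the Presentation class are
# hypothetical stand-ins for whatever a real tracker would emit.

class Presentation:
    def __init__(self, slides):
        self.slides = slides
        self.index = 0
        self.paused = False

    def handle(self, gesture):
        if gesture == "point_right" and self.index < len(self.slides) - 1:
            self.index += 1               # flick toward the screen: advance
        elif gesture == "point_left" and self.index > 0:
            self.index -= 1               # flick back: previous slide
        elif gesture == "open_palm":
            self.paused = not self.paused  # the "traffic cop" pause
        return self.slides[self.index]

deck = Presentation(["intro", "skyline", "street scene"])
deck.handle("point_right")   # advances to the second slide
```

The real system adds the hard part — recognizing the gestures from camera data — but the command layer on top can be this simple.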
The technology preview Parent arranged for BusinessWeek bears an eerie resemblance to a famous scene in Minority Report, Steven Spielberg's 2002 film featuring Tom Cruise as a cop under investigation for murder. Techies still talk about the wireless data gloves and clipped hand signals Cruise uses to sort through evidence on a giant screen at police headquarters. That interface is just what Gesture is selling to companies that create presentations at the 14,000 trade shows and conferences in the U.S. each year. The hardware and software will start at a few thousand dollars. Soon, anyone making a PowerPoint presentation to colleagues or business partners could operate the same setup, which uses cameras to track hand movements and translate them into computer instructions. The similarities to Minority Report are no coincidence: Gesture Studios is the brainchild of Massachusetts Institute of Technology wunderkind John Underkoffler, who helped Spielberg's production team design the scene in the movie.
GOODBYE TO THE REMOTE
Gesture's system, known as GoodPoint, showcases a technology called motion capture, which film studios and video game makers have used for years to make computer-animated characters appear more realistic. Now motion capture is bursting out of Hollywood and changing the way consumers interact with home electronics. Motion sensing is the secret ingredient in Nintendo's (NTDOY) wildly successful Wii game system, which lets you swing a wand in your living room to hit a home run in an animated ballpark on your TV set. Intel Corp. (INTC) is developing a more advanced version of motion capture that will let people wave at their TV sets from across the room to turn up the volume or change channels—no gloves or sensor dots required. Within five years "you could use gesture recognition to get rid of the remote control," predicts Intel Chief Technology Officer Justin Rattner. "We've done prototypes of that, as have other people." Rattner says body tracking—the whole body, not just the hands—could drive demand for Intel's important new generation of semiconductors, the superprocessors known as teraflop chips, which the company previewed in February.
Motion capture is starting to transform how businesses market their products as well as design and manufacture them. This spring the Las Vegas McCarran International Airport will set up large plasma screens with a motion-tracking component that lets advertisers bring pedestrians into their commercials. When you walk past a car ad, for example, the vehicle might move at the same speed you're walking. When you turn to look at the driver, he'll turn to look at you, and you'll be staring into an image of your own face. Dozens of blue-chip aerospace, auto, and heavy-equipment makers, from Lockheed Martin (LMT) to BMW to Caterpillar, already use motion tracking to let workers collaborate in shared virtual environments, sometimes when they are thousands of miles apart. Together they can test the ergonomics of a design for a car or a plane. "Any company that creates a product used by people needs to understand how the human body moves," says Iek van Cruyningen, head of securities at Libertas Capital Group, a specialist investment bank. "Motion-tracking systems and virtual simulations accelerate product development and boost productivity."
Motion capture marks a new stage in a revolution heralded back in the 1980s. That's when computer geeks first started talking about an immersive digital domain called virtual reality (VR). You may remember the first dorky VR goggles users donned to experience virtual worlds, the unabashed pronouncements about world-changing, computer-generated realms, and camp media depictions such as the 1982 Walt Disney (DIS) movie Tron. Some well-funded academic laboratories went ahead and developed these goggle-and-glove environments. But as a consumer application, virtual reality 1.0 was a bust. The hype was too loud, computers were too slow, networking was too complicated, and because of motion-sickness issues that were never quite resolved, the whole VR experience was, frankly, somewhat nauseating.
It was a classic tale of high-tech crash and burn. The technology arrived prematurely, dragged investors through bitter disappointment, and lost startups and their backers buckets of cash without ever yielding a return. VR 2.0, enhanced by motion capture, is different in many critical ways. Most important, the first batch of applications, such as the Wii, while still primitive, are easy to use, inexpensive, and hard to crash. You don't get anything close to a fully sense-surround experience, but neither do you feel sick after you put down the wand. The games are simple and intuitive, which is why a new stereotype has emerged in the game market: Wii-toting grandmas and grandpas.
Pioneers in VR 2.0 are likely to piggyback on Nintendo's strategy: Wow the audience with fairly simple ways to bring their bodies into the action. Gesture Studios' GoodPoint system enables a presenter to take audiences on a tour of a 3D architectural design or on a fly-through of a model city. And the presenter's measured theatrics make a big impression. "Everyone's looking for the new, sexy way to communicate with their employees and their clients. We're selling their ability to sell," says Tom Wiley, Gesture's director of business development and operations. Future versions of GoodPoint will help air-traffic controllers visualize complex flight configurations and let security personnel sift quickly through hours of video using hand gestures alone. There's even a diminutive glove or thimble version for use with a PC, which could end the decades-long tyranny of the keyboard and mouse.
The advertising industry is dreaming up uses for motion capture that literally stop consumers in their tracks. Recently, with little fanfare, Target (TGT), adidas Group, and Clorox (CLX) began running interactive ads on subway station walls in New York. One Target ad, a 6-by-20-ft. projection, featured snowflakes gently fluttering from the sky. It seemed unremarkable until you approached the wall. If you swiped your hand in the air, the background scene transformed from a wooded winter scene into a city skyline. And by waving both hands you could send the snowflakes into a swirl.
Adidas chose a similar approach for its ad in the entrance of the Mandalay Bay Resort & Casino in Las Vegas. The ad perked up when people walked by and responded with a shower of shoes. The more they gesticulated, the bigger the deluge became. "People don't ignore the ads—they want to play with them," says John Payne, president of Monster Media, which created the campaigns for adidas, Clorox, and Target. "It's like Willy Wonka."
Lockheed Martin, the world's largest defense contractor, has pushed motion sensing to even more exotic extremes in an effort to reduce design and manufacturing costs. The VR effort at its Ship Air Integration Lab in Fort Worth is headed by Pascale Rondot, a petite French Canadian who worked in a similar program at General Electric Co. (GE). Her lab is helping to develop new F-35 stealth fighter jets that can take off on land or at sea. Lockheed won a contract in 2001, now valued at $25.7 billion, and the first of the fleet of nearly 4,000 planes will be completed by 2010—part of a program valued at roughly $276 billion.
To help engineers, technicians, pilots, and Lockheed customers understand how the plane will perform, Rondot equips teams of visitors—up to four at a time—with VR headsets and suits dotted with motion-capture sensors. They enter a darkened 15-by-20-ft. area where 24 cameras track their every move. What the visitors "see" through their head displays are the fighter prototype and lifelike avatars of one another. They can walk through the prototype, crouch down to inspect or change a part, and practice physical routines they will replicate in real-world planes many months later. Nearby, in a separate area called the cave, Lockheed invites active-duty and retired Navy, Marine, and Air Force senior officers to view both the people and the virtual aircraft in the simulation.
Aeronautics veterans who hear about this program are sometimes skeptical. "When people cannot touch a prototype, it's always a hard sell. But then they see how our virtual world matches the real world and how much time and money we've saved," says Rondot with a Quebecois lilt. For one task—examining the approach speed of the plane as it lands—Lockheed was able to save $50 million in design changes and avoid 50% of the cost of the mockup by using the VR lab instead of traditional wind-tunnel tests. In a simulation, Rondot can bring Lockheed engineers together with far-flung counterparts at partner companies such as GE and Pratt & Whitney.
With profits of $2.5 billion on $39.6 billion in revenues, Lockheed can afford to splurge on motion capture. But even companies with much tighter budgets are drawn to the technology. At Ford Motor Co. (F) headquarters in Dearborn, Mich., engineers use motion tracking and simulation in both product design and manufacturing to reduce dependence on expensive metal prototypes. Manufacturing ergonomics expert Allison Stephens creates digital versions of Ford factories, then analyzes the reach and posture of people on the assembly lines to reduce risk of injury. At one pilot plant, such simulations have reduced the expected number of disability cases by 80%, says Stephens. Disability payments can run to tens of millions of dollars a year.
As users rack up successes, suppliers of motion-capture systems for factories could be in a sweet spot. "It's early, but such simulations could be one of the most profitable areas in the future," says Kathleen Maher, an analyst at Jon Peddie Research who has written reports about the auto industry's use of modeling software. Maher pegged the size of the field of "augmented reality"—simulations with some props—at $142 million in 2006 and says it is growing quickly.
Manufacturers may be the power users of VR 2.0, but Hollywood studios will continue to be the biggest patrons and innovators. Director James Cameron says the technology first had a huge impact on film in his Titanic, the highest-grossing movie of all time, which brought in upwards of $1.8 billion. Only a small number of actors in motion-capture suits were required to create the famous crowd scenes on the decks of the sinking ship. Digital data were collected and then duplicated to recreate the scene of the disaster.
The technology then spawned a slew of computer-generated creatures, including Gollum, the gurgling, wispy-haired ring thief in Peter Jackson's Lord of the Rings trilogy, and King Kong, by the same director. British actor Andy Serkis spent hours scampering around on set in sensor-studded suits to create Gollum's scuttling and Kong's signature grimaces. For such scenes in the past, animators might draw each new movement by hand, frame by frame. And that's not the only giant expense motion capture could address. During a movie shoot, "the cost of a day on set can range from $50,000 to $1 million," says Gary Roberts, a vice-president at House of Moves Inc., a Los Angeles motion-capture lab. "Motion capture shaves days off the overall production time." And the process could get even simpler. In Cameron's current film project, titled Avatar, sophisticated software may eliminate the need for sensor dots on actors' faces. Instead, one tiny camera on an actor's head cap tracks and interprets every twitch.
Companies that come up with such innovations stand to make small fortunes. "There is an arms race in entertainment," says Phil Sparks, an analyst at Evolution Securities Ltd. He's bullish on Vicon, part of Oxford Metrics Group, the largest company making motion-capture systems for entertainment and now many other clients. "They're no longer just mapping out an orc and a hobbit having a fight. They're now doing the expressions on dozens of actors' faces." Motion Analysis Corp., another motion-tracking player, built systems that were used for Kong and one of the Rings films and also outfitted Lockheed's lab.
Capturing actors' movements is potent stuff, says Cameron: "The technique frees performers from the limitations of body type, age, race, and gender. The essence of their spirit as actors can be infused into any physicality they or the filmmakers dream up." For example, you could tell a story that follows the same character from childhood to old age and have the same actor play the character at every stage—without makeup. Video game makers are also channeling mounds of money into motion capture. Tiger Woods donned a sensor suit to make his video game doppelgänger look more realistic.
WHAT THE WII WROUGHT
There is always the risk that the buzz about an emerging technology will get ahead of reality. Jackie Fenn at market research firm Gartner Inc. charts such "hype cycles" and notes that virtual reality has spent many years mired in something she calls the "trough of disillusionment." But PDAs and MP3 players also languished in this stage, she says, until the "beautifully designed interfaces" of the Palm Pilot and the iPod and services such as iTunes launched these devices into much wider public acceptance.
It's possible that Nintendo's Wii could do the same for gesture recognition. Four months after this revolutionary system arrived in the U.S., it is still trouncing the competition, according to market researcher NPD Group Inc. And it is cropping up in communities that never appreciated video games. One retirement home near Chicago has started a Wii bowling league, and game experts expect to see more such developments. Software giant Electronic Arts Inc. is rolling out golf and war games for the Wii, as well as a title based on The Godfather films. EA believes motion sensing will become standard in game controllers. "The Wii is helping debug this question about how you move in virtual ways," says Jaron Lanier, a scholar at the Center for Entrepreneurship & Technology at the University of California at Berkeley who is credited with coining the term "virtual reality." After a year with the Wii, society "will be better educated about the overlap of the virtual and the real world," he says.
The semiconductor market is responding to Nintendo's huge success. Demand for the special sensing chips used in the Wii will double, to $10 billion, by 2010, according to industry researcher Yole Development. Such chips, produced by the likes of Intel and STMicroelectronics, will be used in all kinds of industrial applications, not just home systems. But Intel expects to play an important role in promoting motion technology in the entertainment arena. And just as its processors helped fuel the PC revolution, Intel's next-generation chips and software could lower costs and jump-start a mass market for VR 2.0. The company is seeding such ideas in its community of application developers and even placing some of its own software in the public domain.
READING YOUR MOOD
One frontier lies in tracking facial expressions without costly and cumbersome equipment. Advertisers, for example, would like to have the ability to read the changing expressions on their customers' faces as they browse Web sites—or shop at the corner store. In August, 2006, Google Inc. (GOOG) acquired a small company called Neven Vision for an estimated $40 million. The startup has several patents on algorithms for tracking movements of key points on the face—the corner of the eye, the curves of the mouth—using a webcam, a mobile phone, or security cameras in bank machines and convenience stores. Google hasn't yet announced plans to create products based on these patents. But researchers at Stanford University have already come up with systems that can read such signs to tell whether a person is interested, happy, or annoyed. And they can even map those responses onto the face of an avatar in a virtual world.
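The core idea behind such systems is simple to sketch: once a tracker supplies the 2D positions of a few facial key points, the geometry between them hints at the expression. The point names and thresholds below are illustrative assumptions, not Neven Vision's or Stanford's actual method:

```python
# Minimal sketch of landmark-based expression reading. Inputs are (x, y)
# image coordinates (y increases downward) for three mouth key points, as a
# real face tracker would produce. Thresholds are illustrative assumptions.

def mouth_expression(left_corner, right_corner, center):
    avg_corner_y = (left_corner[1] + right_corner[1]) / 2
    lift = center[1] - avg_corner_y   # corners above center -> positive lift
    if lift > 2:
        return "smile"
    if lift < -2:
        return "frown"
    return "neutral"

mouth_expression((40, 98), (60, 98), (50, 103))   # raised corners: a smile
```

A production system would track dozens of points and classify with a trained model, but the principle — expression as geometry between tracked landmarks — is the same.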
In certain areas of medicine, motion technology can improve treatment quality. Howard J. Hillstrom, director of the Motion Analysis Laboratory at the Hospital for Special Surgery in New York, has used motion tracking to help people with disorders from cerebral palsy to arthritis. As patients fitted with sensors walk in front of cameras, Hillstrom's computer screen displays a skeleton in a 3D grid, and software helps the doctor analyze whether the patient is moving properly. "The number of baby boomers with arthritis will pass 60 million by 2020. Their movement is often the first thing to deteriorate," he says.
Tracking the surgical tools of several doctors in different locations can allow them to occupy the same simulation. "A surgeon could teach a resident in a remote village thousands of miles away, showing them how and where to move their hands," says Dr. Parvati Dev of Stanford University's Summit Lab, where residents learn to suture, cut, and cauterize virtual organs. Lately, doctors have also incorporated a technology called haptics—"force feedback" built directly into the surgical tools. When a surgeon probes or tugs on a gall bladder or some other tissue, he or she senses resistance (imagine pushing or pulling a spring). Dr. Dev's lab has done studies showing that residents trained on such simulations have a much shorter learning curve when they get into the operating room.
Tracking faces can save lives as well. Toyota (TM), Nissan (NSANY), and others have sponsored research at Stanford investigating the expressions drivers typically have five seconds before they fall asleep. These can be detected with simple cameras installed in a steering wheel or dashboard that trigger an alarm. Jeremy Bailenson and Cliff Nass, who are leading the research, say such cameras are now in cars coming off production lines in Japan. Siemens (SI) is developing similar "drowsiness detection" devices.
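The detection logic can be surprisingly compact: trigger only when the camera's eye-openness reading stays low for several consecutive frames, so an ordinary blink doesn't set off the alarm. The threshold and window length here are illustrative assumptions, not the values any carmaker uses:

```python
# Hedged sketch of "drowsiness detection": a dashboard camera yields a
# per-frame eye-openness score in [0, 1]; the alarm fires only when the score
# stays below a threshold for a sustained run of frames (blinks are too brief).

def drowsy(openness_per_frame, threshold=0.2, window=5):
    run = 0
    for score in openness_per_frame:
        run = run + 1 if score < threshold else 0
        if run >= window:
            return True    # eyes closed too long: sound the alarm
    return False

drowsy([0.8, 0.1, 0.9, 0.8])                 # a single blink: no alarm
drowsy([0.8, 0.1, 0.1, 0.1, 0.1, 0.1])      # sustained closure: alarm
```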
Demand for systems that watch the world in motion could be broad and compelling. At the Intel Developer Forum in Beijing in April, the chipmaker will preview something called sports summarization that is designed to make good use of Intel's superchips. Outfitted with such processors, your television could spot highlights such as a soccer goal, penalty kick, or any signature motion you desire. Then, instead of relying on the newscaster to recap each night of the World Cup, you would tell your TV to go back and find your players' finest moments or their heartbreaking mistakes.
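A crude stand-in for that kind of summarization is to scan a motion-intensity signal derived from the video and flag the moments where activity spikes well above average. The threshold below is an illustrative assumption, not Intel's actual technique:

```python
# Sketch of sports summarization: given per-second motion intensity extracted
# from a broadcast, return the timestamps where activity spikes far above the
# average -- a crude proxy for spotting a goal or penalty kick.

def highlights(motion, spike_factor=2.0):
    mean = sum(motion) / len(motion)
    return [t for t, level in enumerate(motion) if level > mean * spike_factor]

highlights([1, 1, 1, 9, 1, 1, 8, 1])   # seconds 3 and 6 stand out
```

A real system would match specific motion signatures rather than raw intensity, which is what makes it a job for teraflop-class processors.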
Motion tracking has all the marks of a disruptive technology, slinking onto the scene in unexpected ways. Sherry Turkle, a clinical psychologist and professor at MIT, talks about "the mirroring of body motion, and of course the subtle things like hand gestures, or the way someone characteristically cocks his head before speaking." Captured and incorporated into business and entertainment systems, "these motions will give us a much greater sense of connection with our online selves. The virtual will seem much closer to the real," she says. Imagine how you'll feel when your avatar smiles back from your computer screen just the way you smile.
By Aili McConnon