The quest for the perfect game face

Video game designers are racing to create characters that feel real. Now, if they could only turn digital figures into flesh and blood.

Published February 19, 2008 12:40PM (EST)

Early in his career, David Cage made the decision to discard what he calls the "traditional mechanics" of game design -- a puzzle, a solution, and a chest of gold.

Cage, who is fond of saying he is in the business of "creating emotion," is best known as a video game designer. A few years ago, his studio, Quantic Dream, released its first major title, "Indigo Prophecy"; it went on to sell more than 700,000 copies.

The success, Cage recently explained, was spurred by the game's open-endedness: in "Indigo," plotlines spill, amorphously, in several directions, none exactly wrong and none exactly right. Audiences are forced to view each level dynamically. But shortly after "Indigo" was launched, Cage began to receive complaints about the game's aesthetic realism, which one reviewer labeled, disparagingly, "atmospheric, but not stellar." The most significant flaw was the face of Lucas Kane, the hero of "Indigo Prophecy." In some scenes, Kane's face looked wooden; in others, the muscles around his mouth moved too much, giving him an eerie, reptilian quality.

"Human emotions are expressed pretty much by every single part of our body," Cage says. "Think also about what emotions can be expressed with hands. Your skin says as much -- you can sweat if you are nervous, be blank if you are scared, blush if you are confused, etc. "None of this is spectacular in itself, but all these little signs contribute to the expression of emotion, and we are used to decoding them unconsciously."

When a video game designer fails to accurately encode those expressions, the results can be disastrous. Consider "Grand Theft Auto IV," the latest installment of the bestselling franchise. In the game's trailer, the hero, Niko, speeds across a gritty cityscape, waving a chrome pistol out the window of his car. He shouts at pimps, whispers to gangsters, and sprints down a dark alley. Later, he pilots a speedboat down a crowded canal, while gunfire rains overhead.

For the most part, the rendering in the trailer is superb: the water ripples, the mist floats, the muzzle flashes are bright and sharp. Niko, meanwhile, is imbued with a robotic stoicism. His eyebrows are thick and unresponsive, and when he frowns, his forehead turns rubbery, like a twisted plastic bag.

"A face is extremely complex in its muscular structure, and if anything is not perfectly right, you will immediately notice," Cage explains. A still life, in other words, is much easier to create than a portrait. We look at faces every day; we have taught ourselves to look for signs of sadness, or happiness, or deceit. It is a survival mechanism. If a digitalized face appears "broken" or rubbery, it destroys our sense of the illusion.

If a developer could create a photorealistic face -- and make it move, and respond like the flesh-and-blood original -- gamers would become doubly immersed in the game. So a few months after the release of "Indigo Prophecy," Cage set the team at Quantic to work. Their first step was to experiment with a combination of synthetic animation and motion-capture technology, which records the movements of an actor.

From there, producers developed specific rendering systems to produce, in Cage's formulation, "a skin that does not look synthetic but has enough details to look real"; "reflections in the eye that give the feeling that the eyes are wet"; and a clammy moistness in the character's brow.

As a test, Cage wrote a teleplay for a short film, "The Casting." The plot was simple: Mary Smith, a 24-year-old actress, is trying out for a part in an upcoming project. She is nervous, she explains to the casting agent, who sits somewhere out of view -- there have been other auditions, and even a few big parts, but "there's always something wrong with me."

As Smith begins to read her lines, she grows more confident and slips quietly into character. Her lover is cheating on her, Smith says. She knows this because she followed him to a hotel where she met the girl: "Then my whole world falls apart." As she speaks, Smith quietly produces a pistol and presses it to her temple. She doesn't shoot; the bullet, she explains with a fierce intensity, is intended for someone else.

"The Casting" debuted at the E3 gaming summit in Los Angeles, on May 16, 2006. It was greeted, instantly, as a resounding success -- an example of what could be done with digital technology. Still, what Quantic Dream had produced was not a game. There was no interactivity to speak of, except the Play button. For months after the debut, Cage and his team remained relatively quiet about how they intended to use the technology, and the industry chatter subsided.

Then, in December 2007, in a widely read interview, Quantic executive Guillaume de Fondaumière announced that the studio had signed an exclusive deal with Sony to develop a game for the PlayStation 3. The game, he said, was called "Heavy Rain," and it would feature "hundreds" of realistic characters, which would each be rendered with the same technology used to "create" Mary Smith.

De Fondaumière added that the "uncanny" effect that had hobbled scores of otherwise perfect games was a relic of the past. Quantic's new system had been maximized to pick up "very, very small motions, which means we can not only capture full body movement, but also facial movement and expressions."

Undistracted by rubbery digital faces, gamers would finally experience total emotional immersion.

The idea that a gap in realism triggers mental unease was first conceived by a German psychologist named Ernst Jentsch. In 1906, Jentsch published "On the Psychology of the Uncanny," where he theorized that cognition is divided between two poles: the "new/foreign/hostile" and the "old/known/familiar." If the brain cannot immediately associate an object or experience with one or the other, Jentsch wrote, it becomes stuck somewhere in between, mired in doubt that "only makes itself felt obscurely in one's consciousness."

Imagine, for instance, that you are standing in the middle of a dimly lit room, surrounded by wax manikins. As Jentsch notes, your imagination will probably begin to swell. Is one of those dolls actually a human? What if it begins to move? Will I scream? (Ridley Scott plied a similar fear to great effect in "Blade Runner.") Jentsch suggested that anxiety is "repeatedly and automatically aroused anew when one looks again and perceives finer details."

Jentsch's theory lay essentially dormant for the next half century (it is referenced, tangentially, in a series of essays by Freud) until 1970, when Masahiro Mori penned the influential essay "The Uncanny Valley." Mori, a Japanese roboticist, was fascinated by the emotional and psychic reactions humans have to "humanoids." Famously, Mori noted that "human beings themselves lie at the final goal of robotics," but the closer roboticists came to simulating a realistic face, the more noticeably a human would recoil from its appearance.

For instance, what if a designer tacks a plastic mask to the front of a robot? That increases the "familiarity," and a viewer might begin to anthropomorphize its movements. This humanizing effect is subtle, and reassuring; one is cognizant that the machine is still a machine. Conversely, when that same viewer is presented with an aesthetically accurate "humanoid," the prevailing mood often turns to dread.

Mori sketched a scenario where a woman walks into a dark room and shakes a rubbery hand. What if she doesn't know, initially, that the hand belongs to a robot? What if it feels too cold or it lacks the moisture of human skin? Her cognitive reaction, Mori wrote, will most closely resemble "a kind of horror." The hand is not transparently mechanical; its biological identity is ambiguous; it occupies that gray ground between Jentsch's "new/foreign/hostile" and "old/known/familiar"; the woman wonders if it is humanoid or human, and in wondering, she destroys the illusion the roboticist worked so hard to create.

The import of the "uncanny valley" for robotics was clear. Until the technology becomes more advanced, Mori suggested, scientists should concentrate on a "safe familiarity by a nonhumanlike design."

The alternatives would only scare people to high hell.

In the past decade, as computer graphics have become more advanced, the "uncanny valley" effect has been felt most widely by video game designers, who struggle to render, say, the simple movement of a human jaw with the same complexity as a tropical jungle.

One way to hop that gap is by improving motion-capture technology. Take "Tiger Woods PGA Tour 08," the bestselling golf game, which was developed by Electronic Arts. To render the movements of the title character, a production team at EA filmed Woods from hundreds of angles; that footage was graphically textured and woven back into the game. The result resembles footage from the Golf Channel.

"In the past, you saw cartoon heroes," says Rich Diamant, the lead character artist for "Uncharted: Drake's Fortune," a PlayStation 3 game released last fall. "Now people want to see something 'real.' They want total realism. And getting faces and emotions right is a big part of that."

Diamant and the rest of the team at Naughty Dog, a Santa Monica-based production company, based much of "Uncharted" on the movements of a pair of actors hired for the game's two leading roles. Using mapping technology, they concentrated on "muscle sliding over bone, and cheeks that move up and down, naturally," Diamant said. "When you smile, it's not just the corner of your mouth," he added. "It's a full joint system, involving most of your face, including your eyes."

The effect is extremely realistic: the T-shirt worn by Nate Drake, the story's hero, gets dirtier and sweatier as the game progresses. His brow wrinkles when he is presented with a puzzle, and his eyes narrow in concentration when the bullets begin to fly.

Part of the reason motion-capture technology works so well in "Drake's Fortune" is that the game is exceptionally linear. Players are allowed some flexibility in terms of where they can travel; mostly, though, they're operating within very specific boundaries. I must walk into a certain area to trigger a particular event and move the story forward. If I don't, the character merely stands in place, looking puzzled.

Greg Rinaldi, a producer for "Tiger Woods PGA Tour 08," told me motion capture is an effective solution for some games. It can also be extremely time intensive. In the case of "Tiger Woods," the company had to spend days filming Woods, and "whatever we capture is what we get," Rinaldi says. "They can't edit the footage," meaning that the virtual Tiger is operating within a very specific set of restrictions.

This isn't a fatal flaw for "PGA Tour." The game has a limited number of golf courses, and compared to the vastness of an epic like "Grand Theft Auto," it poses few emotional quandaries for Woods. (Save a poorly struck shank into the bunker.)

But gaming, in recent years, has begun to trend toward the open-ended -- audiences want "sandbox" worlds, where anything can be built and everything is interactive. In the online game "Second Life," for example, my avatar is fully customizable, from his shoes to his hair to his neon-blue dog.

Players in "Second Life" don't care to move simply from A to B, and they don't expect a "game over" screen. Instead, they want to poke at the edges of their world and wander freely from island to island and house to house. They want, in other words, something like reality.

In the real world, we are cognizant of an array of stimuli -- the feel of the wind on our face, or the electric whine of the television in the next room -- and we register responses to those stimuli with our faces. I might shiver at a cold breeze or frown at the volume of the television set.

But the digitalized characters in today's video games often become emotionally unresponsive. One problem is the "dead spot" -- an unscripted moment where a character will suddenly appear distinctly mechanical. He might bump into a wall or stand stony-faced as a sports car explodes behind his head. In those situations, motion capture -- in and of itself -- is not a viable solution.

Developers, says Glenn Entis, "will say, 'Let's just make this game photorealistic,' without thinking about what that means." Entis, the chief visual officer for Electronic Arts, made headlines last year when he told attendees of the Siggraph convention that the current trend of "just adding polygons" was extremely problematic. Designers, he suggested, need to concentrate on the less sexy aspects of design, including gait, head-to-torso angles, or the way a character waves his hands when he gets angry.

"We need to be working on how to communicate energy, how to communicate speed, and how to communicate a dangerous environment," Entis says. "These are areas that have to be heavily developed," alongside the graphic technology.

British producer Peter Molyneux has worked to refine a similar dynamism in his games. True realism, he says, will be achieved via a complex equation that combines photorealism ("realistic hair modeling," for instance) and an improved situational awareness.

Molyneux, who heads a production studio called Lionhead, is best known for his work on "Fable," a complex adventure game. Released in 2004, "Fable" was among the first titles to introduce moral ambiguity into a virtual world: Players were free to save hapless villagers or to slay them; they could also ink up their arms with a mass of tattoos and spend the afternoon getting drunk. The gameplay model duly shifted as the hero grew more evil or more respected. Commit too many atrocities, and your avatar trailed a horde of flies in his wake.

This spring, Lionhead will release "Fable 2," which Molyneux calls "an emotional drama." While he says his production team could have created "more horrific moments" or "action moments," they decided to concentrate on more subtle interactions between the main character and his peers.

It's an important distinction. As Entis told me, gaming is careening toward a "particularly fruitful" intersection of technological advances. Processing power is improving fast; so is artificial intelligence, along with graphic fidelity. But the most realistic games, he says, will rely less on motion capture than on the more subtle science of emotional interaction.

"Hopefully," Peter Molyneux says, "'Fable 2' will make you "care more about the people that love you in the game -- and if you do, then we will have conquered the uncanny valley."

By Matt Shaer
