DISCOVER Vol. 22 No. 2 (February 2001)

Can You See What I'm Saying?
By Fenella Saunders

People often hear each other's voices without ever seeing the faces they belong to. "Nowadays we are talking away on the phone without meeting people," says Seung-Jae Moon, a linguist at Ajou University in Korea. And from business conference calls to tawdry chat lines, people often imagine they would recognize the speaker if they saw him or her. Moon found that, under certain conditions, they're actually right.

Moon decided to see just how closely those mental pictures match up with reality, and whether they are related to how people speak rather than to what they are saying. He recorded 16 Koreans, half men and half women, reading the same passage, and took a full-body photo and a head shot of each speaker. Then he played the tapes for 361 Koreans and 173 Americans who did not speak Korean and asked his subjects to match each voice to a picture. The Korean participants viewing full-body photos were quite perceptive: A majority linked 6 of the 8 women to the correct voice and did so for 5 of the 8 men. With the Korean group shown only faces, accuracy plummeted, but more than 20 percent of the subjects selected the same incorrect picture. The Americans demonstrated no accuracy in matching the foreign voices to photos, but they too were consistent in their errors. That disconnect reveals conflicting ideals of physical and vocal beauty. Moon asked people to pick a favorite face and voice. Seventy percent of the Koreans picked one voice, but there was no agreement on a face. Americans didn't agree on either count. And over 65 percent of both Koreans and Americans did not match their favorite face with their favorite voice.

Moon hopes to use software to break voices into components like pitch and hoarseness to narrow down which elements trigger certain mental pictures. "If we can map which characteristics of the voice triggers what kind of image, and it doesn't matter whether that image is the right or wrong one of the actual speaker, then we can create an image through voice," he says. That capacity could help create computer-synthesized voices tailored to conjure up specific associations: audio books for children that inspire motherly visages, or warning alerts that bring to mind a stern police officer.


Moon presented a talk titled "Is what you hear what you see, even in a foreign language?" at the 140th Meeting of the Acoustical Society of America, December 3-8, 2000, Newport Beach, CA. See web2/asa/ abstracts/ search.dec00/ asa135.html for an abstract.

Moon's web page is ~sjmoon

© Copyright 2000 The Walt Disney Company.