Scientists Develop Child-like Synthetic Voice So Children Don’t Sound Like Stephen Hawking

Sat, Mar 10, 2012

Science truly does amazing things when combined with technology. Researchers at the Norwegian University of Science and Technology are extrapolating child voices from a few key phrases for synthetic voice devices. Their goal: to ensure that children around the world using assistive speech devices no longer sound like Stephen Hawking.

Assisted-living software companies Lingit and Media LT are collaborating on a device based on the university’s research, putting its findings to practical use.

The Norwegian researchers synthesized a child’s voice by first building a master adult voice: they combined recordings of many adult speakers reciting thousands of phrases into a repository of words and sounds. They then built a much smaller library from a single child reciting the key words and phrases, in Norwegian of course, that are most essential to the language. Using the adult library as a reference, they mapped the child’s sounds onto the adult words and extrapolated from there to re-create the entire adult repository in a child’s voice.
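The idea of learning from a few child recordings and filling in the rest from the adult library can be illustrated with a toy sketch. It is not the researchers' actual method, which the article does not detail. It assumes each word is summarised by a single pitch value in hertz, and every name and number below is made up for illustration.

```python
# Toy illustration only: each word is reduced to one hypothetical pitch value (Hz).
adult_library = {"hei": 120.0, "takk": 110.0, "ja": 125.0, "nei": 115.0, "vann": 118.0}

# The child records only a few key words (hypothetical values).
child_samples = {"hei": 264.0, "takk": 242.0}


def estimate_scale(adult, child):
    """Average child-to-adult pitch ratio over the words both have recorded."""
    shared = [word for word in child if word in adult]
    if not shared:
        raise ValueError("no overlapping words to learn from")
    return sum(child[w] / adult[w] for w in shared) / len(shared)


def extrapolate_child_library(adult, child):
    """Keep real child recordings; fill in the rest by scaling the adult values."""
    scale = estimate_scale(adult, child)
    return {word: child.get(word, pitch * scale) for word, pitch in adult.items()}


child_library = extrapolate_child_library(adult_library, child_samples)
```

A real system would work on far richer acoustic features than one number per word, but the shape of the process is the same: a small child sample calibrates a transformation that is then applied across the whole adult repository.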

“The result sounds rather like a child with unusual elocution skills, but it’s still much better than the voice of an adult,” says Dr. Torbjørn Nordgård of the software company Lingit.

The exciting part is not only the child-like voice for children’s assisted-living devices, but also the reverse application: voice recognition software for children. Current voice recognition software is based on adult voices; adding child voices would broaden the repertoire of speech patterns it can handle.

That could prove important in the coming years, as ever-younger children are given iPhones and other voice recognition devices. Perhaps recognition of young voices will improve enough that homework can be done entirely by dictation, a development that could kill typing skills entirely, much as typing killed calligraphy skills.

The child-voice algorithms being created in Norway are impressive; among other things, they account for children’s shorter vocal tracts and the effect this has on frequency distribution and speech energy. The program’s error rate, however, remains high at 50 to 70 percent. That is still a vast improvement over current adult-oriented software, and because the research is in its infancy, it should improve considerably over the next few years. Next stop? Female voices perhaps. Or other languages.
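Why a shorter vocal tract shifts the frequency distribution can be seen in a standard textbook simplification, which is not necessarily what the Norwegian algorithms use: treat the tract as a uniform tube closed at one end, whose resonances fall at (2n - 1) * c / (4 * L) for speed of sound c and tube length L. The tract lengths below are rough assumed values, not figures from the research.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # in air at roughly room temperature


def tube_formants(tract_length_m, count=3):
    """First `count` resonances (Hz) of a uniform tube closed at one end."""
    if tract_length_m <= 0:
        raise ValueError("tract length must be positive")
    return [(2 * n - 1) * SPEED_OF_SOUND_M_PER_S / (4 * tract_length_m)
            for n in range(1, count + 1)]


ADULT_TRACT_M = 0.17  # assumed, roughly typical for an adult male
CHILD_TRACT_M = 0.12  # assumed value for a young child

adult = tube_formants(ADULT_TRACT_M)
child = tube_formants(CHILD_TRACT_M)

# Print the first three resonances side by side.
for n, (a, c) in enumerate(zip(adult, child), start=1):
    print(f"F{n}: adult {a:.0f} Hz, child {c:.0f} Hz")
```

In this simplified model every resonance rises by the same factor, the ratio of the two tract lengths, which hints at why software tuned on adult voices struggles with children.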

Wonder if Apple will beat them to the punch with its Siri application. Just kidding – Siri’s not that good.