Gives Animated Computer Graphic Characters Synthesized Speech with Emotions
** Assists in Making Multimedia Content Familiar and Appealing **
NTT has developed a new-generation Internet interface, WebMessenger, which gives animated CG characters emotional expression through synthesized speech whose prosody can be flexibly adjusted. WebMessenger is designed to work with popular WWW browsers.
The spread of the Internet has introduced the age of information sharing. Individuals as well as organizations can now easily set up homepages, create content, and send information over the net.
Yet text is still the main medium for sending that information. Although computer-generated animation is becoming popular, moving pictures play back slowly and synthesized voices offer limited variation. The Internet has a long way to go before it can provide multimedia content that is familiar and attractive.
WebMessenger lets you edit the prosodic qualities of synthesized speech so that it can express emotions realistically. It also allows you to link synthesized speech to animations of CG characters. Furthermore, WebMessenger recreates animated sequences without sending the raw images, so far less data is actually transmitted. The user can therefore enjoy animated CG images on an Internet terminal without the irritation of slow or jerky replay.
This new technology is an outgrowth of the wealth of software developed by NTT Cyber Space Laboratories to assist in the creation of multimedia content. Such software includes a system for synthesizing speech from text data, as well as Sesign98, a synthesized speech design tool. To those fundamental technologies, we have added a new mechanism for precisely but flexibly synchronizing speech to moving pictures.
The new technology is ideal for such applications as a friendly interface for tele-education systems, and a speaking agent that can read out the text of a homepage for easier understanding.
We will exhibit this new technology at ICCC'99 EXPO (*) from September 14th through 16th.
<System Configuration> (See the System Configuration Diagram.)
Sesign98 offers two modes: automatic conversion and manual adjustment. The latter allows the creator to manually adjust the intonation and speed of the speech to taste. Our newly developed content creation tool, WebMessenger-Creator, creates the content, including the animated pictures and the picture-speech linkage. The speech data can be combined with any animated CG images and adjusted as desired. (The current system has ten CG characters, each with fifty to sixty expression patterns.)
Each set of synthetic unit data used for a piece of synthesized speech is represented by a synthesis unit index, and each animated CG image has a moving picture index. Once multimedia content is created, only its index information is attached to an HTML document and sent over the net. This way, speech and moving pictures can be transmitted using very small amounts of information.
The WebMessenger-Player uses this index information to obtain the speech and moving picture data. For playback, the user's PC must therefore hold the WebMessenger data set (including the synthetic unit data and the moving pictures). The speech and pictures are reproduced with good fidelity.
# The "synthetic unit data" is a set of phonemes that is used in synthesizing speech. As the speech synthesis engine grows more sophisticated, this data will be updated to yield more expressive synthesized speech.
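The index-based transmission described above can be illustrated with a minimal Python sketch. It is not NTT's implementation: the payload format (`u=...;p=...`), the names `IndexedContent`, `encode`, `decode`, and `play`, and the sample data are all hypothetical, chosen only to show how a sender might ship index information while the receiver resolves it against a data set it already holds.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical data set held on the user's PC (assumption for illustration).
# Real synthetic unit data and moving pictures would be far larger.
SYNTHESIS_UNITS = {0: b"\x01\x02", 1: b"\x03\x04\x05", 2: b"\x06"}
MOVING_PICTURES = {0: "smile", 1: "nod"}


@dataclass(frozen=True)
class IndexedContent:
    """Content expressed purely as indices into the shared data set."""
    unit_indices: tuple[int, ...]
    picture_indices: tuple[int, ...]


def encode(content: IndexedContent) -> str:
    """Serialize the index information as a short string (assumed format)."""
    units = ",".join(str(i) for i in content.unit_indices)
    pictures = ",".join(str(i) for i in content.picture_indices)
    return f"u={units};p={pictures}"


def _parse_indices(field: str) -> tuple[int, ...]:
    # Skip empty pieces so an empty index list round-trips cleanly.
    return tuple(int(piece) for piece in field.split(",") if piece)


def decode(payload: str) -> IndexedContent:
    """Parse the string produced by encode() back into index information."""
    fields = dict(part.split("=", 1) for part in payload.split(";"))
    return IndexedContent(_parse_indices(fields["u"]), _parse_indices(fields["p"]))


def play(content: IndexedContent,
         units: dict[int, bytes],
         pictures: dict[int, str]) -> tuple[bytes, list[str]]:
    """Resolve indices against the local data set, as a player would."""
    speech = b"".join(units[i] for i in content.unit_indices)
    frames = [pictures[i] for i in content.picture_indices]
    return speech, frames


content = IndexedContent(unit_indices=(0, 1, 2, 1), picture_indices=(1, 0))
payload = encode(content)  # only this short string crosses the network
speech, frames = play(decode(payload), SYNTHESIS_UNITS, MOVING_PICTURES)
```

In this sketch the payload stays a few bytes long no matter how large the speech and picture data are, which is the point of sending indices only. The trade-off is the one noted above: a receiver without the matching data set cannot play the content.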
<Technological Key Points>
Since WebMessenger is designed to provide a high degree of creative freedom, there are a great number of possible uses. In particular, it is ideally suited for a closed network of users such as a membership system, because the sender and the receiver share the same programs and data. Shown below are two examples of WebMessenger as used in education.
*ICCC'99 EXPO: Sponsored by the International Council for Computer Communication, the International Conference on Computer Communication has been held every other year since the first conference in 1972, and is primarily for communications operators. This year, Japan will host the conference for the first time since 1978. ICCC'99 EXPO is an exposition that accompanies the conference. Its theme is "Various Developments Based on Digital Integration of Computers, Communications, Broadcasting, and Consumer Electronics."
Date of the exposition: 10:00 am to 5:00 pm, Tuesday, September 14th through Thursday, September 16th, 1999
Site: Exhibition Hall, Tokyo International Forum (For further details, visit our Web site at http://www.convention.co.jp/iccc_j/ .)
- System Configuration
For further information, please contact:
NTT NEWS RELEASE