Speech Recognition Conference Transcription System

Technological fields
Content, Application Technologies
Keywords
  • Speech recognition
  • Conference transcription
  • Automatic speech-to-text conversion
Laboratory organization
NTT Cyber Space Laboratories


Overview

The Speech Recognition Conference Transcription System supports the creation of conference transcripts using speech recognition technology. Compared with conventional manual transcription, it is more efficient: the system automatically converts conference speech to text using the latest speech recognition technology, and an editor then manually corrects the misrecognized portions. We are currently field-testing the system in actual conferences and plan to go live in 2010.

Features

  • The “VoiceRex*” speech recognition engine recognizes conference speech quickly and accurately, with a vocabulary of up to 10 million words
  • Transcripts can be edited in parallel with the conference, so editing can be finished within 30 minutes after the conference ends
  • Synchronized playback of text, speech, and video for quickly correcting misrecognized spots
  • Automatic segmentation by speaker for entering each speaker’s name
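
As a rough illustration of how synchronized playback and speaker segmentation could work on a timed transcript, the following Python sketch looks up the word being spoken at a given playback time and groups words into speaker turns. It is not the VoiceRex API: the `Word` record, the `word_at` and `speaker_turns` functions, and the speaker labels such as "S1" are all hypothetical names chosen for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with timing (hypothetical record layout)."""
    text: str
    start: float   # seconds from the start of the recording
    end: float
    speaker: str   # assumed segment label, e.g. "S1", before a name is entered


def word_at(words, t):
    """Return the index of the word spoken at playback time t, or None.

    Assumes `words` is sorted by start time and words do not overlap.
    """
    starts = [w.start for w in words]
    i = bisect_right(starts, t) - 1   # last word starting at or before t
    if i >= 0 and t < words[i].end:
        return i
    return None                       # t falls before the first word or in a pause


def speaker_turns(words):
    """Group consecutive words with the same speaker label into turns."""
    turns = []
    for w in words:
        if turns and turns[-1][0] == w.speaker:
            turns[-1][1].append(w.text)
        else:
            turns.append((w.speaker, [w.text]))
    return [(speaker, " ".join(texts)) for speaker, texts in turns]


# Made-up sample data for illustration only.
words = [
    Word("good", 0.0, 0.4, "S1"),
    Word("morning", 0.4, 1.0, "S1"),
    Word("thank", 1.5, 1.8, "S2"),
    Word("you", 1.8, 2.1, "S2"),
]
```

An editor built on this idea could call `word_at(words, t)` as the audio plays to highlight the current word, and use `speaker_turns(words)` to present one block per speaker where a name can be typed in.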

Application scenarios

  • Improves efficiency in creating text from speech, including conference transcripts
  • Practical recognition accuracy requires relatively clear speech and a quiet recording environment
  • Recognition accuracy can be improved by customizing the dictionary and language model for each conference
  • The system can be flexibly scaled from small conferences to large conferences

* “VoiceRex” is a registered trademark of Nippon Telegraph and Telephone Corporation.
