Distributed Speech Recognition

Introduction

ETSI publishes algorithms for distributed speech recognition (DSR) that enable access to services and communication systems without the need to type or use a keypad. The basic algorithm (specified in ES 201 108) enables speech to be sent over low-quality links such as mobile radio and converted to text for interacting with automated systems. Because these links degrade quality, a certain amount of pre-processing is performed at the front end in the mobile terminal, and the results are sent across the link for subsequent processing in the network.

The value of DSR is that it provides substantial recognition performance advantages compared to a conventional mobile voice channel, where both the codec compression and channel errors degrade performance. It also enables new mobile multimodal interfaces by allowing the features to be sent simultaneously with other information on a single mobile data channel such as GPRS.

The ETSI Technical Committee for Speech Processing, Transmission and Quality Aspects, through its Working Group AURORA (STQ AURORA), develops and standardizes the DSR algorithms.

The Aurora Concept

The basic distributed speech recognition (DSR) algorithm is in ES 201 108. STQ AURORA has also published two important DSR Standards as extensions which provide the front-end feature extraction suitable for use by speech recognizers.

ES 202 211 is the extension to the DSR mel-cepstrum standard (ES 201 108) and ES 202 212 is the extension to the DSR Advanced Front-end (ES 202 050).

The extensions make use of the same recognition features and give additional functionality. An extra 800 bps of data, providing pitch and voicing class, are sent alongside the 4800 bps for the features.
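The bit-rate figures above can be combined in a short worked example. This is only an arithmetic sketch based on the two numbers quoted in the text (4800 bps for the features, 800 bps for pitch and voicing class); the function and constant names are hypothetical and are not taken from any ETSI standard or reference code.

```python
# Bit rates quoted in the text, in bits per second.
FEATURE_BPS = 4800    # recognition features
EXTENSION_BPS = 800   # pitch and voicing class added by the extensions


def total_bitrate(feature_bps: int = FEATURE_BPS,
                  extension_bps: int = EXTENSION_BPS) -> int:
    """Total rate of an extended DSR stream: features plus extension data."""
    return feature_bps + extension_bps


def extension_overhead(feature_bps: int = FEATURE_BPS,
                       extension_bps: int = EXTENSION_BPS) -> float:
    """Extension data as a fraction of the feature bit rate."""
    return extension_bps / feature_bps


if __name__ == "__main__":
    print(total_bitrate())                  # 5600
    print(f"{extension_overhead():.1%}")    # 16.7%
```

Under these figures, the extended front-ends send 5600 bps in total, so the pitch and voicing data add roughly one sixth on top of the feature stream.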

The publication of these DSR extensions completes standardization of the full DSR capability.

The DSR Extended Advanced Front-end (ES 202 212) is being integrated by 3GPP working group SA4 as the SES (Speech Enabled Services) codec. DSR should have widespread application in future mobile systems and will also be usable over the Internet.

The following is a list of recently published and frequently downloaded standards.

Standard No.    Standard Title
ES 201 108      Distributed speech recognition; Front-end feature extraction algorithm; Compression algorithms
ES 202 211      Distributed speech recognition; Extended front-end feature extraction algorithm; Compression algorithms; Back-end speech reconstruction algorithm
ES 202 050      Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms
ES 202 212      Distributed speech recognition; Extended advanced front-end feature extraction algorithm; Compression algorithms; Back-end speech reconstruction algorithm
TS 126 243      Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); ANSI C code for the fixed-point distributed speech recognition extended advanced front-end