| Background and justification

Speech recognition systems are being deployed in commercial applications
today, where the whole speech recognition system is typically implemented in a
central place to which all speech signals are routed.

In addition to speech recognition, speaker verification plays an important
role as a biometric verification mechanism, as recognized in the IP Networking
and Mediacom 2004 Workshop (Geneva, April 24-27, 2001).

Speech recognition and speaker verification systems need to perform a set of
operations, such as signal pre-processing, front-end extraction of features or
parameters, back-end processing, and higher-layer control according to the
constraints of the application.

With voice communication over packet-based digital networks, such as
Voice-over-IP, becoming popular, elements sitting on the edge of the packet
network are becoming more capable of accomplishing complex signal processing
tasks, such as speech encoding and decoding. With this evolution, there is an
opportunity to enhance the performance and efficiency of speech recognition and
speaker verification systems by moving some of the basic speech signal
processing tasks to the edge of the packet network.

Components of a speech recognition or speaker verification system can be
distributed between an edge element (such as a router, gateway or IP telephone)
and a remote application server in a flexible manner. For example, the front-end
may be implemented on a gateway and the back-end on an application server. In
this example, a gateway processor would perform pre-processing and
feature-extraction for speech recognition or speaker verification purposes. The
features would be compressed, packetized and sent to a speech
recognition/speaker verification application server. In turn, the server would
perform the back-end processing and take the appropriate action. Alternatively,
a portion of the front-end, such as the speech end-pointer, may be implemented on
a gateway, with the feature extraction and back-end implemented on a
server.

One of the key issues to be resolved if Distributed Speech Recognition (DSR)
and Distributed Speaker Verification (DSV) are to become successful is
interoperability between system components at the edge of the packet network and
those on the server, where the edge element and server are produced by different
vendors. This is where standardization is critical.

This question will study which standards for DSR and DSV should be adopted
for use over packet-based digital networks, such as IP or ATM networks.

Study items
  
  Develop the overall system architecture for Distributed Speech
  Recognition (DSR) and Distributed Speaker Verification (DSV) systems.

  Determine which sets of features are appropriate for DSR and DSV
  purposes, taking into consideration that the back-end processing should be
  left as open as possible to allow for improvements in the technologies.

  Study aspects of the front-end processing and feature extraction
  that should be standardized to ensure interoperability between front-end and
  back-end components of DSR and DSV systems.

  Define the signalling requirements for communication among
  front-end, back-end, and any intermediate processing elements of DSR and DSV
  systems, and develop a mechanism for negotiating capabilities between these
  elements and selecting a mode of operation.

  Define the protocol requirements for transport of the extracted
  information over packet-based digital networks, and either identify an
  existing transport protocol or develop a new one.

  Consider interoperability issues with existing systems (examples:
  ETSI Aurora and proprietary systems).

Specific tasks with expected time-frame of completion

This question will study the issues identified above and produce relevant
standards for DSR and DSV systems: late 2002.

Relationships

  Other relevant Questions within Study Group 16 (including Q.B, Q.5,
  Q.2, and Q.3)

  ITU-T Study Group 12 on end-to-end performance issues

  ITU-T Study Group 15 on transmission equipment issues

  ETSI Aurora and TIPHON

  Committee T1

  IETF

  3GPP, 3GPP-2 |
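
As a rough illustration of the gateway and application-server split described under Background and justification, the following Python sketch shows one possible shape for the processing chain: an edge element extracts features from speech samples, compresses and packetizes them, and a server recovers them for back-end processing. Every name, the frame length, the single log-energy feature, the one-byte quantization and the three-byte packet header are assumptions made purely for illustration; none of them is taken from ETSI Aurora or from any other existing or proposed standard.

```python
import math
import struct

# Assumed framing: 80 samples per frame (10 ms at 8 kHz). Hypothetical value.
FRAME_LEN = 80

# Hypothetical packet header: 16-bit sequence number, 8-bit feature count,
# both in network byte order. Not a real DSR payload format.
HEADER = struct.Struct("!HB")


def extract_features(samples):
    """Edge element: compute one log-energy value per complete frame.

    This stands in for a real front-end, which would produce a richer
    feature vector per frame.
    """
    feats = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        frame = samples[start:start + FRAME_LEN]
        energy = sum(s * s for s in frame) / FRAME_LEN
        feats.append(math.log10(energy + 1.0))
    return feats


def compress(feats, max_value=10.0):
    """Edge element: quantize each feature to a single byte (0..255)."""
    return bytes(min(255, max(0, round(f / max_value * 255))) for f in feats)


def packetize(seq, payload):
    """Edge element: prepend the header (at most 255 features per packet)."""
    return HEADER.pack(seq, len(payload)) + payload


def depacketize(packet):
    """Server: recover the sequence number and the quantized features."""
    seq, count = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + count]
    return seq, list(payload)


if __name__ == "__main__":
    # One silent frame followed by one frame of constant amplitude.
    samples = [0] * FRAME_LEN + [1000] * FRAME_LEN
    packet = packetize(7, compress(extract_features(samples)))
    print(depacketize(packet))
```

In this sketch only the packet layout and the meaning of the quantized values need to be agreed between the two sides, which is the kind of interface a standard would have to fix. The back-end that consumes the output of `depacketize` is left entirely open.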