International Telecommunication Union   ITU
Question 8/16 – Generic sound activity detection
(Continuation of Question 8/16)

Motivation

Voice activity detection (VAD) is widely used in telecommunications networks as a means of differentiating between wanted and unwanted in-band audio signals, for example, to obtain trunking efficiency in circuit multiplication equipment or to ensure correct operation of echo control and other signal enhancement devices.

The proposal for generic sound activity detection (GSAD) is motivated by two problems:

  1. With rapid changes in the telecommunication network environment, more and more multimedia services are being provided. Although the network is evolving from a voice to a multimedia network, most VAD algorithms are still mainly designed to handle voice signals and cannot work properly in the presence of rich audio signals, which include voice, music, background environmental noise, information tones, etc.
  2. Historically, VAD algorithms have been developed separately for individual network elements and applications, and there are currently numerous VAD algorithms. However, they are based on different principles, which makes it difficult to provide common performance enhancements across all VADs.

Therefore, it is beneficial to develop a generic sound (rather than voice) activity detector that can be applied across a range of applications. The benefits of a standardised GSAD are:

  • Enhanced performance in dealing with new types of in-band audio signals
  • Reduced development time and cost for new equipment requiring sound activity detection, e.g. codecs, circuit multiplication equipment, echo control, signal enhancement devices, VoIP gateways, terminal adapters, etc.
  • Opportunity for use in existing speech and audio coders which do not include VAD

Study items

Study items to be considered include, but are not limited to:

  • Definition and classification of applications and associated performance requirements for generic sound activity detection
  • Definition of algorithm(s) suitable for generic sound activity detection meeting the applications and performance requirements
  • Definition of the test conditions and evaluation procedures to be applied in selecting between candidate algorithms on the basis of objective and subjective performance, in conjunction with SG 12
  • Selection and specification of procedures to be used in verifying the implementation of the selected algorithm or algorithms
  • Considerations on how to help measure and mitigate climate change

Tasks

Tasks include, but are not limited to:

  • Develop Terms of Reference for GSAD algorithms for different applications
  • Assist SG 12 in developing new Recommendations on testing methodologies
  • Solicit proposals and conduct selection test(s) for candidate algorithms to meet these Terms of Reference
  • Develop new Recommendation(s) based on the outcome of the(se) selection test(s)

An up-to-date status of work under this Question is found in the SG 16 work programme (http://itu.int/ITU-T/workprog/wp_search.aspx?isn_sg=554).

 

Relationships

Recommendations:

  • G.700-series speech and audio coding Recommendations
  • G.76X-series circuit multiplication Recommendations
  • G.799.X-series voice over IP gateway Recommendations
  • G.16X-series speech enhancement Recommendations
  • P.800-series methods for objective and subjective assessment of quality Recommendations
  • Q.115.x-series protocols for the control of signal processing network elements and functions

Questions:

  • 7, 9, 10/16 on speech and audio coding
  • 14, 15, 16, 18/16 on network signal processing

Study Groups:

  • ITU-T SG 2 to identify other potential user applications
  • ITU-T SG 9 on applications in digital cable systems and IPTV
  • ITU-T SG 11 on signalling requirements and protocols
  • ITU-T SG 12 on speech and audio quality evaluation of specified algorithms
  • ITU-T SG 13 on NGN and on speech and audio coding in IMT
  • ITU-R SG 5 to ensure compatibility with mobile transmission system constraints

Other Bodies:

  • 3GPP, 3GPP2
  • ETSI TISPAN
  • IETF
  • TIA

 

Copyright © ITU 2008 All Rights Reserved
Contact for this page : TSB EDH
Updated : 2008-12-05