ITU-T Study Group 16 - Question 8/16 (Study Period 2009-2012)

Question 8/16 – Generic sound activity detection

(Continuation of Question 8/16)

Motivation

Voice activity detection (VAD) is widely used in telecommunications networks as a means of differentiating between wanted and unwanted in-band audio signals, for example, to obtain trunking efficiency in circuit multiplication equipment or to ensure correct operation of echo control and other signal enhancement devices.

The proposal for generic sound activity detection (GSAD) is motivated by two problems:

- With rapid changes in the telecommunication network environment, more and more multimedia services are being provided. Although the network is evolving from a voice network to a multimedia network, most VAD algorithms are still designed mainly to handle voice signals and cannot work properly in the presence of rich audio signals, which include voice, music, background environmental noise, information tones, etc.
- Historically, VAD algorithms have been developed separately for individual network elements and applications, and there are currently numerous VAD algorithms. However, they are based on different principles, which makes it difficult to provide common performance enhancements across all VADs.

Therefore, it is beneficial to develop a generic sound (rather than voice) activity detector that can be applied across a range of applications. The benefits of a standardised GSAD are:
- Enhanced performance to deal with new types of in-band audio signals
- Reduced development time and cost for new equipment requiring sound activity detection, e.g. codecs, circuit multiplication equipment, echo control, signal enhancement devices, VoIP gateways and terminal adapters
- Opportunity for use in existing speech and audio coders which do not include VAD

Study items

Study items to be considered include, but are not limited to:

- Definition and classification of applications and associated performance requirements for generic sound activity detection
- Definition of algorithm(s) suitable for generic sound activity detection that meet the application and performance requirements
- Definition of the test conditions and evaluation procedures to be applied in selecting between candidate algorithms on the basis of objective and subjective performance, in conjunction with SG 12
- Selection and specification of procedures to be used in verifying the implementation of the selected algorithm or algorithms
- Considerations on how to help measure and mitigate climate change

Tasks

Tasks include, but are not limited to:

- Develop Terms of Reference for GSAD algorithms for different applications
- Assist SG 12 in developing new Recommendations on testing methodologies
- Solicit proposals and conduct selection test(s) for candidate algorithms to meet these Terms of Reference
- Develop new Recommendation(s) based on the outcome of the(se) selection test(s)

An up-to-date status of work under this Question is found in the SG 16 work programme (http://itu.int/ITU-T/workprog/wp_search.aspx?isn_sg=554).

Relationships

Recommendations:

G.700-series speech and audio coding Recommendations

G.76X-series circuit multiplication Recommendations

G.799.X-series voice over IP gateway Recommendations

G.16X-series speech enhancement Recommendations

P.800-series methods for objective and subjective assessment of quality Recommendations

Q.115.x-series protocols for the control of signal processing network elements and functions

Questions:

7, 9, 10/16 on speech and audio coding

14, 15, 16, 18/16 on network signal processing

Study Groups:

ITU-T SG 2 to identify other potential user applications

ITU-T SG 9 on applications in digital cable systems and IPTV

ITU-T SG 11 on signalling requirements and protocols

ITU-T SG 12 on speech and audio quality evaluation of specified algorithms

ITU-T SG 13 on NGN and on speech and audio coding in IMT

ITU-R SG 5 to ensure compatibility with mobile transmission system constraints

Other Bodies:

3GPP, 3GPP2

ETSI TISPAN

IETF

TIA