Study Group 16 will start work in a new area, generic sound activity detection
(GSAD).
Voice activity detection (VAD) is widely used in telecommunications networks as
a means of differentiating between wanted and unwanted in-band audio signals,
for example to obtain trunking efficiency in circuit multiplication equipment
and to ensure correct operation of echo control and other signal enhancement
devices.
The proposal for generic sound activity detection (GSAD) is motivated by two
problems.
1. With rapid changes in the telecommunication network environment, more and
more multimedia services are being provided. Although the network is evolving
from a voice network to a multimedia network, most VAD algorithms are still
designed mainly to handle voice signals and cannot work properly in the
presence of rich audio signals, which include voice, music, background
environmental noise, information tones, etc.
2. Historically, VAD algorithms have been developed separately for individual
network elements and applications, and there are currently numerous VAD
algorithms. However, they are based on different principles, which makes it
difficult to provide common performance enhancements across all VADs.
Therefore it is seen as beneficial to develop a generic sound (rather than
voice) activity detector, which can be applied across a range of applications.
The predicted benefits of a standardised GSAD are:
· Enhanced performance in dealing with new types of in-band audio signals
· Reduced development time and cost for new equipment requiring sound activity
detection, e.g. codecs, circuit multiplication equipment, echo control, signal
enhancement devices, VoIP gateways, terminal adapters, etc.
· Opportunity for use in existing speech and audio coders which do not include
VAD.