Question 15/12

Question 15/12 – Objective assessment of speech and sound transmission performance quality in networks

(Continuation of Question 15/12)

Motivation

The telecommunications industry is working to adopt more flexible infrastructure to control costs and facilitate the introduction of new services. Examples are next-generation IP networks (NGN) and 3G mobile networks, both of which exhibit time-varying performance. Measures that predict user experience are useful in monitoring and managing time-varying performance, and they help to facilitate the rollout, efficient operation and effective service management of such networks.

The industry is already benefiting from ITU-T Recommendations for objective speech quality assessment. Most of the techniques described in these Recommendations are signal-based and address listening-only contexts. Signal processing technology can be used to estimate the contribution of a number of factors affecting the transmission performance of the complete connection.

These methods are highly accurate, but they require an amount of memory and processing power that prevents their use in some situations.

However, the current recommendations do not cover some important needs of the industry.

The first need concerns conversational quality. Typical communications involve interactive, two-way conversations. IP and mobile networks can be particularly deleterious to interactive applications, including voice conversation, for example because of increased delay, which in turn increases the probability of double-talk and the perceptibility of echo.

There is therefore a need for a real-time, or near real-time, conversational speech quality assessment method. Such a method would go well beyond the current methods and could combine "conversational" impairments, such as level, echo and delay, with "listening quality" measures to provide an assessment of the overall (conversational) quality perceived by the user at either end of the connection.

It is envisaged that such a method would be developed collaboratively.
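
As a rough illustration of what combining a conversational impairment with a listening-quality term could look like, the sketch below subtracts impairments on a transmission rating (R) scale and maps the result to an estimated MOS using the R-to-MOS conversion given in Recommendation G.107. It is not the method envisaged by this Question. The function names, the default base rating of 93.2 and the choice to model only delay are assumptions made for the example, and the delay term is a published simplification (Cole and Rosenbluth) rather than the full G.107 formula. Level and echo are left out.

```python
def delay_impairment(one_way_delay_ms: float) -> float:
    """Simplified delay impairment on the R scale (assumption: the
    Cole-Rosenbluth approximation, not the full G.107 delay formula)."""
    d = one_way_delay_ms
    # Extra penalty only applies once the delay exceeds 177.3 ms.
    extra = 0.11 * (d - 177.3) if d > 177.3 else 0.0
    return 0.024 * d + extra


def r_to_mos(r: float) -> float:
    """Map a transmission rating R to an estimated MOS (G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def conversational_mos(one_way_delay_ms: float,
                       listening_impairment: float,
                       base_r: float = 93.2) -> float:
    """Hypothetical combination: subtract delay and listening-quality
    impairments from a base rating, then convert to MOS.

    listening_impairment is an R-scale penalty standing in for a
    listening-quality measure; base_r is an assumed default rating.
    """
    r = base_r - delay_impairment(one_way_delay_ms) - listening_impairment
    return r_to_mos(r)


if __name__ == "__main__":
    for delay in (50, 150, 300, 400):
        print(delay, round(conversational_mos(delay, listening_impairment=10.0), 2))
```

In this additive form, a connection with good listening quality can still score poorly if the delay is long, which is the interaction the conversational method is intended to capture.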

The following Recommendations, in force at the time of approval of this Question, fall under its responsibility:

P.56, P.561, P.562, P.564

Question

Study items to be considered include, but are not limited to:

What changes and/or improvements can be made to ITU-T Recommendations P.56, P.561, P.562 and P.564?
How can non-intrusive measurements at the IP layers be implemented and improved, for instance by taking into account new services or protocols or transmission layers (e.g. RTCP XR)?
What relationship exists between the subjective responses of users at the terminals and the objective measurements made from the point at which the non-intrusive assessment system is connected?
What are the critical components of conversational speech quality?
What measures give an estimate of the transmission quality of a connection including the accumulated effects of all technologies (e.g. IP, wireless, ATM, etc.)?
How can such measures be used to assess, plan and maintain the transmission quality of networks?
What existing models and measures could be used as inputs to the new methods?
What existing parametric or perceptual models, or components thereof, could be used as building blocks for the new methods?
What subjective test methods should method validation be based on?
How can listening quality and conversational parameters be combined to assess overall perception of conversational quality and how could related Recommendations be developed?
What subjective test data is needed to develop the new methods?
What additional considerations are relevant for wideband speech applications?
How are non-intrusive measurements related to intrusive measurements?
How can talking quality and conversational quality be measured in a non-intrusive way?
Considerations on how to help measure and mitigate climate change.

Tasks

Tasks include, but are not limited to:

Identify and implement changes and/or improvements to ITU-T Recommendations P.56, P.561, P.562 and P.564
Define the scope of new models, which will combine multiple objective measurements to provide an assessment of the perceived conversational speech quality in networks
Review existing methods and models, and identify missing components
Define structures of input and output parameters for new models
Identification of relevant subjective test methodologies (in cooperation with Q.7/12)
Identification of existing subjective test data
Collection of new subjective test data
Development of new Recommendation(s) on assessment of conversational speech quality
Development and validation of methods for optimization of quality provided by networks for different conversation scenarios
Development and validation of methods for computing specific parameter values to effectively reflect different conversation scenarios in test samples used for speech quality assessment

An up-to-date status of work under this Question is contained in the SG 12 work programme: http://www.itu.int/ITU-T/workprog/wp_search.aspx?q=15/12

Relationships

Recommendations

P.340, P.56, P.561, P.562, P.563, P.564, P.800, P.800.1, P.831, P.832, P.834, P.862, P.863, G.107, G.107.1, G.108, G.115, G.131

Questions

3/12, 6/12, 7/12, 8/12, 9/12, 11/12, 12/12, 13/12, 14/12, 17/12

Study Groups

ITU-T SG 9, SG 16

Standardization bodies

ETSI STQ, IETF (IPPM, XRBLOCK)