Question 15

Parametric and E-model-based planning, prediction and monitoring of conversational speech quality

(Continuation of Question 8/12 - E-Model extension in wideband transmission and future telecommunication and application scenarios - and Question 15/12 - Objective assessment of speech and sound transmission performance quality in networks)

Motivation
The telecommunications industry is working to adopt more flexible infrastructure to control costs and facilitate the introduction of new services. Examples are 5G and, more generally, next-generation IP networks, which provide flexible transmission bandwidths and user interface connections, but at the expense of quality, which varies with the transmission scenario and over time. Proper transmission planning, as well as flexible prediction and monitoring of Quality of Experience (QoE), is useful in managing the efficient operation and effective services of such networks.

Regarding transmission planning for such scenarios, Study Group 12 has established the E-model, a computational model for use in transmission planning (see Recommendation G.107). This model is now frequently applied to plan traditional, narrowband, handset-terminated networks and, to an increasing extent, wideband and packet-based networks, using the extension of the E-model described in Recommendation G.107.1. Although popular, the E-model still has a considerable number of limitations, notably when it is applied to super-wideband and fullband networks, to non-handset terminal equipment, and to speech processing devices (such as echo cancellers, noise reduction, or the like) integrated in the network or in the terminal.
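
For orientation, the narrowband E-model combines impairment terms into a transmission rating factor R and maps R to an estimated mean opinion score. The following is a minimal sketch of that top-level structure only, assuming the additive form R = Ro - Is - Id - Ie,eff + A and the R-to-MOS mapping given in Recommendation G.107. The function and parameter names are hypothetical, the individual impairment terms (which G.107 derives from many network parameters) are taken as given inputs, and the Recommendation itself remains the authoritative source.

```python
def rating_factor(ro, i_s, i_d, ie_eff, a=0.0):
    """Transmission rating factor R = Ro - Is - Id - Ie,eff + A.

    ro     : basic signal-to-noise ratio term
    i_s    : simultaneous impairment term
    i_d    : delay-related impairment term
    ie_eff : effective equipment impairment term
    a      : advantage factor
    All inputs are assumed to be precomputed; this sketch does not
    derive them from network parameters.
    """
    return ro - i_s - i_d - ie_eff + a


def mos_from_r(r):
    """Map a rating factor R to an estimated MOS (narrowband scale)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping used between the two saturation points.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


if __name__ == "__main__":
    # A rating of about 93.2 corresponds to the G.107 default parameter set.
    print(round(mos_from_r(93.2), 2))
```

In this sketch a planner would lower R by the impairments expected for a given connection and read off the estimated MOS; the wideband extension in G.107.1 uses a different rating range and is not covered here.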

Regarding quality prediction and monitoring for such scenarios, the industry is already benefiting from ITU-T Recommendations for objective speech quality assessment. However, most of the techniques described in these Recommendations are signal-based and address listening-only contexts, whereas typical communications involve interactive, two-way conversations. IP and mobile networks can be particularly deleterious to interactive applications, including voice conversation, for example because of increased delay, which in turn increases the probability of double-talk and the perceptibility of echo. Thus, there is a need for real-time, or near-real-time, conversational speech quality assessment and monitoring.

Ultimately, what is needed is the integration of listening-only, talking-only and interaction quality on a common scale that could be used for planning, predicting and monitoring conversational quality in real-life networks. Such a scale would allow for easier interpretation of the QoE provided by different network and service scenarios, and would thus make it possible to exploit the flexibility offered by the respective networks in order to provide optimum services to the customer.

It is envisaged that new methods under this question would be developed collaboratively.
The following major Recommendations, in force at the time of approval of this Question, fall under its responsibility:
G.107, G.107.1, P.56, P.561, P.562, P.564, P.833, P.833.1, P.834, P.834.1

Question
Study items to be considered include, but are not limited to:

Tasks
Tasks include, but are not limited to:

An up-to-date status of work under this Question is contained in the SG12 work programme: http://www.itu.int/ITU-T/workprog/wp_search.aspx?q=15/12
