ITU-T work programme

[2025-2028] : [SG 12] : [WP2/12]

Work group:	Q9/12
Title:	Perceptual-based objective methods and corresponding evaluation guidelines for voice and audio quality measurements in telecommunication services
Description:	1 Motivation
The work of this Question focuses on objective, perceptual and mainly signal-based methods for evaluating quality parameters in telecommunication scenarios. The methods under study should concentrate primarily on user-perceived quality characteristics; consequently, these methods and algorithms include perceptual approaches. They model the results and procedures of subjective tests, so that each subjective procedure gets an objective counterpart that uses the same scaling and basic procedures. An example is the successful standardization of Recommendation P.863 up to fullband audio, a perceptual-based method that objectively models listening-only tests with Absolute Category Rating for evaluating listening speech quality according to Recommendation P.800. Recommendation P.863 is also extended by Recommendation P.863.2, which provides quality predictions for individual quality dimensions.

This Question will extend the objective evaluation of listening quality, by far the most frequently measured speech quality parameter to date, to other quality aspects of voice telephony such as talking quality and quality dimensions in no-reference and full-reference setups, including perceptual, signal-based models for objective rating of multi-channel and spatial audio in telecommunication services. In view of new-generation telecommunication services, media other than speech, such as music, should also be taken into account. Furthermore, the evaluation of transmitted noise, especially after processing by noise suppression systems, should be covered by the work of this Question, as should the objective prediction of speech intelligibility. This Question also considers new speech and audio processing techniques based on Machine Learning and Artificial Intelligence, and their objective, perceptual evaluation.
This Question also analyses and recommends methods, metrics and procedures for the statistical evaluation, qualification and comparison of objective quality prediction models, and gives guidance for developing quality prediction models in general (Recommendation P.1401) and by means of machine learning and artificial intelligence in particular (Recommendation P.1402). This Question will also continue and finalize the ongoing work on P.SAMD.

The following major deliverables, in force at the time of approval of this Question, fall under its responsibility: G.1029, P.563, P.863, P.863.1, P.863.2, P.1401, P.1402.

2 Question
Study items to be considered include, but are not limited to:
- An already defined work item is the objective assessment of talking quality. First, a reliable subjective test method has to be established; in a second step, an objective model can be developed.
- Existing objective models such as P.863 or P.563 produce a single number describing the overall quality. The market requests additional information about possible quality degradations and quality dimensions. This is studied under P.863.2 (full-reference) and P.SAMD (no-reference).
- Furthermore, the objective assessment of audio signals such as music transmitted over telecommunication links like WCDMA, LTE and 5G with modern codecs and terminals should be investigated.
- Perceptual, signal-based models for objective rating of multi-channel and spatial audio in telecommunication services fall within the scope of this Question.
- The instrumental determination of the quality of synthesized speech, e.g., using objective perceptual methods, is a topic of interest in this Question, as are methods for objective prediction of speech intelligibility.
- Perceptual quality prediction of speech processed by audio processing techniques based on Machine Learning and Artificial Intelligence. Examples are ML-based coding, speech enhancement, talker anonymization or deepfake speech.
- This Question analyses and recommends methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models. These statistics can be applied to objective prediction models whose output can be translated to an estimated subjective judgment of a dedicated subjective test procedure. This Question discusses frameworks, metrics and example procedures for those statistical analyses and their reporting. Furthermore, this Question gives guidance on developing quality prediction models in general and specifically by means of machine learning and artificial intelligence, as in Recommendation P.1402.

3 Tasks
Tasks include, but are not limited to:
- maintenance and enhancement of P-series Recommendations with regard to objective quality testing methods and perceptual models such as P.863, P.863.1 and P.863.2;
- completion of Recommendations on objective estimation of individual quality dimensions as a no-reference approach (P.SAMD);
- development of a Recommendation for objective, perceptual quality prediction of non-speech signals (e.g., music) in telecommunication services;
- development of a Recommendation for perceptual, signal-based models for objective qualitative rating of the perception of multi-channel and spatial audio in telecommunication services.

An up-to-date status of work under this Question is contained in the SG12 work programme at https://itu.int/ITU-T/workprog/wp_search.aspx?sp=18&q=9/12.

4 Relationships
Recommendations:
- P-series, G.100- and G.1000-series
Questions:
- 4/12, 6/12, 7/12, 14/12, 15/12, 19/12
Study groups:
- ITU-T SG21
Other bodies:
- ETSI TC STQ, 3GPP
WSIS Action Lines:
- C2
Sustainable Development Goals:
- 9
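
As an illustration of the statistical comparison of objective prediction models with subjective test results mentioned in the study items, the following is a minimal sketch only. It assumes two commonly used statistics, the Pearson correlation coefficient and a plain root-mean-square error; it does not reproduce the procedures of Recommendation P.1401, which may prescribe different normalisation, mapping or confidence-interval handling. The function names and the score values are hypothetical.

```python
import math


def pearson(predicted, subjective):
    """Pearson correlation between predicted and subjective scores."""
    n = len(predicted)
    if n != len(subjective) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    if var_p == 0 or var_s == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_s)


def rmse(predicted, subjective):
    """Plain root-mean-square prediction error (divides by N; an assumption)."""
    n = len(predicted)
    if n != len(subjective) or n == 0:
        raise ValueError("need two equally long, non-empty lists")
    return math.sqrt(sum((p - s) ** 2 for p, s in zip(predicted, subjective)) / n)


if __name__ == "__main__":
    # Hypothetical per-condition scores on a 1..5 scale, for illustration only.
    subjective_mos = [1.5, 2.4, 3.1, 3.9, 4.5]
    predicted_mos = [1.8, 2.2, 3.3, 3.7, 4.4]
    print("Pearson r:", round(pearson(predicted_mos, subjective_mos), 3))
    print("RMSE:", round(rmse(predicted_mos, subjective_mos), 3))
```

A higher correlation and a lower error indicate closer agreement between a model's predictions and the subjective scores; two candidate models could be compared by evaluating both against the same subjective data.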
Comment:	Continuation of Q9/12

Rapporteur:	Mr. Jens BERGER