ITU-T Recommendation database

ITU-T P.565 (11/2021)

Framework for creation and performance testing of machine learning based models for the assessment of transmission network impact on speech quality for mobile packet-switched voice services

Recommendation ITU-T P.565 describes a framework whose output is a machine learning based speech quality prediction model. The model predicts the impact on speech quality of the Internet protocol (IP) transport and underlying transport, as well as of a standardized or pre-defined jitter buffer in the end client, thus providing a network-centric view of the speech quality delivered on mobile packet-switched networks. The prediction is expressed as a mean opinion score-listening quality objective (MOS-LQO) under the assumption of an otherwise clean transmission, that is, without background noise, non-standard-conformant encoding on the sending device, automatic gain control, voice enhancement devices, transcoding, bridging, frequency response, non-standard-conformant jitter buffer (for IP multimedia subsystem (IMS) mobile calls) or decoding, clock drift or any other impairment not caused by the IP transport and underlying transport. Models built according to this framework can use information on the temporal structure of the reference signal to identify the importance of individual sections of the bitstream with regard to speech quality. These models do not perform any perceptual analysis of the recorded speech signal.
The framework specifies three modules required for the development of these kinds of metrics: the database generator module, the machine learning module, and the validation module for the trained model. It also describes the database content and the features used by the machine learning algorithm, and provides a large set of test vectors for learning and validation in the form of error pattern files (jitter and packet loss). This Recommendation specifies the minimum required performance, as well as the conditions and requirements for an independent additional validation of models developed based on the framework. It also specifies implementation requirements.
The models developed based on the framework enable the assessment of transmission network impact on speech quality for mobile packet-switched voice services. They therefore benefit operators and regulators alike by enabling fast and easy speech quality trend monitoring, benchmarking and troubleshooting. In addition, if predictors according to this framework are used together with perceptual speech quality metrics such as ITU-T P.863, it is possible to identify whether the source of a problem lies inside or outside the transport network observed by the predictor. Consequently, a more detailed analysis of the situation can be achieved, and troubleshooting of less obvious degradations becomes possible, such as those occurring outside the transport network (e.g., arising from automatic gain control, voice enhancement devices, transcoding or analogue processing).
This Recommendation includes electronic attachments containing detailed descriptions of generic jitter files and a reference speech sample (see Annex D).
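The combined use of a network-centric predictor and a perceptual metric described in the summary can be illustrated with a minimal Python sketch. It is not part of the Recommendation: the function name, the quality threshold and the gap tolerance are hypothetical values chosen for illustration only, and both inputs are assumed to be MOS-LQO scores for the same call.

```python
def locate_degradation(network_mos: float,
                       perceptual_mos: float,
                       good_threshold: float = 3.5,
                       max_gap: float = 0.5) -> str:
    """Classify where a speech quality problem most likely originates.

    network_mos    -- MOS-LQO from a predictor built with a P.565-style
                      framework (sees only IP transport and jitter buffer).
    perceptual_mos -- MOS-LQO from a perceptual metric such as ITU-T P.863
                      (sees the complete recorded speech signal).
    good_threshold and max_gap are illustrative assumptions, not values
    taken from ITU-T P.565.
    """
    # The perceptual metric covers all impairments; if it is fine,
    # there is nothing to localize.
    if perceptual_mos >= good_threshold:
        return "none"
    # The perceptual score is clearly lower than what the transport
    # network alone would explain, so the cause lies elsewhere
    # (e.g. gain control, voice enhancement, transcoding).
    if network_mos - perceptual_mos > max_gap:
        return "outside"
    # The network-centric prediction already accounts for the low score.
    return "inside"


if __name__ == "__main__":
    for net, perc in [(4.3, 4.1), (4.2, 2.5), (2.4, 2.3)]:
        print(net, perc, locate_degradation(net, perc))
```

In practice the two thresholds would have to be calibrated against the accuracy of the specific models in use; the sketch only shows the comparison logic.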

Citation:	https://handle.itu.int/11.1002/1000/14827
Series title:	P series: Telephone transmission quality, telephone installations, local line networks; P.500-P.599: Objective measuring apparatus
Approval date:	2021-11-29
Provisional name:	P.VSQMTF
Approval process:	AAP
Status:	In force
Maintenance responsibility:	ITU-T Study Group 12

Ed.	ITU-T Recommendation	Status	Summary	Table of Contents	Download
2	P.565 (11/2021)	In force	here	here	here
1	P.565 (01/2020)	Superseded	here	here	here


ITU-T Supplement	Title	Status	Summary	Table of contents	Download
P Suppl. 10 (11/1988)	Considerations relating to transmission characteristics for analogue handset telephones	In force	-	here	here
P Suppl. 16 (11/1988)	Guidelines for placement of microphones and loudspeakers in telephone conference rooms and for group audio terminals (GATs)	In force	-	here	here
P Suppl. 20 (03/1993)	Examples of measurements of handset receive-frequency responses: dependence on earcap leakage losses	In force	-	here	here
P Suppl. 23 (02/1998)	ITU-T coded-speech database	In force	here	here	here
P Suppl. 24 (10/2005)	Parameters describing the interaction with spoken dialogue systems	In force	here	here	here
P Suppl. 25 (01/2011)	Parameters describing the interaction with multimodal dialogue systems	In force	here	here	here
P Suppl. 26 (09/2017)	Scenarios for the subjective evaluation of audio and audiovisual multiparty telemeeting quality	In force	here	here	here
P Suppl. 28 (09/2020)	Considerations for the development of new QoS and QoE related objective models to be embedded in Recommendations prepared by ITU-T Study Group 12	In force	here	here	here
P Suppl. 31 (01/2025)	Subjective quality evaluation of audiovisual communication in videotelephony services	In force	here	here	here

Title	Approved on	Download
Implementer's guide for Recommendation ITU-T P.565	2022-06-17	here

Title	Approved on	Download
Addition to Section 2.3 of the Handbook on Telephonometry	2000	here
Addition to Section 3 of the Handbook on Telephonometry	2000	here
Additions to the Handbook on Telephonometry	1999	here
Telephonometry	1992	here
