ITU-T Recommendation database

ITU-T P.565 (01/2020)

Framework for creation and performance testing of machine learning based models for the assessment of transmission network impact on speech quality for mobile packet-switched voice services

Recommendation ITU-T P.565 describes a framework whose output is a machine-learning-based speech
quality prediction model. The model predicts the impact on speech quality of the Internet protocol (IP)
transport and underlying transport, as well as of the jitter buffer in the end client, thus providing a
network-centric view of the speech quality delivered on mobile packet-switched networks. The
prediction is expressed in terms of a mean opinion score-listening quality objective (MOS-LQO)
under the assumption of an otherwise clean transmission, without background noise, automatic gain
control, voice enhancement devices, transcoding, bridging, frequency response, clock drift or any
other impairment not caused by the IP transport and underlying transport. The models according to
this framework use information on the temporal structure of the reference signal to identify the
importance of individual sections of the bitstream with regard to speech quality. These models do not
perform any perceptual analysis of the recorded speech signal.

The framework specifies three modules required for the development of these kinds of metrics: the
databases generator module, the machine learning module, and the validation module for the trained
model. In addition, the database content and the features used by the machine learning algorithm are
described. The framework also provides a large set of test vectors, in the form of error (jitter and packet
loss) patterns files for learning and validation. This Recommendation specifies the minimum required
performance, as well as conditions and requirements for an independent additional validation for
models developed based on the framework. The Recommendation also specifies implementation
requirements.
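As an illustration only, the three-module structure described above can be sketched in Python. Everything in this sketch is an assumption made for the example and is not taken from the Recommendation: the two features (packet loss rate and mean jitter), the synthetic placeholder scores, the k-nearest-neighbour regressor, and the use of RMSE as the validation metric. The function names are hypothetical.

```python
import random
import statistics


def generate_database(n_patterns, seed=0):
    """Database generator stand-in: synthetic (features, label) pairs.

    Features are a packet loss rate (fraction) and a mean jitter (ms).
    The label is a PLACEHOLDER score on a 1-to-5 scale, invented so that the
    example runs; it is not a MOS-LQO from any real measurement.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_patterns):
        loss = rng.uniform(0.0, 0.3)
        jitter = rng.uniform(0.0, 100.0)
        score = 4.5 - 8.0 * loss - 0.01 * jitter + rng.gauss(0.0, 0.1)
        rows.append(((loss, jitter), min(5.0, max(1.0, score))))
    return rows


def train_knn(train_rows, k=5):
    """Machine learning module stand-in: a k-nearest-neighbour regressor."""
    def predict(features):
        loss, jitter = features

        def squared_distance(row):
            (row_loss, row_jitter), _ = row
            # Scale both features to roughly [0, 1] before comparing.
            return ((row_loss - loss) / 0.3) ** 2 + ((row_jitter - jitter) / 100.0) ** 2

        nearest = sorted(train_rows, key=squared_distance)[:k]
        return statistics.fmean([label for _, label in nearest])

    return predict


def validate(predict, test_rows):
    """Validation module stand-in: RMSE on held-out patterns."""
    squared_errors = [(predict(features) - label) ** 2 for features, label in test_rows]
    return statistics.fmean(squared_errors) ** 0.5


rows = generate_database(400)
train_rows, test_rows = rows[:300], rows[300:]
model = train_knn(train_rows)
rmse = validate(model, test_rows)
print(f"held-out RMSE: {rmse:.3f}")
```

A model built according to the actual framework would instead be trained and validated on the error pattern files and features that the Recommendation defines, and checked against its stated minimum performance requirements.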

The models developed based on the framework enable the assessment of transmission network impact
on speech quality for mobile packet-switched voice services, and therefore give operators and
regulators alike a fast and easy means of speech quality trend monitoring, benchmarking and
troubleshooting. In addition, if predictors according to this framework are used together with
perceptual speech quality metrics like [ITU-T P.863], it is possible to identify whether the source of
problems resides inside or outside the transport network observed by the predictor.

Consequently, a more detailed analysis of the situation can be achieved, and troubleshooting of less
obvious degradations, such as those occurring outside of the transport network (e.g., arising from
automatic gain control, voice enhancement devices, transcoding or analogue processing), is enabled.
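The inside/outside distinction can be sketched as a simple comparison of two scores for the same call. This is a minimal illustration, not a procedure from the Recommendation: the function name, the returned labels and the single threshold of 3.5 are all hypothetical.

```python
def locate_degradation(network_mos, perceptual_mos, threshold=3.5):
    """Guess where a speech quality problem originates.

    network_mos:    MOS-LQO from a transport-only predictor (framework model).
    perceptual_mos: MOS-LQO from a perceptual metric such as ITU-T P.863.
    threshold:      HYPOTHETICAL score below which quality counts as degraded.
    """
    if perceptual_mos >= threshold:
        # The recorded speech itself is rated acceptable.
        return "no significant degradation"
    if network_mos >= threshold:
        # Speech is degraded, but the transport network looks clean.
        return "outside transport network"
    # Both scores are low, so the transport network is implicated.
    return "inside transport network"


print(locate_degradation(network_mos=4.2, perceptual_mos=2.6))
print(locate_degradation(network_mos=2.4, perceptual_mos=2.3))
```

When both scores are low, the sketch attributes the problem to the transport network, although additional impairments outside it cannot be ruled out from these two numbers alone.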

This Recommendation includes electronic attachments containing detailed descriptions of generic jitter files and a reference speech sample (see Annex F).

Citation:	https://handle.itu.int/11.1002/1000/14152
Series title:	P series: Telephone transmission quality, telephone installations, local line networks; P.500-P.599: Objective measuring apparatus
Approval date:	2020-01-13
Provisional name:	P.VSQMTF
Approval process:	AAP
Status:	Superseded
Maintenance responsibility:	ITU-T Study Group 12

Ed.	ITU-T Recommendation	Status
2	P.565 (11/2021)	In force
1	P.565 (01/2020)	Superseded


ITU-T Supplement	Title	Status
P Suppl. 10 (11/1988)	Considerations relating to transmission characteristics for analogue handset telephones	In force
P Suppl. 16 (11/1988)	Guidelines for placement of microphones and loudspeakers in telephone conference rooms and for group audio terminals (GATs)	In force
P Suppl. 20 (03/1993)	Examples of measurements of handset receive-frequency responses: dependence on earcap leakage losses	In force
P Suppl. 23 (02/1998)	ITU-T coded-speech database	In force
P Suppl. 24 (10/2005)	Parameters describing the interaction with spoken dialogue systems	In force
P Suppl. 25 (01/2011)	Parameters describing the interaction with multimodal dialogue systems	In force
P Suppl. 26 (09/2017)	Scenarios for the subjective evaluation of audio and audiovisual multiparty telemeeting quality	In force
P Suppl. 28 (09/2020)	Considerations for the development of new QoS and QoE related objective models to be embedded in Recommendations prepared by ITU-T Study Group 12	In force
P Suppl. 31 (01/2025)	Subjective quality evaluation of audiovisual communication in videotelephony services	In force

Title	Approved on
Implementer's guide for Recommendation ITU-T P.565	2022-06-17

Title	Approved on
Addition to Section 2.3 of the Handbook on Telephonometry	2000
Addition to Section 3 of the Handbook on Telephonometry	2000
Additions to the Handbook on Telephonometry	1999
Telephonometry	1992
