AAP Recommendation

P.566: Single-ended machine-learning-based models for multi-dimensional speech quality analysis

Study Group
12

Study Period
2025-2028

Consent Date
2026-06-17

Approval Date

Provisional Name
P.SAMD

Input used for Consent
SG12-TD473-R1/GEN (2026-06)

Status
LC

IPR

This Recommendation describes an objective single-ended method for predicting subjective speech quality in telecommunication applications. The algorithm operates directly on a degraded speech signal, without requiring a reference signal or any other information about the speech file or its processing. It predicts both overall quality and four perceptual dimensions (noisiness, discontinuity, coloration, and sub-optimum loudness) on a 1-5 ACR scale, consistent with ITU-T P.800 listening-only tests for overall quality and with the dimensional scales defined in the Annex of ITU-T P.863.2. The method is applicable to audio signals up to fullband (FB, 20-20 000 Hz) and has been trained and validated on a large set of databases reflecting a wide variety of coding, transport, and enhancement conditions. The model is specifically designed to assess the listening quality of conversational speech across a wide range of speakers, including speech recorded with environmental noise and under non-perfect talking conditions. This Recommendation presents a high-level description of the method and advice on how to use it. Implementation and conformance testing data accompany this Recommendation.
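As an illustration of the output described above (one overall score plus four dimension scores, each on the 1-5 scale), the following minimal Python sketch shows how a caller might hold and sanity-check one file's predictions. It is not the implementation that accompanies this Recommendation: the class name `QualityPrediction`, its field names, the helper `weakest_dimension`, and the example scores are all hypothetical, and the sketch assumes that a higher score means better quality on every scale.

```python
from dataclasses import dataclass

# Scale bounds taken from the summary (1-5 ACR scale).
SCALE_MIN, SCALE_MAX = 1.0, 5.0

# The four perceptual dimensions named in the summary.
DIMENSIONS = ("noisiness", "discontinuity", "coloration", "loudness")


@dataclass(frozen=True)
class QualityPrediction:
    """Hypothetical container for one file's predictions (not the P.566 API)."""

    overall: float
    noisiness: float
    discontinuity: float
    coloration: float
    loudness: float

    def __post_init__(self):
        # Reject any score that falls outside the 1-5 scale.
        for name in ("overall",) + DIMENSIONS:
            value = getattr(self, name)
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{name}={value} is outside the 1-5 scale")

    def weakest_dimension(self) -> str:
        """Return the dimension with the lowest score (assumes higher is better)."""
        return min(DIMENSIONS, key=lambda d: getattr(self, d))


# Made-up scores for illustration only; they are not results from the model.
pred = QualityPrediction(
    overall=3.2, noisiness=2.1, discontinuity=4.4, coloration=3.9, loudness=4.6
)
print(pred.weakest_dimension())
```

With these made-up scores the lowest dimension is noisiness, which is the kind of diagnostic reading the per-dimension outputs are meant to support alongside the overall score.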

TSB Note: For the purpose of the AAP Last Call, the electronic attachment (1.6 GB) is available at https://mycloud.itu.int/f/3106458

AAP Current Status
Step # Action
Start / End
Status Announcement Related documents Comments / Resolution logs