P.566: Single-ended machine-learning-based models for multi-dimensional speech quality analysis
Study Group: 12
Study Period: 2025-2028
Consent Date: 2026-06-17
Approval Date:
Provisional Name: P.SAMD
Input used for Consent: SG12-TD473-R1/GEN (2026-06)
Status: LC
IPR:
This Recommendation describes an objective single-ended method for predicting subjective speech quality in telecommunication applications. The algorithm operates directly on a degraded speech signal and requires neither a reference signal nor any other information about the speech file or its processing. It predicts both overall quality and four perceptual dimensions (noisiness, discontinuity, coloration, and sub-optimum loudness) on a 1-5 absolute category rating (ACR) scale. The overall quality prediction is consistent with ITU-T P.800 listening-only tests, and the dimension predictions follow the dimensional scales defined in the Annex of ITU-T P.863.2.
The method is applicable to audio signals up to fullband (FB, 20-20 000 Hz) and has been trained and validated on a large set of databases reflecting a wide variety of coding, transport, and enhancement conditions. The model is specifically designed to assess the listening quality of conversational speech across a wide range of speakers, including speech recorded in environmental noise and under imperfect talking conditions.
This Recommendation presents a high-level description of the method and advice on how to use it. Implementation and conformance testing data accompany this Recommendation.
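To make the shape of the model output concrete, the following is a minimal Python sketch of a container for one prediction: an overall quality score plus the four perceptual dimensions named above, each checked against the 1-5 scale. The class name `QualityPrediction`, its field names, and the helper `lowest_dimension` are hypothetical and chosen for illustration only; they are not taken from the Recommendation or its accompanying implementation.

```python
from dataclasses import dataclass, fields

# Bounds of the 1-5 rating scale described in the summary.
SCALE_MIN, SCALE_MAX = 1.0, 5.0


@dataclass(frozen=True)
class QualityPrediction:
    """Hypothetical container for one single-ended prediction.

    Field names are illustrative assumptions, not part of the Recommendation.
    """
    overall: float
    noisiness: float
    discontinuity: float
    coloration: float
    loudness: float

    def __post_init__(self):
        # Reject any score that falls outside the 1-5 scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(
                    f"{f.name}={value} is outside [{SCALE_MIN}, {SCALE_MAX}]"
                )

    def lowest_dimension(self) -> str:
        """Return the name of the perceptual dimension with the lowest score."""
        dims = {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if f.name != "overall"
        }
        return min(dims, key=dims.get)


# Example scores are made up for illustration, not model output.
example = QualityPrediction(
    overall=3.2, noisiness=2.1, discontinuity=4.0, coloration=3.5, loudness=4.4
)
print(example.lowest_dimension())
```

Keeping the dimension scores next to the overall score in one immutable record is a design choice of this sketch; it lets downstream code report which dimension scored lowest without re-running the model.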
TSB Note: For the purpose of the AAP Last Call, the electronic attachment (1.6 GB) is available at https://mycloud.itu.int/f/3106458
AAP Current Status
| Step # | Action | Start / End | Status | Announcement | Related documents | Comments / Resolution logs |
|---|---|---|---|---|---|---|