|
Work item:
|
P.566 (ex P.SAMD)
|
|
Subject/title:
|
Single-ended machine-learning-based models for multi-dimensional speech quality analysis
|
|
Status:
|
Consented on 2026-06-17
|
|
Approval process:
|
AAP
|
|
Type of work item:
|
Recommendation
|
|
Version:
|
New
|
|
Equivalent number:
|
-
|
|
Timing:
|
2026 (Medium priority)
|
|
Liaison:
|
ETSI TC STQ
|
|
Supporting members:
|
Orange, HEAD acoustics, Rohde & Schwarz, TU Berlin
|
|
Summary:
|
This Recommendation describes an objective single-ended method for predicting subjective speech quality in telecommunication applications. The algorithm operates directly on a degraded speech signal without requiring a reference or any other information about the speech file or its processing. It predicts both overall quality and four perceptual dimensions (noisiness, discontinuity, coloration, and sub-optimum loudness) on a 1-5 ACR scale. Overall quality predictions are consistent with ITU-T P.800 listening-only tests, and the dimension predictions follow the dimensional scales defined in the Annex of ITU-T P.863.2.
The method is applicable to audio signals up to fullband (FB, 20-20 000 Hz) and has been trained and validated on a large set of databases reflecting a wide variety of coding, transport, and enhancement conditions. The model is specifically designed to assess the listening quality of conversational speech across a wide range of speakers, including speech affected by environmental noise and non-ideal talking conditions.
This Recommendation presents a high-level description of the method and advice on how to use it. Implementation and conformance testing data accompany this Recommendation.
|
|
Comment:
|
-
|
|
Reference(s):
|
|
|
Historic references:
|
|
Contact(s):
|
|
ITU-T A.5 justification(s):
|
|
|
|
First registration in the WP:
2017-02-01 15:37:51
|
|
Last update:
2026-06-24 14:23:20
|