ITU-T Work Programme

[2025-2028] : [SG12] : [Q9/12]

Work item:

P.566 (ex P.SAMD)

Subject/title:

Single-ended machine-learning-based models for multi-dimensional speech quality analysis

Status:

Consented on 2026-06-17

Approval process:

AAP

Type of work item:

Recommendation

Version:

New

Equivalent number:

Timing:

2026 (Medium priority)

Liaison:

ETSI TC STQ

Supporting members:

Orange, HEAD acoustics, Rohde & Schwarz, TU Berlin

Summary:

This Recommendation describes an objective single-ended method for predicting subjective speech quality in telecommunication applications. The algorithm operates directly on a degraded speech signal, without requiring a reference signal or any other information about the speech file or its processing. It predicts both overall quality and multiple perceptual dimensions (noisiness, discontinuity, coloration, and sub-optimum loudness) on a 1-5 ACR scale; the overall quality predictions are consistent with ITU-T P.800 listening-only tests, and the dimension predictions follow the dimensional scales defined in the Annex of ITU-T P.863.2. The method is applicable to audio signals up to fullband (FB, 20-20 000 Hz) and has been trained and validated on a large set of databases reflecting a wide variety of coding, transport, and enhancement conditions. The model is specifically designed to assess the listening quality of conversational speech across a wide range of speakers, including cases with environmental noise and imperfect talking conditions. This Recommendation presents a high-level description of the method and advice on how to use it. Implementation and conformance testing data accompany this Recommendation.

Comment:

Reference(s):

[SG12-TD473-R1/GEN (2026-06)]