Executive Summary, WP2/12, February 2018

Executive summary: Meeting of ITU-T Study Group 12 Working Party 2 (Objective models and tools for multimedia quality), Geneva, 15 February 2018

Note: This is not an official record and is subject to correction or modification


Working Party 2 of ITU-T Study Group 12 held its third meeting of the study period in Geneva on 15 February 2018. The meeting was attended by 7 participants and considered only the activities of Question 9/12 (Perceptual-based objective methods for voice, audio and visual quality measurements in telecommunication services). The preceding Q9/12 rapporteur group meeting on 14 February was attended by 12 participants, including 4 remote participants.

Two Recommendations were consented (1 revised, 1 Corrigendum).

Major accomplishments

Completed work

ITU-T Rec. No. | Question | Reference | New/Rev. | Title | AAP/Last Call
P.863 | 9/12 | TD383R1 | Rev. | Perceptual objective listening quality prediction | AAP-29
P.862 Cor.2 | 9/12 | TD384R1 | New | Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs - Corrigendum 2 | AAP-29

Edition 3 of Recommendation ITU-T P.863 no longer applies an initial 14 kHz low-pass filter, so its analysis can now take spectral components above 14 kHz into account. This extends the scope of application to fullband speech codecs such as Opus or EVS. In addition, the ‘shift-jitter’ that could be observed in repeated measurements with slightly differing delay is reduced. Gain variations introduced by automatic gain control, as well as slowly time-varying linear frequency distortions, are now adequately considered.
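
As an illustration only, the effect of a 14 kHz low-pass stage on fullband content can be sketched as follows. This is not the filter specified in P.863: the brick-wall FFT filter, the 48 kHz sampling rate, the test tones and all function names are assumptions chosen for the example. It shows that a component at 16 kHz, which a fullband codec can carry, is invisible to any analysis that runs after such a filter.

```python
import numpy as np

FS = 48_000         # assumed sampling rate in Hz (typical for fullband audio)
CUTOFF_HZ = 14_000  # cut-off frequency mentioned in the summary


def brickwall_lowpass(signal, fs, cutoff_hz):
    """Zero all FFT bins above cutoff_hz (idealised filter, illustration only)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def band_energy(signal, fs, lo_hz, hi_hz):
    """Sum of squared FFT magnitudes for lo_hz < f <= hi_hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs > lo_hz) & (freqs <= hi_hz)
    return float(np.sum(np.abs(spectrum[mask]) ** 2))


# One second of a hypothetical test signal: a 1 kHz tone plus a weaker 16 kHz tone.
t = np.arange(FS) / FS
test_signal = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 16_000 * t)
filtered = brickwall_lowpass(test_signal, FS, CUTOFF_HZ)

# Energy below and above the cut-off, before and after filtering.
low_before = band_energy(test_signal, FS, 0, CUTOFF_HZ)
low_after = band_energy(filtered, FS, 0, CUTOFF_HZ)
high_before = band_energy(test_signal, FS, CUTOFF_HZ, 20_000)
high_after = band_energy(filtered, FS, CUTOFF_HZ, 20_000)

print(f"energy up to 14 kHz: before={low_before:.3e} after={low_after:.3e}")
print(f"energy 14-20 kHz:    before={high_before:.3e} after={high_after:.3e}")
```

The energy below the cut-off is unchanged, while the energy between 14 kHz and 20 kHz drops to numerical noise. A model that pre-filters in this way therefore cannot register degradations in that band.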

Furthermore, the issues of P.863 Edition 2 (2014/09) documented in the following implementers’ guides were addressed:

Corrigendum 2 to Recommendation ITU-T P.862 addresses a systematic under-prediction of subjective scores in Recommendation ITU-T P.862.2. The under-prediction, 0.8 MOS on average, arises because the audio signals are presented to the loudness model at an incorrect level. As a result, degradations are exaggerated and the predicted scores are lower than expected.