Recommendation ITU-T P.565 (11/2021) Framework for creation and performance testing of machine learning based models for the assessment of transmission network impact on speech quality for mobile packet-switched voice services
Summary
History
FOREWORD
Table of Contents
1 Scope
2 References
3 Definitions
4 Abbreviations and acronyms
5 Conventions
6 Applications for models developed based on the framework
7 High level overview of the framework
     7.1 Framework architecture
     7.2 Generic jitter files
     7.3 Reference speech file
8 Learning and validation database generator for IMS mobile EVS use case
     8.1 Simulate network block
     8.2 EVS coding and decoding blocks: EVS codec and codec parameters
     8.3 MOS grading block
     8.4 EVS process jitter file block: processing of the jitter file
          8.4.1 DTX cleaning
          8.4.2 Add-on codec information
     8.5 Learning and validation databases
9 Machine learning module for IMS mobile EVS use case
     9.1 ML algorithm
     9.2 ML features
          9.2.1 ML features creation
          9.2.2 ML features selection
10 Statistical evaluation module
11 Framework's inputs and outputs
12 Aspects related to the run-time of models developed based on the framework
     12.1 Operation mode
     12.2 Reference speech samples
     12.3 Pre-processing at run time
     12.4 The measurement procedure
13 Requirements for models developed based on the framework
     13.1 Mandatory conditions and procedures
     13.2 Minimum performance requirements
Annex A  Example method for ML overfitting/underfitting test
Annex B  Check list of requirements for a model developed based on the framework
Annex C  Conditions and requirements of an additional independent validation of a model developed based on the framework
     C.1 Conditions and requirements of an independent validation
     C.2 Validation procedure
Annex D  Electronic attachments
     D.1 Reference speech samples (FB and SWB)
     D.2 Generic jitter files data bases
     D.3 Data bases description
Appendix I  Procedure for feature extraction based on machine learning
     I.1 Create statistical features
     I.2 Create jitter buffer-based features
     I.3 Codec based features (rate and channel aware)
     I.4 Create reference speech-based features
          I.4.1 Types of reference speech-based features
          I.4.2 Features' weighting function calculation
Appendix II  Descriptions of generic jitter files creation
     II.1 Generic jitter files for model development and final validation (source Infovista)
          II.1.1 Learning and validation generic jitter files
               II.1.1.1 Live (drive test) data modulated with simulations
               II.1.1.2 Gilbert burst packet loss and burst jitter
               II.1.1.3 Gilbert severe burst jitter
               II.1.1.4 Random packet loss and random jitter
               II.1.1.5 Manually designed test cases
          II.1.2 Unknown validation live data sets description
     II.2 Unknown independent validation live data set description (source Rohde&Schwarz)
Appendix III  Justification of the minimum requirements based on performance results' analysis
     III.1 IMS mobile EVS use case
     III.2 Results on learning and validation data sets
     III.3 Results on unknown validation live data
     III.4 ML overfitting/underfitting test
Bibliography