The downloadable test signals that are useful for telephonometry
applications are described in Rec. ITU-T P.50 Appendix I.
This set of test signals contains:
Artificial voices described in Rec. ITU-T P.50, which are mainly used for objective evaluation of speech processing systems or devices
in which a single-channel signal with continuous activity (i.e. without pauses) is sufficient for measuring characteristics.
Using artificial voice as the test signal, instead of real speech, has the dual advantages of being easier to generate and having less variability than samples of real voices.
The artificial voice reproduces the average characteristics of a broad range of human voices.
When a particular system is tested, the characteristics of the transmission path preceding it are to be considered.
Therefore, the actual test signal must be generated as the convolution of the original signal of the artificial voice with the characteristic transfer function of the transmission path.
Real speech recordings, the full set of which was used to develop Rec. ITU-T P.50. This set includes 16 recorded sentences in each of 20 languages and sentences recorded in the laboratories of some ITU members.
Other artificial voices, which were submitted by Administrations who consider them useful.
Note that for each set (or subset), half of the recordings are of male talkers and half are of female talkers.
Although the original files were in the ".16P" format, all signals are presented here in the ".WAV" format.
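The requirement that the test signal account for the preceding transmission path can be sketched as follows. This is a minimal illustration, assuming the path is available as an impulse response (the time-domain counterpart of its transfer function). The function name apply_transmission_path, the white-noise stand-in for the artificial voice, the two-tap impulse response and the 8 kHz rate are all hypothetical and are not taken from Rec. ITU-T P.50.

```python
import numpy as np

def apply_transmission_path(voice, impulse_response):
    """Convolve a test signal with the impulse response of the preceding path."""
    voice = np.asarray(voice, dtype=np.float64)
    h = np.asarray(impulse_response, dtype=np.float64)
    # Full linear convolution: output length is len(voice) + len(h) - 1.
    return np.convolve(voice, h, mode="full")

# Hypothetical stand-ins: white noise in place of the artificial voice, and a
# two-tap impulse response (direct path plus an attenuated, delayed copy).
rng = np.random.default_rng(0)
artificial_voice = rng.standard_normal(8000)  # 1 s at an assumed 8 kHz
path_ir = np.zeros(16)
path_ir[0] = 1.0
path_ir[15] = 0.5

test_signal = apply_transmission_path(artificial_voice, path_ir)
print(len(test_signal))  # 8000 + 16 - 1 = 8015
```

In practice the impulse response would come from a measurement or model of the actual path, and the artificial voice would be read from one of the downloaded files.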
Rec. ITU-T P.501 describes test signals that are applicable to various aspects of telephonometry.
These include technical signals, such as pure and distorted sine waves, and speech-like signals. This Recommendation gives:
principles of the signal construction for each type of test signal
an overview of the typical application of the test signals described
These test signals should be used in combination with objective speech quality evaluation methods.
This speech material does not replace the speech material found in
Supplement 23 to ITU-T P-series Recommendations.
Rec. ITU-T P.561 specifies requirements for in-service and
non-intrusive measurement devices (INMDs) that are utilised primarily for the measurement of voice-grade parameters,
such as speech-level, noise-level, echo-loss and speech-echo-path delay.
INMDs may also be used to measure parameters associated with digital transmission systems of both circuit-switched and packet-switched networks that impact the performance of the voice-grade channels being transported.
This Recommendation specifies interfaces, measurement ranges, and accuracy requirements for measuring voice-grade transmission parameters, as well as descriptions of optional functions associated with these parameters.
Appendix III contains the digital speech recordings.
Rec. ITU-T P.564 specifies the minimum criteria for objective speech quality
assessment models for prediction of the impact of impairments observed in an IP-based network on the one-way
listening quality that may be experienced by an end-user of IP/UDP/RTP-based telephony applications (3.1 kHz
narrow-band in the main body, 7 kHz wideband in Annex B). Models compliant with this
Recommendation predict mean opinion scores (MOS) on the ACR listening quality scale. It is
expected that the primary applications for such models are monitoring of transmission quality for
operations and maintenance purposes, and measurements in support of service level agreements
(SLAs) between service providers and their customers. ITU-T P.564-conformant models may be deployed
both in endpoint locations and at mid-network monitoring points.
This Recommendation includes speech material as a set of four 8-second speech files, which represent speech from four individual talkers, two male and two female, providing a total of 32 seconds of speech material.
The speech material has been selected from the Rec. ITU-T P.501 speech database, by extracting those items
that have a minimum distance from the mean ITU-T P.862.1 score obtained by processing a large speech database
through three common codecs (G.711, G.729a, iLBC). The original 16 kHz sample-rate, 16 bit linear PCM source
material from ITU-T P.501 was IRS-send-filtered and then decimated by a factor of two using the Rec. ITU-T
G.191 filter utility. The resulting 8 kHz, 16 bit linear PCM files were then normalized to a level of -26
dBov according to the procedures of Rec. ITU-T P.56.
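The decimation and level normalization steps can be sketched roughly as below. This is only an illustration under loud assumptions: a generic windowed-sinc low-pass filter stands in for the Rec. ITU-T G.191 filter utility, a plain RMS level over the whole signal stands in for the active speech level of Rec. ITU-T P.56, the IRS send filter is omitted, and a 440 Hz tone replaces real speech. All function names are hypothetical.

```python
import numpy as np

def lowpass_fir(num_taps=101, cutoff=0.45):
    """Windowed-sinc low-pass FIR; cutoff is a fraction of the input Nyquist rate."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = cutoff * np.sinc(cutoff * n) * np.hamming(num_taps)
    return h / h.sum()  # unity gain at DC

def decimate_by_two(x):
    """Low-pass filter, then keep every second sample (16 kHz -> 8 kHz)."""
    filtered = np.convolve(np.asarray(x, dtype=np.float64), lowpass_fir(), mode="same")
    return filtered[::2]

def level_dbov(x):
    """RMS level of 16-bit-scaled samples relative to the overload point (32768).

    Assumption: plain RMS over all samples, not the P.56 active speech level.
    """
    x = np.asarray(x, dtype=np.float64)
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(rms / 32768.0)

def normalize_to_dbov(x, target_db=-26.0):
    """Apply a single gain so that the RMS level equals target_db."""
    gain_db = target_db - level_dbov(x)
    return np.asarray(x, dtype=np.float64) * 10 ** (gain_db / 20)

# Hypothetical input: 1 s of a 440 Hz tone at 16 kHz instead of real speech.
t = np.arange(16000) / 16000.0
wideband = 10000.0 * np.sin(2 * np.pi * 440.0 * t)

narrowband = decimate_by_two(wideband)
normalized = normalize_to_dbov(narrowband)
print(len(narrowband))                        # 8000
print(round(float(level_dbov(normalized)), 2))  # -26.0
```

For real speech with pauses, an RMS level over the whole file will differ from the active speech level, so a P.56-conformant meter would be needed to reproduce the published files.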
The speech material should be encoded and inserted into an IP/UDP/RTP stream using the relevant codec and
packet size settings.
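How the packet size setting divides a speech file into payloads can be illustrated with a short sketch. The 20 ms packet size and the function name split_into_payloads are assumptions chosen for illustration. Codec encoding and RTP header construction are deliberately left out, since they depend on the codec under test.

```python
def split_into_payloads(samples, sample_rate_hz=8000, packet_ms=20):
    """Split a sample sequence into fixed-duration frames, one per packet."""
    frame_len = sample_rate_hz * packet_ms // 1000
    # Drop a trailing partial frame so every payload has the same duration.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

speech = [0] * (8 * 8000)  # placeholder for one 8-second, 8 kHz file
payloads = split_into_payloads(speech)
print(len(payloads), len(payloads[0]))  # 400 160
```

Each frame would then be passed through the relevant encoder before being placed in an RTP payload.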
Rec. ITU-T P.834.1 describes an extension of the methodology for deriving equipment impairment factors from instrumental models of Recommendation ITU-T P.834.
It is primarily intended for determining wideband equipment impairment factors Ie,wb, which capture the degradation introduced by wideband speech codecs.
The resulting wideband equipment impairment factors derived by this methodology are intended to be used on the extended transmission rating scale underlying the E-model (see Appendix II of Recommendation ITU-T G.107).
They will reflect the auditory impairments of the corresponding equipment in a listening-only mode.
The present methodology makes use of instrumental models (so-called “objective methods”), such as the model defined in Recommendation ITU-T P.862.2.
It is to be considered as supplementary to the methodology based on auditory listening-only tests, described in Recommendation ITU-T P.833.1.
It will provide valid Ie,wb values only for those codecs for which the instrumental model used produces meaningful estimations.
Rec. ITU-T P.862 describes an objective method for predicting the subjective quality of 3.1 kHz
(narrow-band) handsets and narrow-band speech codecs. This Recommendation presents a
high-level description of that method, advice on how to use it, and results from a benchmark carried out in the period 1999-2000.
An ANSI-C reference implementation, described in Annex A, is provided.
A conformance testing procedure is also specified in Annex A to allow a user to validate the correctness of an alternative implementation of the model.
(data subject to distribution restrictions)
Supplement 23 to the P series of ITU-T Recommendations is a collection of coded
and source speech material used in characterization tests of the ITU-T 8 kbit/s codec (Recommendation ITU-T G.729).
The purpose of this collection is to provide source, pre-processed and processed speech material, and related subjective test plans and scores, for the development of new and revised ITU Recommendations relating to objective voice quality measures.
It consists of speech samples stored as 16-bit linear PCM (binary) files in low-byte-first format, and covers three different experiments.
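Headerless 16-bit, low-byte-first PCM data of this kind can be decoded or wrapped in a WAV container with the Python standard library, as in the sketch below. The sampling rate is not stated here, so it is passed as a parameter and the 8000 Hz used in the example is only an assumption. The function names and the four sample values are hypothetical.

```python
import io
import struct
import wave

def decode_samples(raw_bytes):
    """Decode low-byte-first (little-endian) signed 16-bit samples."""
    count = len(raw_bytes) // 2
    return list(struct.unpack("<%dh" % count, raw_bytes[:count * 2]))

def raw_pcm_to_wav(raw_bytes, sample_rate_hz, wav_file):
    """Wrap headerless 16-bit little-endian mono PCM in a WAV container."""
    if len(raw_bytes) % 2:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    with wave.open(wav_file, "wb") as out:
        out.setnchannels(1)          # assumption: single-channel recordings
        out.setsampwidth(2)          # 16 bits per sample
        out.setframerate(sample_rate_hz)
        out.writeframes(raw_bytes)   # WAV PCM data is also low-byte first

# Hypothetical data: four samples packed low byte first.
raw = struct.pack("<4h", 0, 1000, -1000, 32767)
print(decode_samples(raw))  # [0, 1000, -1000, 32767]

buffer = io.BytesIO()
raw_pcm_to_wav(raw, 8000, buffer)  # 8000 Hz is an assumed rate
```

To convert a real file, read it in binary mode and pass the bytes and the correct sampling rate to raw_pcm_to_wav along with an output path.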
It should be noted that the speech files contained in this database are the property of the respective test laboratories:
AT&T (USA), CNET (France), CSELT (Italy), Nortel (formerly BNR, Canada), and NTT (Japan).
While permission has been granted for them to be used to develop new and revised ITU-T Recommendations, permission for any other use must be negotiated with the owner of the data in question.