Question 12/12 - Performance evaluation of services based on speech technology [Call for participation]
(New Question)
Motivation
Speech technology devices, such as automatic speech recognition, speaker verification, speech synthesis, or spoken dialogue systems, are increasingly used to offer automatic voice-enabled services in wireline and mobile networks. In the past, ITU-T SG 12 has worked on subjective evaluation methods for services relying on such devices, leading to a Recommendation on the subjective evaluation of synthesised speech (Rec. P.85) and one on the subjective evaluation of services based on spoken dialogue systems (Rec. P.851). Subjective evaluation methods are still needed which quantify the most important aspects of service quality and usability, taking human factors into account. In addition, the performance of the individual speech technology devices used in a service has to be quantified, and their contribution to the overall quality and usability of the service has to be measured. This performance strongly depends on the transmission impairments resulting from the network and the terminal equipment, and on the acoustic environment in which the service is accessed (e.g. a moving car).
The aim of this Question is to define assessment methods for individual speech technology devices, and to relate their performance to subjective quality judgements obtained using the methods defined in Recommendations P.85 and P.851. This work is expected to result in new Recommendations on parameters which are related to the quality of such services, and on models which are able to predict overall service quality from these parameters. In particular, the impact of transmission impairments (resulting from the network, the terminal equipment and the acoustic situation) on the performance of speech technology devices and on overall service quality will be studied.
The following major Recommendations, in force at the time of approval of this Question, fall under its responsibility: P.85, P.851.
Question
Study items to be considered include, but are not limited to:
- Which parameters can be used to reliably quantify the performance of speech technology devices in the context of voice-enabled telephone services? How can these parameters be measured?
- Is it possible to determine the quality of synthesised speech in an instrumental way, e.g. using the objective methods developed in Q.9/12 (P.563)?
- What is the influence of transmission impairments encountered in modern networks (non-linear codec distortions, time-variant channel characteristics, circuit and comfort noise, handset/headset/HFT characteristics) and in acoustically adverse conditions (e.g. in a moving car) on the performance of speech and speaker recognition devices, and on the quality of synthesised speech?
- How can this influence be described and predicted? Are objective methods and network planning models recommended by the ITU-T able to predict the influence of transmission impairments on recognition performance and synthesised speech quality as well? Are the requirements defined for ensuring a sufficiently high speech communication quality also sufficient to guarantee high recognition accuracy?
- Which quality aspects are important for the users of such services? How can these aspects be quantified with subjective evaluation methods? To what extent is the user of the service distracted from other tasks (e.g. from driving)?
- How are the subjective quality aspects of the overall service related to the performance of the individual speech technology devices? Is it possible to predict service quality on the basis of measurable parameters?
Tasks
Tasks include, but are not limited to:
- Maintenance and enhancement of the Recommendations defining subjective evaluation methods for synthesised speech (P.85) and for services based on spoken dialogue systems (P.851)
- Set-up of a new Recommendation defining parameters which describe the performance of speech technology devices; first draft expected for 2005
- Set-up of a new Recommendation defining quality prediction models for voice-enabled services; first draft expected for 2006
- Potential set-up of a new Recommendation for subjective usability evaluation of voice-enabled services
It is anticipated that several new Recommendations will be produced in the 2005–2008 Study Period.
Relationships
Recommendations: P.85, P.851, P.800, P.830, Handbook STP, G.107, P.862
Questions: 4/12, 7/12, 8/12, 9/12, 4/2, E/16
Study Groups: SG 2, SG 16
Standardisation bodies: ETSI STQ-AURORA, ETSI HF, NIST, 3GPP, 3GPP2