|
Work item:
|
F.CETD
|
|
Subject/title:
|
Requirements for the Construction and Evaluation of Fine-Tuning Datasets for the Training of Domain-specific Models
|
|
Status:
|
Under study
|
|
Approval process:
|
AAP
|
|
Type of work item:
|
Recommendation
|
|
Version:
|
New
|
|
Equivalent number:
|
-
|
|
Timing:
|
2027-08 (Medium priority)
|
|
Liaison:
|
ISO/IEC JTC1 SC42, ITU-T SG13, SG17, SG20
|
|
Supporting members:
|
China Telecommunications Corporation, China Mobile Communications Corporation, China Academy of Information and Communications Technology (CAICT), Beijing University of Posts and Telecommunications
|
|
Summary:
|
With the rapid advancement of artificial intelligence, foundation models have emerged as a cornerstone of modern AI deployment. While their general-purpose architecture enables broad applicability, their effective adaptation to specific vertical domains critically depends on the availability of high-quality, domain-relevant fine-tuning datasets. Without such data, foundation models may fail to capture domain-specific semantics, exhibit poor task performance, or generate unreliable outputs, thereby limiting their practical utility.
Currently, there is a lack of standardized methodologies for constructing and evaluating fine-tuning datasets tailored to domain-specific models. This gap results in inconsistent data quality, limited reproducibility, inefficient model adaptation, and difficulties in cross-organizational collaboration and benchmarking. To address these issues, this proposal aims to develop the international standard “Requirements for the Construction and Evaluation of Fine-Tuning Datasets for the Training of Domain-specific Models.”
The standard will establish a unified framework governing the entire dataset lifecycle (data sourcing, preprocessing, annotation, partitioning, quality assessment, and documentation), designed to meet the data requirements of domain-specific model adaptation. It will define technical and procedural requirements to ensure datasets are representative, consistent, secure, and compliant, and it will provide objective evaluation criteria for assessing whether a dataset is fit for fine-tuning.
By doing so, the standard seeks to enhance the reliability, efficiency, and transparency of adapting foundation models to specific domains. It will reduce redundant efforts in dataset development, lower barriers to model deployment, improve interoperability across systems and organizations, and foster trust in AI applications through accountable data practices. Ultimately, this standard will support the responsible and scalable integration of domain-specific models into specialized fields, accelerating AI-driven innovation and strengthening the global competitiveness of industry and research ecosystems.
|
|
Comment:
|
-
|
|
Reference(s):
|
|
|
Historic references:
|
|
Contact(s):
|
|
| ITU-T A.5 justification(s): |
|
|
|
|
First registration in the WP:
2025-11-17 14:31:04
|
|
Last update:
2025-11-25 17:19:13
|
|