|
Work item:
|
F.748.72 (ex F.MEMVT)
|
|
Subject/title:
|
Requirements and framework of multimodal generative AI enabled multi-view transformation
|
|
Status:
|
Consented on 2025-10-17 [Issued from previous study period]
|
|
Approval process:
|
AAP
|
|
Type of work item:
|
Recommendation
|
|
Version:
|
New
|
|
Equivalent number:
|
-
|
|
Timing:
|
2025-10 (Medium priority)
|
|
Liaison:
|
ITU-T SG20
|
|
Supporting members:
|
Intel Corporation, China Telecommunications Corporation, China Mobile Communications Co. Ltd., State Grid Corporation of China, Beijing, ZTE Corporation
|
|
Summary:
|
Multimodal generative AI can handle different modalities (e.g., images, videos, laser points, maps) and address problems of traditional AI-based multi-view transformation (MVT) systems, such as the inability to produce obstructed or blocked views and information, and the lack of prior information needed to comprehend spatial relations. This Recommendation specifies the framework, use cases and requirements of a multimodal generative AI enabled multi-view transformation (MEMVT) system, including information compensation, multi-view fusion and completion, multimodal temporal fusion and application functions. It is intended to augment the generation capability of multi-view transformation systems, thereby enhancing their precision and robustness and improving transformation quality and application effectiveness.
|
|
Comment:
|
-
|
|
Reference(s):
|
|
|
Historic references:
|
|
Contact(s):
|
|
ITU-T A.5 justification(s):
|
|
|
|
First registration in the WP:
2024-06-20 17:12:51
|
|
Last update:
2025-11-07 16:07:59
|