ITU-T Work Programme

[2022-2024] : [SG16] : [Q5/16]

[Declared patent(s)]

Work item:

F.MEMVT

Subject/title:

Requirements and framework of multimodal generative AI enabled multi-view transformation

Status:

[Carried to next study period]

Approval process:

AAP

Type of work item:

Recommendation

Version:

New

Equivalent number:

Timing:

Liaison:

ITU-T SG20

Supporting members:

Intel Corporation, China Telecommunications Corporation, China Mobile Communications Co. Ltd., State Grid Corporation of China, Beijing, ZTE Corporation

Summary:

Traditional AI-based multi-view transformation (MVT) systems are widely used in industrial manufacturing, port cargo tracking, traffic awareness, and similar applications. Multimodal generative AI has now reached a level of maturity suitable for standardization. It can handle different modalities (e.g., images, videos, laser points, maps) and address the shortcomings of traditional AI-based MVT systems, such as the inability to produce obstructed or blocked views and information, and the lack of the prior information needed to comprehend spatial relations. This Recommendation specifies the functional framework, use cases and requirements of a multimodal generative AI enabled MVT (MEMVT) system, including information compensation, multi-view fusion and completion, multimodal temporal fusion, visualization, prediction, tracking and decision-making assistance, target analysis enhancement, etc. The objective of this Recommendation is to augment the generation capability of MVT systems, thereby enhancing their precision and robustness and improving transformation quality and application effectiveness.

Comment:

Reference(s):

[RGM-Q5-DOC43-R1 (2024-08)]

Historic references:

[SG16-TD235-R1-2/PLEN (2024-04) (A.1 TD)]