AI for Good Innovate for Impact

                      Data Type, Structure, and Format:

                      •    Primary Data: The core dataset consists of millions of digital pathology slide images
                           (Whole Slide Images - WSIs). Initially, these images are largely unlabeled.
                      •    Supporting Data: Tens of thousands of image-text captions, meticulously curated by
                           medical experts (pathologists), linking specific visual features or regions within the
                           pathology images to descriptive text. Thousands of image-based Chain-of-Thought (CoT)
                           data instances. This structured data pairs pathology images with step-by-step reasoning
                           text, designed to train the model in diagnostic thought processes.
                      •    Structure & Format: The system leverages self-supervised learning algorithms on the vast
                           corpus of pathology images to extract robust visual representations without requiring
                           extensive initial manual annotation. The curated captions and CoT data provide structured,
                           multimodal input crucial for aligning visual and textual features and enabling complex
                           reasoning capabilities. Data processing, including WSI tiling/patching and anonymization,
                           was managed by ModelEngine, a one-stop AI toolchain provided by the DCS AI Solution,
                           which handles standard WSI formats and generates structured data formats suitable for
                           model training (e.g., image patches linked to text via identifiers in formats such as
                           JSON or CSV). 
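The tiling-and-linking step described above can be sketched in a few lines. This is a minimal illustration, not the ModelEngine implementation; the patch-identifier scheme and the record fields (`patch_id`, `slide_id`, `caption`) are assumptions standing in for whatever schema the toolchain actually emits.

```python
import json

def tile_wsi(slide_id, width, height, patch_size=512):
    """Split a WSI's pixel grid into non-overlapping patch records.

    Each patch gets a stable identifier so downstream JSON/CSV rows
    can link it to expert captions or CoT text (hypothetical schema).
    """
    patches = []
    for y in range(0, height - patch_size + 1, patch_size):
        for x in range(0, width - patch_size + 1, patch_size):
            patches.append({
                "patch_id": f"{slide_id}_x{x}_y{y}",
                "slide_id": slide_id,
                "x": x, "y": y, "size": patch_size,
            })
    return patches

def link_captions(patches, captions):
    """Join patch records to expert captions by patch_id."""
    by_id = {p["patch_id"]: p for p in patches}
    return [
        {**by_id[c["patch_id"]], "caption": c["text"]}
        for c in captions if c["patch_id"] in by_id
    ]

patches = tile_wsi("slide001", width=1024, height=1024)
captions = [{"patch_id": "slide001_x0_y0", "text": "dense tumor nests"}]
records = link_captions(patches, captions)
training_row = json.dumps(records[0])  # one structured training record
```

In practice the coordinates would index into a pyramidal WSI file rather than a plain pixel grid, but the identifier-based join between image patches and text is the point of the sketch.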

                      Image Labelling System: A semi-automated image labelling system was employed, facilitated
                      by ModelEngine. This system assists in data engineering tasks, including efficient image
                      annotation. While self-supervised learning reduced the dependency on exhaustive pixel-level
                      labelling for initial feature extraction, expert-driven annotation remained critical for creating
                      the high-quality image-text captions and the image-based CoT data used for fine-tuning and
                      aligning modalities. 
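A semi-automated workflow of this kind typically routes only uncertain proposals to the expert. The sketch below is a generic illustration of that pattern, not ModelEngine's actual logic; the `propose` and `review` callables and the confidence threshold are hypothetical stand-ins.

```python
def semi_automated_labelling(items, propose, review, confidence_threshold=0.9):
    """Label items with an automatic proposer; escalate only
    low-confidence proposals to an expert reviewer (hypothetical workflow)."""
    labelled = []
    for item in items:
        label, confidence = propose(item)
        if confidence < confidence_threshold:
            # Expert-in-the-loop: pathologist confirms or corrects the label.
            label = review(item, label)
        labelled.append({"item": item, "label": label})
    return labelled

# Stand-in proposer/reviewer purely for illustration:
propose = lambda item: ("tumor", 0.95) if item == "a" else ("benign", 0.6)
review = lambda item, suggestion: "benign (expert)"
out = semi_automated_labelling(["a", "b"], propose, review)
```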

                      Knowledge Transfer and Update Format:  The system features a modular multimodal
                      architecture comprising distinct components: a Visual Projector, an Image-Language Projector,
                      and a Deep Reasoning Language Model module. Each module has well-defined input and
                      output data format standards, allowing them to be trained independently (decoupled training).
                      This modularity provides a clear format for knowledge transfer and updates. When new
                      research findings, treatment mechanisms, or diagnostic criteria emerge, the knowledge base
                      can be updated flexibly in two ways: 1) Incremental Training: Retraining specific modules
                      with supplementary data incorporating the new knowledge (e.g., updating the language
                      model with new medical texts, or the image-language projector with new image-caption
                      pairs reflecting new findings). 2) Plugin Replacement: Replacing existing modules with newer
                      versions incorporating updated algorithms or knowledge.
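The decoupled-module design above can be sketched as a pipeline whose stages communicate only through their input/output contracts, so any stage can be retrained or swapped. This is a minimal sketch; the class names and the string-based stand-in modules are assumptions, not the system's actual interfaces.

```python
class Module:
    """A pipeline stage with a fixed input/output contract (hypothetical)."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def __call__(self, x):
        return self.fn(x)

class ModularPipeline:
    """Visual Projector -> Image-Language Projector -> Reasoning LM,
    chained only through data formats so each module is independently
    trainable and replaceable."""
    def __init__(self, **modules):
        self.modules = modules
    def replace(self, name, module):
        # Plugin replacement: drop in a newer module with the same contract.
        self.modules[name] = module
    def run(self, image):
        x = self.modules["visual_projector"](image)
        x = self.modules["image_language_projector"](x)
        return self.modules["reasoning_lm"](x)

pipeline = ModularPipeline(
    visual_projector=Module("vp", lambda img: f"features({img})"),
    image_language_projector=Module("ilp", lambda f: f"tokens({f})"),
    reasoning_lm=Module("lm-v1", lambda t: f"diagnosis from {t}"),
)
v1 = pipeline.run("wsi_patch")

# Knowledge update via plugin replacement: swap in a retrained reasoning LM.
pipeline.replace("reasoning_lm", Module("lm-v2", lambda t: f"updated diagnosis from {t}"))
v2 = pipeline.run("wsi_patch")
```

Because the contract between stages is fixed, incremental training (update one module's weights) and plugin replacement (swap the module object) are interchangeable from the pipeline's point of view.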

                      Explanation of Prognosis/Diagnosis to Experts: The integration of Chain-of-Thought (CoT)
                      technology is key to providing explainability. The model explicitly generates and presents
                      the reasoning steps leading to its diagnostic conclusions, mirroring a pathologist's thought
                      process. This transparent reasoning pathway is presented to expert pathologists.
                      Furthermore, the system supports multi-turn, in-depth conversational interaction, allowing
                      pathologists to query the model and ask follow-up questions about specific reasoning steps
                      or image features, significantly enhancing the interpretability and trustworthiness of the
                      AI's suggestions. (Note: The source primarily mentions diagnosis and reasoning; explanation
                      related to prognosis would follow similar principles if trained on relevant data).
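The explain-then-follow-up interaction described above can be sketched as a session object that exposes its numbered reasoning steps for later querying. This is an illustrative interface only; the class, method names, and the example steps are hypothetical, not the deployed system's API.

```python
class CoTSession:
    """Holds a diagnostic conclusion plus its numbered reasoning steps so
    an expert can query any step in follow-up turns (hypothetical interface)."""
    def __init__(self, steps, conclusion):
        self.steps = steps
        self.conclusion = conclusion

    def explain(self):
        # Present the full chain of thought, mirroring a pathologist's workflow.
        lines = [f"Step {i}: {s}" for i, s in enumerate(self.steps, 1)]
        return "\n".join(lines + [f"Conclusion: {self.conclusion}"])

    def follow_up(self, step_number):
        # Multi-turn interaction: surface the cited step for deeper discussion.
        return self.steps[step_number - 1]

session = CoTSession(
    steps=[
        "Identify tissue architecture",
        "Assess nuclear atypia",
        "Match pattern to subtype",
    ],
    conclusion="invasive carcinoma, subtype X",
)
report = session.explain()
detail = session.follow_up(2)  # pathologist drills into step 2
```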

                      Feedback and Learnings from Ruijin Hospital Deployment: The RuiPath model is currently
                      deployed for pilot testing at Ruijin Hospital across 11 subspecialties, including breast, prostate,
                      and thyroid pathology. Initial feedback based on performance metrics indicates promising
                      results, with the system achieving over 90% accuracy in common tasks such as cancer subtype



