Page 163 - Kaleidoscope Academic Conference Proceedings 2024
Innovation and Digital Transformation for a Sustainable World
Table 2 (continued)
User query: Can I take ibuprofen while on lisinopril for my blood pressure?
System response: Ibuprofen is a nonsteroidal anti-inflammatory drug (NSAID) used to treat pain and inflammation, while lisinopril is an ACE inhibitor used to treat hypertension and other cardiovascular conditions. Taking ibuprofen while on lisinopril can potentially decrease the effectiveness of lisinopril and increase your risk of kidney problems. If you need pain relief, acetaminophen is generally considered a safer option. However, it's always best to consult with your doctor or pharmacist before taking any new medications to avoid potential drug interactions. They can recommend the most appropriate treatment options based on your individual health needs and medical history.

Table 2 illustrates two representative user interactions, showing how the RAG system incorporates retrieved knowledge to personalize and strengthen its responses compared to a baseline without augmentation. For example, when the user asks about lifestyle changes for hypertension, the system draws on authoritative guidelines such as DASH to suggest tailored diet and exercise tips. The drug interaction query triggers a safety warning and a recommendation to consult a doctor, based on structured information from a medical database. These examples highlight how our framework enables more informed, actionable, and context-aware health advice by dynamically integrating relevant domain knowledge into the generative process. The personalized outputs also establish a meaningful user dialogue, while the retrieved facts help maintain clinical validity.

5. DISCUSSION

Our results demonstrate the potential of generative AI and knowledge retrieval to enable personalized digital health services. By combining the strengths of large language models, which can engage in fluent, contextual interactions, with curated health knowledge bases, which provide verified, domain-specific information, our proposed system can provide users with relevant, reliable, and actionable health support.

The automated evaluation results suggest that our system can generate high-quality, accurate responses to user health queries. The low perplexity and high BLEU and ROUGE scores indicate that the generated text is fluent, coherent, and aligned with human-written references. The factual accuracy of 92% is particularly encouraging, as it shows that the system's outputs are grounded in verified health information. This is a critical consideration for any AI system deployed in the health domain, where inaccurate or misleading information could have serious consequences.

The user study results further validate the system's utility and usability. Both lay users and healthcare professionals reported high satisfaction with the relevance, usefulness, and clarity of the generated responses. The positive ratings from medical experts also suggest that the system's outputs are clinically valid and complete. These findings underscore the potential of our approach to bridge the gap between general-purpose language models and domain-specific health applications.

The representative examples in Table 2 illustrate the system's ability to provide personalized, actionable recommendations based on users' specific health contexts and needs. By leveraging knowledge retrieval, the system can tailor its outputs to individual users while maintaining alignment with established clinical guidelines and best practices. This level of personalization is critical for engaging users and promoting behavior change, as generic, one-size-fits-all health advice is often less effective.

However, our work also highlights important limitations and challenges that need to be addressed. One key issue is the potential for biased or inconsistent outputs, particularly when dealing with complex or ambiguous health queries. While our retrieval-augmented generation approach helps mitigate this risk by grounding outputs in verified knowledge, there may still be cases where the model generates inappropriate or misleading responses. Developing more robust methods for controlling and aligning model outputs, such as adversarial training, value learning, or human-in-the-loop oversight, is an important direction for future work [42], [43].

Another challenge is the need to continuously monitor and update the system's knowledge bases to keep pace with the rapidly evolving health landscape. As new research findings, treatment guidelines, and public health recommendations emerge, it is critical that the system's underlying knowledge is updated accordingly. This requires ongoing curation and maintenance efforts, as well as mechanisms for detecting and mitigating potential inconsistencies or conflicts between different knowledge sources.

Privacy and security considerations are also paramount when deploying AI systems in the health domain. While our approach does not directly use or store personal health data for model training or inference, there may still be risks of sensitive information being inadvertently revealed through user interactions. Techniques for privacy-preserving AI, such as federated learning, differential privacy, and homomorphic encryption, could help mitigate these risks and ensure compliance with data protection regulations [44].

It is important to recognize that our system is intended to supplement, rather than replace, human healthcare providers. While generative models can provide valuable information and support, they should not be used for definitive diagnosis, treatment planning, or emergency response. Ensuring appropriate use and setting realistic expectations for both users and providers are critical for the safe and effective deployment of AI in healthcare.

There are also vital considerations around responsible development practices, model interpretability, and stakeholder involvement that require ongoing multidisciplinary collaboration to address. Domain experts such as clinicians, patient advocates, ethicists, and regulators should be engaged throughout the research and development lifecycle to align system capabilities with real-world needs, values, and constraints. This includes proactive risk assessment and mitigation strategies around safety, fairness, transparency, and accountability. Policymakers and health system leaders will also need to establish governance