Evidence-Grounded Vision–RAG Framework for Clinically Reliable Visual Reasoning in Chest X-Ray Analysis

Authors

Keywords:

Medical Vision-Language Models, Retrieval-Augmented Reasoning, Visual Evidence Retrieval, Chest X-ray Interpretation, Clinical Decision Support

Abstract

Vision–language models have shown potential for medical image understanding tasks such as visual question answering (VQA); however, their clinical adoption is limited by diagnostic ambiguity, scarce supervision, and the risk of generating hallucinated or clinically unsafe responses. To address these challenges, this paper proposes an evidence-grounded Vision Retrieval-Augmented Generation (Vision–RAG) framework for reliable visual reasoning in chest X-ray analysis. The framework integrates visual retrieval with evidence-aware language generation to support clinically grounded reasoning without task-specific supervised training. A pretrained vision encoder retrieves semantically similar chest X-ray images and their corresponding radiology reports from the MIMIC-CXR dataset, providing external clinical evidence to guide the vision–language model. The retrieval index is built from the training split, and evaluation is performed on a held-out validation set for unbiased assessment. The system is evaluated on approximately 2,000 automatically generated clinical questions. Results demonstrate effective evidence retrieval, achieving a Recall@1 of 66.88%, while yes/no question accuracy reaches 56.8%, reflecting the inherent difficulty of unsupervised medical reasoning. Concept-level analysis shows clear separation between normal and infectious cases, with most ambiguity occurring between overlapping conditions such as pleural effusion and consolidation. Importantly, the model exhibits conservative prediction behavior with a low false-positive tendency, a property desirable for clinical safety. These findings indicate that evidence-grounded Vision–RAG offers an interpretable and reliable paradigm for medical visual reasoning in chest X-ray analysis, supporting decision-making in clinical workflows rather than replacing human expertise.
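To make the pipeline described above concrete, the following is a minimal sketch of the core Vision–RAG loop: embed a query chest X-ray with a pretrained vision encoder, retrieve the most similar training-split images by cosine similarity, attach their radiology reports as evidence for the vision–language model, and score retrieval with Recall@1. The `encode_image` and `answer_with_vlm` hooks and the evidence-prompt format are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

# Assumed interfaces (hypothetical; the paper does not specify them):
#   encode_image(path) -> np.ndarray       pretrained vision-encoder embedding
#   answer_with_vlm(image, prompt) -> str  vision-language model inference

def build_index(train_embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize training-split embeddings so a dot product equals cosine similarity."""
    norms = np.linalg.norm(train_embeddings, axis=1, keepdims=True)
    return train_embeddings / np.clip(norms, 1e-12, None)

def retrieve(query_emb: np.ndarray, index: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k training images most similar to the query."""
    q = query_emb / max(np.linalg.norm(query_emb), 1e-12)
    return np.argsort(-(index @ q))[:k]

def evidence_prompt(question: str, reports: list[str]) -> str:
    """Compose an evidence-grounded prompt from the retrieved radiology reports."""
    evidence = "\n".join(f"[Evidence {i + 1}] {r}" for i, r in enumerate(reports))
    return (f"{evidence}\n\nUsing only the image and the evidence above, "
            f"answer yes or no: {question}")

def recall_at_1(top1_ids: list[int], relevant: list[set]) -> float:
    """Fraction of queries whose top-ranked retrieved item is relevant (Recall@1)."""
    hits = sum(t in rel for t, rel in zip(top1_ids, relevant))
    return hits / len(top1_ids)
```

Cosine similarity over L2-normalized encoder embeddings is one standard reading of "semantically similar" retrieval; under this sketch, the reported Recall@1 of 66.88% would correspond to `recall_at_1` returning 0.6688 over the roughly 2,000 generated questions.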

Published

08-02-2026

How to Cite

Evidence-Grounded Vision–RAG Framework for Clinically Reliable Visual Reasoning in Chest X-Ray Analysis. (2026). Journal of Smart Algorithms and Applications (JSAA), 2(2), 49-60. https://pub.scientificirg.com/index.php/JSAA/article/view/48
