A Robust Two-Stage Retrieval-Augmented Vision-Language Framework for Knowledge-Intensive Multimodal Reasoning and Alignment. (2026). Computational Discovery and Intelligent Systems (CDIS), 2(2), 42-52. https://pub.scientificirg.com/index.php/CDIS/article/view/40