Abstract
Deep learning has improved chest X-ray (CXR) diagnosis, but poor uncertainty quantification and calibration limit clinical adoption. We present a novel Bayesian framework that integrates a hierarchical Bayesian encoder, epistemic-aleatoric uncertainty decomposition, consistency validation, and adaptive calibration. The encoder propagates uncertainty across anatomical scales; specialized networks separate epistemic from aleatoric uncertainty; an agent enforces inter-disease consistency; and a hybrid pipeline combines Platt scaling and temperature scaling. On three CXR datasets, our method reduces expected calibration error (ECE) by approximately 60% (0.017) without compromising AUC (0.858). Uncertainty analysis shows epistemic dominance in rare conditions (0.207) and aleatoric dominance in pleural disorders (0.167), highlighting dataset- and disease-specific patterns. The framework not only maintains accuracy but also effectively identifies uncertain cases, making it well suited for reliable integration into clinical workflows.
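
For readers unfamiliar with the calibration metric reported above, expected calibration error is commonly estimated by grouping predictions into probability bins and averaging the gap between mean predicted probability and observed positive rate, weighted by bin size. The following is a minimal sketch of that common estimator for a single binary finding. The function name, the equal-width binning, and the default of 10 bins are illustrative assumptions and are not taken from the framework described here, which may compute the metric differently.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Estimate ECE for one binary finding (illustrative sketch).

    probs  : predicted probabilities of the positive class, in [0, 1]
    labels : ground-truth labels, 0 or 1
    n_bins : number of equal-width probability bins (an assumed default)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Interior bin edges; each bin covers (lower, upper], with 0.0 in bin 0.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(probs, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(labels[mask].mean() - probs[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)


if __name__ == "__main__":
    # Toy data: predictions of 0.9 that are correct only half the time.
    toy_probs = [0.9, 0.9, 0.9, 0.9]
    toy_labels = [1, 1, 0, 0]
    print(expected_calibration_error(toy_probs, toy_labels))
```

In a multi-label CXR setting, one plausible use is to apply such a function per finding and average the results, though that aggregation choice is likewise an assumption here.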