Leveraging Foundation Models for Clinically Instructed Tumor Image Synthesis in Renal Cell Carcinoma｜BIIC Lab - NTHU

IEEE Xplore

Abstract

Renal cell carcinoma tumor images support critical functions in various fields. Their application is constrained, however, in scenarios that require specific tumor imaging, which is often difficult to obtain because of rarity or privacy concerns. While general tumor synthesis has been successful as a data acquisition solution, specific domains demand precise control and accurate depiction of tumor characteristics. Our study addresses these limitations by integrating RENAL score guidelines into the synthesis process, enabling clinically instructed tumor synthesis tailored to specific medical demands. In this work, we introduce a generative framework that first decodes a segmentation mask from the textual outputs of a multimodal large language model, using clinical descriptions from RENAL score descriptors. The decoded mask is then fed into a latent diffusion model, which transforms a healthy volume into a tumor-bearing one. Our results demonstrate a high degree of alignment between the textual queries and the generated tumors, and the synthetic tumors closely replicate those found in other synthetic and real-world sources.
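
As a rough illustration of what a RENAL score descriptor could look like as a text query, the sketch below scores three RENAL components from raw measurements using the standard nephrometry thresholds and composes a short clinical description. The `RenalFindings` class, the function names, and the wording of the generated text are hypothetical and are not taken from the paper; the polar-line component (L) is passed in as an already-graded value because it depends on a reader's assessment of the images.

```python
from dataclasses import dataclass


@dataclass
class RenalFindings:
    """Hypothetical container for the measurements behind a RENAL score."""
    diameter_cm: float         # R: maximal tumor diameter
    exophytic_fraction: float  # E: fraction of the tumor outside the renal contour (0..1)
    nearness_mm: float         # N: distance to the collecting system or sinus
    face: str                  # A: "a" (anterior), "p" (posterior), "x" (neither); no points
    polar_points: int          # L: 1, 2, or 3, graded by the reader against the polar lines


def radius_points(diameter_cm: float) -> int:
    # R: <= 4 cm scores 1, > 4 and < 7 cm scores 2, >= 7 cm scores 3
    if diameter_cm <= 4:
        return 1
    return 2 if diameter_cm < 7 else 3


def exophytic_points(fraction: float) -> int:
    # E: >= 50% exophytic scores 1, < 50% scores 2, entirely endophytic scores 3
    if fraction >= 0.5:
        return 1
    return 2 if fraction > 0 else 3


def nearness_points(distance_mm: float) -> int:
    # N: >= 7 mm scores 1, > 4 and < 7 mm scores 2, <= 4 mm scores 3
    if distance_mm >= 7:
        return 1
    return 2 if distance_mm > 4 else 3


def renal_descriptor(f: RenalFindings) -> tuple[int, str, str]:
    """Return (total score, complexity class, illustrative text query)."""
    if f.face not in ("a", "p", "x") or f.polar_points not in (1, 2, 3):
        raise ValueError("face must be a/p/x and polar_points must be 1, 2, or 3")
    r = radius_points(f.diameter_cm)
    e = exophytic_points(f.exophytic_fraction)
    n = nearness_points(f.nearness_mm)
    total = r + e + n + f.polar_points
    # Complexity bands: 4-6 low, 7-9 moderate, 10-12 high
    complexity = "low" if total <= 6 else "moderate" if total <= 9 else "high"
    text = (
        f"Renal tumor with maximal diameter {f.diameter_cm:.1f} cm (R={r}), "
        f"{f.exophytic_fraction:.0%} exophytic (E={e}), "
        f"{f.nearness_mm:.1f} mm from the collecting system (N={n}), "
        f"polar line score L={f.polar_points}; "
        f"RENAL score {total}{f.face}, {complexity} complexity."
    )
    return total, complexity, text


example = RenalFindings(diameter_cm=5.2, exophytic_fraction=0.3,
                        nearness_mm=3.0, face="p", polar_points=2)
total, complexity, query = renal_descriptor(example)
print(query)
```

A string of this kind would stand in for the clinical description that the framework passes to the multimodal large language model; the actual prompt format used by the authors is not specified here.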

Figures

Overview of the clinically instructed tumor synthesis framework. Starting from healthy CT images and corresponding RENAL score descriptors, a multimodal LLM generates text outputs. The last-layer embeddings for the segmentation token are integrated with the image embeddings, facilitating a diffusion process tailored by the textual information.

Keywords

renal cell carcinoma ｜ tumor synthesis ｜ large language models ｜ renal nephrometry score

Authors

Publication Date

2025/04/14

Conference

IEEE ISBI 2025

DOI

10.1109/ISBI60581.2025.10980760
