Health Analytics
Leveraging Foundation Models for Clinically Instructed Tumor Image Synthesis in Renal Cell Carcinoma
Abstract
Renal cell carcinoma tumor images serve critical functions in various fields. Their use is constrained in scenarios that require specific tumor imaging, which is often difficult to obtain because of rarity or privacy concerns. While general tumor synthesis has been successful as a data acquisition solution, specific domains demand precise control and accurate depiction of tumor characteristics. Our study addresses these limitations by integrating RENAL score guidelines into the synthesis process, enabling clinically instructed tumor synthesis tailored to specific medical demands. We introduce a generative framework that first decodes a segmentation mask from the textual outputs of a multimodal large language model, given clinical descriptions drawn from RENAL score descriptors. The decoded mask is then integrated into a latent diffusion model, transforming a healthy volume into a tumor-bearing one. Our results demonstrate a high degree of alignment between the textual queries and the generated tumors, and the synthetic tumors closely resemble those from other synthetic and real-world sources.
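As an illustration of what a RENAL score descriptor might look like when turned into a textual query, the following is a minimal sketch and not the authors' implementation. The point thresholds follow the RENAL nephrometry scheme as it is commonly stated (R: maximal diameter, E: exophytic/endophytic, N: nearness to the collecting system or sinus, A: anterior/posterior suffix, L: location relative to the polar lines). The class name `RenalDescriptor`, its field names, and the prompt wording in `to_prompt` are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenalDescriptor:
    """Hypothetical container for the five RENAL components."""
    diameter_cm: float         # R: maximal tumor diameter in cm
    exophytic_fraction: float  # E: fraction of the tumor outside the kidney (0..1)
    nearness_mm: float         # N: distance to collecting system or sinus in mm
    face: str                  # A: 'a' (anterior), 'p' (posterior), 'x' (neither)
    location_points: int       # L: 1..3, assigned by a reader from the polar lines

    def __post_init__(self):
        if self.face not in ("a", "p", "x"):
            raise ValueError("face must be 'a', 'p', or 'x'")
        if self.location_points not in (1, 2, 3):
            raise ValueError("location_points must be 1, 2, or 3")
        if not 0.0 <= self.exophytic_fraction <= 1.0:
            raise ValueError("exophytic_fraction must lie in [0, 1]")


def radius_points(diameter_cm: float) -> int:
    # <= 4 cm: 1 point; > 4 and < 7 cm: 2 points; >= 7 cm: 3 points
    return 1 if diameter_cm <= 4 else (2 if diameter_cm < 7 else 3)


def exophytic_points(fraction: float) -> int:
    # >= 50% exophytic: 1 point; < 50%: 2 points; entirely endophytic: 3 points
    if fraction >= 0.5:
        return 1
    return 2 if fraction > 0 else 3


def nearness_points(nearness_mm: float) -> int:
    # >= 7 mm: 1 point; > 4 and < 7 mm: 2 points; <= 4 mm: 3 points
    return 1 if nearness_mm >= 7 else (2 if nearness_mm > 4 else 3)


def total_points(d: RenalDescriptor) -> int:
    # The A component contributes a suffix only, not points.
    return (radius_points(d.diameter_cm)
            + exophytic_points(d.exophytic_fraction)
            + nearness_points(d.nearness_mm)
            + d.location_points)


def complexity(total: int) -> str:
    # 4-6: low, 7-9: moderate, 10-12: high
    if total <= 6:
        return "low"
    return "moderate" if total <= 9 else "high"


def to_prompt(d: RenalDescriptor) -> str:
    """Build a hypothetical textual query; the wording is an assumption."""
    total = total_points(d)
    return (
        f"Synthesize a renal tumor with maximal diameter {d.diameter_cm:.1f} cm, "
        f"{d.exophytic_fraction:.0%} exophytic, "
        f"{d.nearness_mm:.1f} mm from the collecting system, "
        f"RENAL score {total}{d.face} ({complexity(total)} complexity)."
    )


example = RenalDescriptor(diameter_cm=5.0, exophytic_fraction=0.3,
                          nearness_mm=3.0, face="p", location_points=2)
print(to_prompt(example))
```

In a pipeline like the one described above, a string of this kind would be the textual input to the multimodal model; how the actual framework phrases or encodes its descriptors is not specified here.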
Figures
Overview of the clinically instructed tumor synthesis framework. Starting with healthy CT images and corresponding RENAL score descriptors, a multimodal LLM generates text outputs. The last-layer embeddings for the segmentation token are integrated with the image embeddings, facilitating a diffusion process guided by the textual information.
Keywords
renal cell carcinoma | tumor synthesis | large language models | renal nephrometry score
Authors
Publication Date
2025/04/14
Conference
IEEE ISBI 2025
DOI
10.1109/ISBI60581.2025.10980760