Abstract
Identification of minimal residual disease (MRD) is important in assessing the prognosis of acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Current best clinical practice relies heavily on flow cytometry (FC) examination. However, FC diagnosis requires trained physicians to perform lengthy manual interpretation of the high-dimensional FC measurements of each specimen. The difficulty of handling idiosyncrasies between interpreters, together with the time-consuming diagnostic process, has become one of the major bottlenecks in advancing the treatment of hematological diseases. In this work, we develop an automatic MRD classification algorithm (AML, MDS, normal) that learns a deep phenotype representation from a large cohort of retrospective clinical data with over 2000 real patients’ FC samples. We propose to learn a cytometric deep embedding through a cell-level autoencoder combined with specimen-level latent Fisher-scoring vectorization. Our method achieves an average AUC of 0.943 across four different hematological malignancy classification tasks, and our analysis further reveals that only half of the FC markers are sufficient to obtain this high recognition accuracy.