Abstract
Objectives: Flow cytometry (FC) is critical for the diagnosis and monitoring of hematologic malignancies. Machine learning (ML) methods rapidly classify multidimensional data and have the potential to markedly improve the efficiency of FC data analysis. We aimed to build a model to classify acute leukemias, including acute promyelocytic leukemia (APL), and distinguish them from nonneoplastic cytopenias. We also sought to illustrate a method to identify key FC parameters that contribute to the model’s performance.
Methods: Using data from 531 patients who underwent evaluation for cytopenias and/or acute leukemia, we developed an ML model to rapidly distinguish among APL, acute myeloid leukemia/not APL, acute lymphoblastic leukemia, and nonneoplastic cytopenias. Unsupervised learning using Gaussian mixture model and Fisher kernel methods was applied to FC listmode data, followed by supervised support vector machine classification.
Results: High accuracy (ACC, 94.2%; area under the curve [AUC], 99.5%) was achieved based on the 37-parameter FC panel. Using only 3 parameters, however, yielded similar performance (ACC, 91.7%; AUC, 98.3%) and highlighted the significant contribution of light scatter properties.
Conclusions: Our findings underscore the potential for ML to automatically identify and prioritize FC specimens that have critical results, including APL and other acute leukemias.
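The encoding step named in the Methods (Gaussian mixture model followed by a Fisher kernel representation) can be illustrated with a minimal sketch. The abstract does not give the number of mixture components, the covariance structure, or the normalization, so everything below is an assumption: a diagonal-covariance mixture with already-fitted parameters, gradients taken with respect to means and variances only, and toy two-parameter events. The function names are hypothetical. The point it shows is that specimens with different event counts map to vectors of the same length, which is what allows a support vector machine to be trained on them afterwards.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Soft assignment of one event to each mixture component."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector(events, weights, means, variances):
    """Encode a variable-length list of events as a fixed-length vector.

    Returns 2 * K * D values: the gradients with respect to the K
    component means, followed by the gradients with respect to the
    K component variances (D values each).
    """
    n = len(events)
    k = len(weights)
    d = len(means[0])
    grad_mu = [[0.0] * d for _ in range(k)]
    grad_var = [[0.0] * d for _ in range(k)]
    for x in events:
        gamma = posteriors(x, weights, means, variances)
        for j in range(k):
            for i in range(d):
                # Standardized deviation of this event from component j
                z = (x[i] - means[j][i]) / math.sqrt(variances[j][i])
                grad_mu[j][i] += gamma[j] * z
                grad_var[j][i] += gamma[j] * (z * z - 1.0)
    fv = []
    for j in range(k):
        fv.extend(g / (n * math.sqrt(weights[j])) for g in grad_mu[j])
    for j in range(k):
        fv.extend(g / (n * math.sqrt(2.0 * weights[j])) for g in grad_var[j])
    return fv


# Toy mixture: 2 components over 2 hypothetical parameters
# (illustrative values only, not taken from the study).
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]

# Two "specimens" with different numbers of events.
specimen_a = [[0.1, -0.2], [2.9, 3.1], [0.3, 0.4]]
specimen_b = [[3.2, 2.8], [2.7, 3.3]]

fv_a = fisher_vector(specimen_a, weights, means, variances)
fv_b = fisher_vector(specimen_b, weights, means, variances)
print(len(fv_a), len(fv_b))
```

In practice the mixture would be fitted to pooled listmode events and the resulting vectors passed to a classifier, for example with scikit-learn's `GaussianMixture` and `SVC`. Whether the study used those tools is not stated in the abstract.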