A Chunking-for-Pooling Strategy for Cytometric Representation Learning for Automatic Hematologic Malignancy Classification｜BIIC Lab - NTHU

IEEE Xplore

Abstract

Differentiating types of hematologic malignancies is vital for determining therapeutic strategies for newly diagnosed patients. Flow cytometry (FC) can be used as a diagnostic indicator by measuring multi-parameter fluorescent markers on thousands of antibody-bound cells, but the manual interpretation of large-scale flow cytometry data has long been a time-consuming and complicated task for hematologists and laboratory professionals. Past studies have led to the development of representation learning algorithms to perform sample-level automatic classification. In this work, we propose a chunking-for-pooling strategy to include large-scale FC data in a supervised deep representation learning procedure for automatic hematologic malignancy classification. The use of a discriminatively trained representation learning strategy and the fixed-size chunking and pooling design are key components of this framework. It improves the discriminative power of the FC sample-level embedding and simultaneously addresses the robustness issue caused by the inevitable use of down-sampling in conventional distribution-based approaches for deriving FC representations. We evaluated our framework on two datasets. Our framework outperformed the baseline methods and achieved 92.3% unweighted average recall (UAR) for four-class recognition on the UPMC dataset and 85.0% UAR for five-class recognition on the hema.to dataset. We further compared the robustness of our proposed framework with that of the traditional down-sampling approach. Analysis of the effects of the chunk size and the error cases revealed further insights about different hematologic malignancy characteristics in the FC data.
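The reported metric, unweighted average recall (UAR), is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The sketch below illustrates the metric on toy labels; the function name and the example data are hypothetical and are not taken from the paper's evaluation code.

```python
import numpy as np

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall; every class has equal weight."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for label in np.unique(y_true):
        mask = y_true == label              # samples that truly belong to this class
        recalls.append(np.mean(y_pred[mask] == label))
    return float(np.mean(recalls))

# Toy, imbalanced example (hypothetical data): 8 samples of class 0, 2 of class 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]                   # one of the two class-1 samples is missed
uar = unweighted_average_recall(y_true, y_pred)
accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
print(uar, accuracy)                        # 0.75 0.9
```

On imbalanced data such as this, plain accuracy (0.9) hides the poor recall on the minority class, whereas UAR (0.75) reflects it.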

Figures

Flow cytometry data with fluorescence-antibody combinations are measured as a data matrix, which is then split into chunks.
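The chunk-then-pool idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the matrix layout (cells as rows, markers as columns), the random shuffling, the dropping of leftover cells, the mean/standard-deviation placeholder encoder, and mean pooling are all assumptions made here. In the paper, the chunk-level representation is learned with supervision.

```python
import numpy as np

def chunk_and_pool(cells, chunk_size, encode=None, rng=None):
    """Split an FC matrix (n_cells x n_markers) into fixed-size chunks,
    embed each chunk, and pool the chunk embeddings into one sample vector."""
    rng = np.random.default_rng(rng)
    n_cells, n_markers = cells.shape
    n_chunks = n_cells // chunk_size
    if n_chunks == 0:
        raise ValueError("sample has fewer cells than chunk_size")

    # Assumption: shuffle cells, then drop the remainder that does not fill a chunk.
    order = rng.permutation(n_cells)[: n_chunks * chunk_size]
    chunks = cells[order].reshape(n_chunks, chunk_size, n_markers)

    if encode is None:
        # Placeholder encoder (hypothetical): per-marker mean and std of the chunk.
        # A trained network would take its place in a learned pipeline.
        def encode(chunk):
            return np.concatenate([chunk.mean(axis=0), chunk.std(axis=0)])

    chunk_embeddings = np.stack([encode(c) for c in chunks])
    return chunk_embeddings.mean(axis=0)    # assumption: mean pooling across chunks

# Usage with synthetic data: 1050 cells, 8 markers, chunks of 100 cells.
cells = np.random.default_rng(0).normal(size=(1050, 8))
embedding = chunk_and_pool(cells, chunk_size=100, rng=1)
print(embedding.shape)                      # (16,)
```

Because every chunk has the same size, all cells that fit into chunks contribute to the sample-level embedding, rather than only a down-sampled subset.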

Keywords

representation learning ｜ ensemble ｜ flow cytometry ｜ hematologic malignancy ｜ pooling

Authors