RESEARCH

A Cluster-based Personalized Federated Learning Strategy for End-to-End ASR of Dementia Patients
Abstract
Automatic speech recognition (ASR) is crucial for all users, but adapting it for Alzheimer’s disease (AD) faces challenges due to irregular speech patterns and privacy concerns. Federated learning (FL), a privacy-preserving algorithm, is a solution. However, FL ASR suffers from acoustic and text heterogeneities. While advanced model-based and cluster-based FL methods aim to address the issue, they lack a direct mechanism for the high intra-speaker heterogeneity exhibited by AD individuals and for ASR-related properties. This study presents cluster-based personalized federated learning (CPFL), a strategy that mitigates heterogeneity by clustering ASR output tokens using the proposed CharDiv, a metric for pause and word usage distributions. Evaluation on the ADReSS challenge dataset shows a 3.6% improvement in word error rate (WER). Analysis of per-cluster WER improvements and CharDiv distributions indicates reduced heterogeneity, emphasizing pause usage as a potential key factor in AD-oriented ASR.
Authors
Journal
INTERSPEECH 2024