Data-driven deep learning is considered a promising approach for building powerful models for medical data, but it often requires a large amount of diverse data to be effective. However, the high cost of data collection and privacy constraints mean that existing medical datasets are small and distributed across sites. Federated learning via model distillation is a privacy-preserving form of collaborative learning in which a model can leverage all available data without sharing it directly. Knowledge is shared through distillation, using prediction scores on a public dataset averaged across sites. However, this average consensus is suboptimal for individual clients because of domain shift in MRI data caused by differences in acquisition protocols, recruitment criteria, and other factors. In this work, we propose federated conditional mutual learning (FedCM), which improves performance by taking into account each client's local performance and the similarity between clients. This work is the first to apply federated learning to multi-dataset Alzheimer's disease classification with a 3D CNN on T1w MRI. Our method achieves the best recognition rates compared with FedMD and other frameworks. Further visualization and relevance ranking of regions of interest (ROIs) in the human brain suggest that the left hemisphere may have greater relevance than the right hemisphere. Several candidate regions are listed for future investigation.
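The difference between a single averaged consensus and a client-specific one can be illustrated with a minimal sketch. It is not the FedCM update rule, which is not specified here. The function names, the toy scores, and the weights are hypothetical; in particular, the weights merely stand in for whatever combination of local performance and client similarity the method uses.

```python
from typing import List

# Scores[i][k]: score that one client assigns to class k for public sample i.
Scores = List[List[float]]


def weighted_consensus(client_scores: List[Scores], weights: List[float]) -> Scores:
    """Weighted average of per-client scores; weights are normalised to sum to 1."""
    if len(client_scores) != len(weights):
        raise ValueError("one weight per client is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    n_samples = len(client_scores[0])
    n_classes = len(client_scores[0][0])
    return [
        [
            sum(w * scores[i][k] for w, scores in zip(norm, client_scores))
            for k in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def average_consensus(client_scores: List[Scores]) -> Scores:
    """Plain multi-site average: every client gets the same weight."""
    return weighted_consensus(client_scores, [1.0] * len(client_scores))


# Toy example (made-up numbers): two sites, one public sample, two classes.
site_a = [[0.9, 0.1]]
site_b = [[0.5, 0.5]]

# One shared distillation target for every site.
shared_target = average_consensus([site_a, site_b])

# Hypothetical target for site A that weights site A three times as much as site B.
target_for_a = weighted_consensus([site_a, site_b], [3.0, 1.0])
```

With equal weights every site distils toward the same target, whereas client-specific weights give each site its own target. This is one simple way to avoid pulling a site toward predictions from sites with a very different data distribution.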