Diagnosis of Alzheimer’s disease using transfer learning with multi-modal 3D Inception-v4
Introduction
Alzheimer’s disease (AD) was first described by German pathologist Alois Alzheimer in 1906. Clinically, AD is characterized by impairments in cognitive and executive functions, with AD patients often experiencing progressive decline in their memory, behavior, and analytical abilities (1). The progression of AD is slow, and early-stage diagnosis is challenging, as specific diagnostic markers are lacking and early symptoms are easily confused with those of natural aging. Mild cognitive impairment (MCI) is considered a prodromal stage of AD. The progression rate of individuals with MCI to AD is 10 times higher than that of normal individuals, and approximately 70% of MCI patients advance to AD within five years of being diagnosed (2).
Magnetic resonance imaging (MRI) is a non-invasive examination method that effectively characterizes the structure and anatomical information of brain tissue (3,4). Three-dimensional (3D) T1-weighted imaging (T1WI) provides detailed morphological features of brain structures with high image resolution, short imaging times, and a high signal-to-noise ratio. Moreover, 3D T1WI is suitable for tissue structure extraction and brain structure analysis based on gray- and white-matter features. Information on cortical changes in relevant brain regions provided by 3D T1WI data can serve as an important reference for studying the progression of MCI to AD (5). Studies have shown that changes in structures such as the entorhinal cortex and hippocampus have high sensitivity and specificity in predicting AD progression (6). Significant atrophy in brain regions such as the parahippocampal gyrus, amygdala, and prefrontal cortex is evident in MCI patients, and 3D T1WI data can be used to assess the extent of cortical changes in these areas to evaluate disease progression. Additionally, the corpus callosum shows clear atrophy during AD progression, and significant differences have been observed between different gender groups (7).
Preclinical AD is associated with both cognitive and imaging changes; thus, the combination of neuropsychological and neuroimaging data is meaningful for the diagnosis of AD. After a diagnosis of AD, a combination of cognitive tests and imaging can be used to track disease progression and the treatment response (8). Various clinical scores derived from cognitive function tests play a crucial role in the diagnosis of AD. This study used the mini-mental state examination (MMSE), a clinical scoring tool used for the rapid screening of AD (9,10). The MMSE, which has a maximum total score of 30 points, provides a convenient and quick assessment of a patient’s cognitive function, with different score ranges indicating different cognitive states. For normal individuals, scores generally range from 24 to 30, whereas for AD patients, scores typically range from 20 to 26, with lower scores indicating more severe cognitive impairment (11).
Studies have reported impressive results for the diagnosis of AD using computer-aided algorithms; however, no practical diagnostic methods are currently available in clinical settings (12). Researchers are seeking more effective early diagnosis methods for AD, and the advancement of artificial intelligence technology offers new perspectives for the early detection of AD. Deep learning (DL) models, such as convolutional neural networks (CNNs), can automatically extract crucial information from data without the need for explicitly defined features. When handling high-dimensional medical data, DL models convert input data into lower-dimensional features through nonlinear processing, effectively capturing essential information. Researchers have developed DL models based on various data modalities to mine diverse feature information for AD diagnosis (13-15). Mggdadi et al. (16) developed a DL model based on MRI data to diagnose AD, which had an accuracy (ACC) of 70.3%. Using MRI data, Farooq et al. (17) established a diagnostic model to distinguish between individuals with AD, MCI, late MCI, and normal control (NC), which had a predictive ACC of 98.8%. Using 3D T1-MRI data, Fan et al. (18) employed a U-Net model to distinguish between NCs, and early MCI, late MCI, and AD patients, which had a classification ACC of 86.47%. Using white- and gray-matter data, Ji et al. (19) used a DL model to diagnose AD, which had an AD vs. NC classification ACC of 88.37%. Bi et al. (20) proposed a fully unsupervised DL model for the diagnosis of AD based on MRI, which had an AD vs. MCI ACC of 95.52%, and a MCI vs. NC ACC of 90.63%.
Transfer learning is an effective machine-learning (ML) technique for addressing data scarcity. When a pre-training task shares some similarity with a downstream fine-tuning task, transfer learning can use the features learned in the pre-training task to promote learning in the new domain, enhancing model performance and training efficiency by reusing generalizable knowledge across tasks (21,22). In image classification tasks, training deep neural networks typically requires large amounts of data and computational resources. Transfer learning allows the use of models pre-trained on large datasets, freezing certain parameters to prevent them from updating during subsequent training. This enables the application of optimized weight parameters to tasks with smaller sample sizes. The primary benefit of transfer learning lies in its ability to leverage pre-trained models, which serve as a robust initialization. This approach accelerates convergence, reduces the risk of overfitting, and lowers the overall training cost by reusing pre-existing knowledge and reducing the computational effort required for fine-tuning. The training steps for transfer learning are as follows: (I) pre-training the model on a large-scale dataset (23); (II) loading the pre-trained model weights; (III) freezing the initial layers of the pre-trained model to ensure their parameters do not update during subsequent training (24); (IV) modifying the output layer according to the specific task (25); and (V) fine-tuning the model with the target data; a sketch of these steps follows below.
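As a concrete illustration of steps (I)–(V), the following is a minimal PyTorch sketch under simplifying assumptions: the tiny stand-in network, the checkpoint path, and the frozen layer names are hypothetical and stand in for the multi-modal 3D Inception-v4 described in the Methods.

```python
import torch
import torch.nn as nn

# Stand-in network with a "stem" and a classification head. Purely
# illustrative; the study's actual backbone is the multi-modal 3D
# Inception-v4 described in the Methods.
class TinyNet(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.fc(self.stem(x))

model = TinyNet(num_classes=3)  # architecture; step (I), pre-training, happens beforehand

# (II) load the pre-trained weights (checkpoint path is hypothetical)
# model.load_state_dict(torch.load("pretrained_adni.pth"))

# (III) freeze the initial layers so they do not update during fine-tuning
for p in model.stem.parameters():
    p.requires_grad = False

# (IV) replace the output layer for the target task (NC vs. MCI vs. AD)
model.fc = nn.Linear(model.fc.in_features, 3)

# (V) fine-tune only the trainable parameters on the target data
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```

Which layers to freeze is a design choice; freezing only the early feature extractors preserves generic low-level features while allowing the task-specific layers to adapt.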
This study sought to use transfer learning to diagnose AD. The model was initially pre-trained on data from a public database and subsequently fine-tuned with validation data. Additionally, control and ablation experiments were conducted to assess the performance of the model. The results showed that the model had excellent diagnostic performance, and transfer learning significantly improved the accuracy of the model, even when dealing with small sample sizes of imaging data. This is a critical advantage in the field of medical imaging, where obtaining large datasets can be challenging. We present this article in accordance with the TRIPOD+AI reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-1577/rc).
Methods
Enrolled participants
A total of 636 participants were included in this study, of whom 538 were sourced from the publicly available Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (26) (https://adni.loni.usc.edu/), and 98 were sourced from an independent validation dataset. The ADNI participants were scanned using 3T scanners manufactured by Philips, Siemens, and General Electric (GE) to acquire T1WI images of the head. The protocol parameters were as follows: slice thickness: 1.2 mm; scanning matrix: 256×256; repetition time: 2,300 ms; echo time: 2.98 ms; field of view: 240×240 mm2; flip angle: 90 degrees; and reconstruction matrix: 256×256.
The independent validation dataset was sourced from The Chinese Preclinical Alzheimer’s disease Study (C-PAS) database (27), and images were acquired using a 3T uPMR790 TOF scanner (United-Imaging, China). C-PAS is an observational longitudinal study being carried out in Shanghai, China, which commenced in April 2019. The participants were clinically diagnosed by experienced neurologists according to the 2011 National Institute on Aging and Alzheimer’s Association diagnostic criteria for suspected AD. In addition, MCI was diagnosed if the participants met one of the following criteria: (I) had at least one impaired cognitive domain; that is, impaired scores [>1 standard deviation (SD) below the age-corrected normative mean] on all neuropsychological tests in the same domain; or (II) had impaired scores (>1 SD) in each of the three cognitive domains. The participants in the NC group were all recruited from communities in Shanghai. The NC participants were identified as per our previous study (28); specifically, those who did not meet the criteria for AD or MCI were considered NC participants. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Institutional Ethics Review Board of Huashan Hospital (No. HS-KY-2017-406). All patients signed informed consent forms.
During image acquisition, the participants were in a supine position with cotton balls inserted in both ears to reduce noise, and sponge pads placed between the head and the coil to limit head movement. The participants were instructed to remain quiet and keep their eyes closed during imaging. The imaging parameters were as follows: slice thickness: 1.0 mm; scan matrix: 256×256; repetition time: 2,300 ms; echo time: 3.00 ms; field of view: 230×230 mm2; flip angle: 10 degrees; and reconstruction matrix: 256×256. The demographic information of the included participants is provided in Table 1.
Table 1
Dataset | Group | Age (years) | Gender (M/F) | MMSE |
---|---|---|---|---|
ADNI | NC | 75.85±6.39 | 112/119 | 28.96±1.78 |
ADNI | MCI | 75.40±8.27 | 94/72 | 26.74±3.41 |
ADNI | AD | 74.42±8.35 | 77/64 | 23.06±2.04 |
C-PAS | NC | 62.27±10.37 | 8/18 | 29.04±0.94 |
C-PAS | MCI | 69.25±6.00 | 19/21 | 26.70±2.05 |
C-PAS | AD | 67.94±9.30 | 11/21 | 17.50±5.67 |
Data are presented as mean ± standard deviation or n. ADNI, Alzheimer’s Disease Neuroimaging Initiative; C-PAS, Chinese Preclinical Alzheimer’s disease Study; NC, normal control; MCI, mild cognitive impairment; AD, Alzheimer’s disease; M, male; F, female; MMSE, mini-mental state examination.
Study methods
Based on the imaging and clinical score features, a multi-modal 3D Inception-v4 model for diagnosing AD was constructed. As Figure 1 shows, the main process of this experiment comprised four stages: (I) data preprocessing: the original MRI data are preprocessed, including image registration and the removal of non-brain tissue; (II) pre-training: the model is pre-trained using multi-modal features (MRI and MMSE) from the ADNI database; (III) model performance validation: transfer learning is employed, and the multi-modal 3D Inception-v4 model pre-trained on the ADNI database is fine-tuned on an independent validation dataset using the learned weight parameters; and (IV) performance evaluation: the performance of the model is evaluated using the mean under five-fold cross-validation as a quantitative metric, as well as control and ablation experiments.

Data preprocessing
In this study, all T1WI images were preprocessed using the CAT12 toolbox based on SPM (https://www.nitrc.org/projects/cat/). The preprocessing workflow comprised several steps. First, the raw images underwent initial preprocessing, including denoising, resampling to improve image resolution, and intensity non-uniformity correction to remove brightness and contrast variations. Subsequently, the Adaptive Probability Region-Growing method was employed to remove non-brain tissues from the images while preserving the intracranial structures. Next, all images were registered to the MNI152 space using the Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra (DARTEL) method (29-31). This approach ensures the precise alignment of anatomical structures while minimizing distortions and preserving morphological information (32). Finally, the images were normalized to obtain standardized 3D MR images with dimensions of 113×137×113 voxels, with each voxel measuring 1.5×1.5×1.5 mm3.
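As a minimal sanity check, independent of the CAT12 pipeline itself, the preprocessed volumes can be verified programmatically against the expected grid and voxel size; the file name below is hypothetical.

```python
import nibabel as nib

# Load one preprocessed volume and confirm it matches the standardized
# 113x137x113 grid with 1.5 mm isotropic voxels described above.
img = nib.load("sub-001_T1w_preprocessed.nii.gz")
assert img.shape == (113, 137, 113), f"unexpected grid: {img.shape}"
assert img.header.get_zooms()[:3] == (1.5, 1.5, 1.5), \
    f"unexpected voxel size: {img.header.get_zooms()}"
```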
Model construction of multi-modal 3D Inception-v4
Inception-v4, developed by the Google team (33), builds on the core principle of the Inception series (i.e., it applies convolutional kernels of various sizes within a single layer to capture multi-scale features, thus obviating the need for complex, manually designed multi-scale feature extraction). By integrating Inception modules with residual structures, Inception-v4 effectively addresses the gradient vanishing problem that arises in deeper networks. Additionally, its merging strategies reduce the computational load while maintaining information integrity. Designed to enhance network performance and generalization, Inception-v4 is versatile and applicable to a range of computer vision tasks. The multi-modal 3D Inception-v4 model can efficiently handle volumetric data and capture hierarchical patterns through multiple parallel convolutions. This architecture is particularly suited to multi-modal learning, where the integration of various imaging modalities is essential for accurate AD diagnosis.
This study introduced a multi-modal 3D Inception-v4 model (Figure 2), which comprises three fundamental modules: Stem, Inception, and Reduction. The Stem module initially extracts features from 3D images with dimensions of 113×137×113 voxels. An alternating sequence of three Inception and Reduction modules processes the output, which is then directed to the average pool and fully connected (FC) layers, culminating in the feature fusion module. The Stem module, the network’s initial component, performs preliminary processing and feature extraction from the input image, generating an initial feature map. It employs asymmetric convolutional kernels while maintaining a parallel structure. By incorporating 1×1 convolutions between convolutional layers and asymmetric convolutions, it reduces the number of feature channels and progressively downsizes the spatial dimensions to capture multi-scale information more effectively. The Inception module, which forms the core of Inception-v4, includes Inception-A, Inception-B, and Inception-C (Figure 2A-2C). It comprises convolutional and pooling layers of varying sizes, enabling the network to focus on multi-scale information. The Reduction module reduces the feature map size and includes convolutional and pooling layers with larger strides. Inception-v4 employs two types of Reduction modules: Reduction-A and Reduction-B, as illustrated in Figure 2D,2E. The Stem structure is depicted in Figure 2F.
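To illustrate the parallel-branch principle, the following is a simplified, self-contained sketch of one 3D Inception-style block; the branch layout and channel counts are illustrative assumptions and are considerably simpler than the actual Inception-A/B/C modules.

```python
import torch
import torch.nn as nn

# Parallel branches with different receptive fields, concatenated along
# the channel axis -- the core Inception idea, here in 3D.
class InceptionBlock3D(nn.Module):
    def __init__(self, in_ch: int):
        super().__init__()
        self.branch1 = nn.Conv3d(in_ch, 16, kernel_size=1)     # 1x1x1
        self.branch3 = nn.Sequential(
            nn.Conv3d(in_ch, 16, kernel_size=1),               # bottleneck
            nn.Conv3d(16, 24, kernel_size=3, padding=1))       # 3x3x3
        self.branch5 = nn.Sequential(
            nn.Conv3d(in_ch, 16, kernel_size=1),
            nn.Conv3d(16, 24, kernel_size=3, padding=1),
            nn.Conv3d(24, 24, kernel_size=3, padding=1))       # two 3x3x3 ~ 5x5x5
        self.pool = nn.Sequential(
            nn.MaxPool3d(kernel_size=3, stride=1, padding=1),
            nn.Conv3d(in_ch, 16, kernel_size=1))

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.pool(x)], dim=1)

block = InceptionBlock3D(in_ch=32)
out = block(torch.randn(1, 32, 28, 34, 28))   # -> (1, 80, 28, 34, 28)
```

Stacking two 3×3×3 convolutions approximates a 5×5×5 receptive field at lower cost, the same factorization idea the Inception family relies on.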

The feature fusion module, as depicted in Figure 3, employs the concatenate technique to integrate multi-modal features, enabling the model to learn from and incorporate diverse feature information. The concatenate fusion method combines features from multiple modalities, which are then processed by FC layers to learn the relationships between these modalities. A key advantage of concatenate feature fusion is that the fusion step itself incurs no additional computational cost. Further, because the features of each modality are extracted from distinct sources and kept separate until concatenation, noise or errors in one modality do not contaminate the features of the others before fusion, which mitigates the propagation of erroneous information into the model’s final decision.
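A minimal sketch of this concatenate-based fusion is given below; the feature dimensions and the two-layer FC head are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# Image features from the CNN backbone are concatenated with the clinical
# score feature (MMSE) and passed through FC layers that learn cross-modal
# relationships.
class ConcatFusion(nn.Module):
    def __init__(self, img_dim: int = 256, clin_dim: int = 1, num_classes: int = 3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + clin_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes))

    def forward(self, img_feat, mmse):
        fused = torch.cat([img_feat, mmse], dim=1)  # fusion itself costs nothing extra
        return self.head(fused)

fusion = ConcatFusion()
logits = fusion(torch.randn(4, 256), torch.randn(4, 1))  # batch of 4
```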

Experimental setup
The parameter settings for pre-training on the ADNI database were as follows: epochs: 120; batch size: 4; optimizer: Adan (34); loss function: cross-entropy loss; and learning-rate schedule: a warm-up phase followed by cosine decay. Specifically, over the first 20 epochs, the learning rate increased linearly from 1e−6 to 3e−3; for the remaining 100 epochs, it followed a cosine decay pattern. The parameter settings for the fine-tuning phase were as follows: epochs: 20; batch size: 4; learning rate: 1e−4; optimizer: Adan; and loss function: cross-entropy loss.
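The warm-up plus cosine schedule can be written, for example, with PyTorch's LambdaLR, as in the sketch below. Note that Adan is not part of core PyTorch (a reference implementation is distributed by its authors), so AdamW stands in here purely for illustration, and the model is a placeholder.

```python
import math
import torch

model = torch.nn.Linear(10, 3)   # placeholder model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-3)  # base lr = peak lr

WARMUP, TOTAL, LR_START, LR_PEAK = 20, 120, 1e-6, 3e-3

def lr_lambda(epoch: int) -> float:
    # Linear warm-up: 1e-6 -> 3e-3 over the first 20 epochs.
    if epoch < WARMUP:
        return (LR_START + (LR_PEAK - LR_START) * epoch / WARMUP) / LR_PEAK
    # Cosine decay over the remaining 100 epochs.
    t = (epoch - WARMUP) / (TOTAL - WARMUP)
    return 0.5 * (1.0 + math.cos(math.pi * t))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
for epoch in range(TOTAL):
    # ... one epoch of training with cross-entropy loss ...
    scheduler.step()
```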
This study was conducted using an Ubuntu 18.04 operating system, with Python 3.9, PyTorch 2.0, and CUDA 11.8. The hardware specifications included an Intel Xeon Silver 4210R CPU, and an NVIDIA GeForce RTX 3090 GPU.
Results
This study used imaging and clinical score data as the model inputs to achieve AD classification based on transfer-learning techniques. The experimental results are presented in Table 2, and the receiver operating characteristic (ROC) curves using the One-vs-Rest (OvR) strategy are shown in Figure 4. The evaluation metrics selected in this study were ACC, precision (PRE), recall (REC), the F1 score, the ROC curve, and the area under the curve (AUC). The ROC curve plots the true positive rate (sensitivity) on the y-axis against the false positive rate (1 − specificity) on the x-axis; a high true positive rate signifies that the model accurately identifies a large proportion of positive cases, while a low false positive rate indicates that the model rarely misclassifies negative cases as positive. All results were averaged over five-fold cross-validation. Additionally, the proposed model underwent comparative and ablation studies, with both experiments maintaining consistent training parameters, environment, samples, and evaluation metrics, and using the average results from the five-fold cross-validation. The model had an ACC of 87.84% on the C-PAS dataset and an ACC of 89.20% on the ADNI dataset.
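For illustration, the per-fold metrics can be computed as in the following sketch; the label encoding (0=NC, 1=MCI, 2=AD) and the macro averaging are assumptions, as the averaging scheme is not specified in the text.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Toy fold: y_true holds class labels, y_prob the model's softmax outputs.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_prob = np.array([[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7],
                   [0.1, 0.3, 0.6], [0.3, 0.5, 0.2], [0.7, 0.2, 0.1]])
y_pred = y_prob.argmax(axis=1)

acc = accuracy_score(y_true, y_pred)
pre = precision_score(y_true, y_pred, average="macro")
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
auc = roc_auc_score(y_true, y_prob, multi_class="ovr")   # One-vs-Rest AUC
print(f"ACC={acc:.3f} PRE={pre:.3f} REC={rec:.3f} F1={f1:.3f} AUC={auc:.3f}")
```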
Table 2
Fold | ACC (%) | PRE (%) | REC (%) | F1 (%) | AUC (%) |
---|---|---|---|---|---|
Fold 0 | 90.00 | 92.59 | 92.59 | 91.67 | 93.67 |
Fold 1 | 90.00 | 93.33 | 91.67 | 91.53 | 96.00 |
Fold 2 | 75.00 | 58.57 | 53.33 | 55.11 | 80.67 |
Fold 3 | 94.74 | 96.67 | 91.67 | 93.48 | 98.67 |
Fold 4 | 89.47 | 94.87 | 83.33 | 86.11 | 85.00 |
Mean | 87.84 | 87.21 | 82.52 | 83.58 | 90.80 |
ACC, accuracy; PRE, precision; REC, recall; AUC, area under the curve.

To select the baseline models, we referred to previous studies that have examined the efficacy of these models in the context of AD diagnosis, and selected classical CNN models (i.e., ConvNeXt, DenseNet, EfficientNet, Inception-ResNet-v2, ResNet50, and ResNeXt50-32×4d) as the control groups (35-40). The baseline models were trained and evaluated under the same experimental conditions as the proposed method to ensure a fair and unbiased comparison. All the control models were provided with both imaging and clinical score features as input. The results of the comparative experiments are summarized in Table 3. Our findings showed that the proposed model significantly outperformed the control models, achieving an average ACC improvement of up to 15.42%, an average PRE improvement of up to 16.86%, an average REC improvement of up to 12.74%, and an average F1 improvement of up to 15.24%.
Table 3
Model | ACC (%) | PRE (%) | REC (%) | F1 (%) |
---|---|---|---|---|
ConvNeXt | 81.63 | 81.13 | 80.55 | 78.70 |
DenseNet | 78.58 | 79.11 | 80.62 | 77.22 |
EfficientNet | 76.53 | 76.69 | 75.66 | 71.83 |
Inception-ResNet-v2 | 77.58 | 75.84 | 72.64 | 73.16 |
ResNet50 | 72.42 | 70.35 | 69.78 | 68.34 |
ResNeXt50-32×4d | 75.58 | 70.70 | 70.91 | 69.98 |
Proposed model | 87.84 | 87.21 | 82.52 | 83.58 |
CNN, convolutional neural network; ACC, accuracy; PRE, precision; REC, recall.
To better understand how the various components influenced the effectiveness of the model, we conducted ablation experiments by removing each component sequentially and in combination. In the group without MMSE data, the model was trained and evaluated without the MMSE scores to determine the added value of this clinical assessment. In the group without transfer learning, the model was trained from scratch without pre-trained weights to assess the value added by transfer learning. These ablation studies provided insights into the individual contributions of the MMSE scores and transfer learning to the model’s performance. The ROC curves are shown in Figure 5, and the results of the ablation studies are presented in Table 4 and Figure 6. The findings showed that compared to the model without transfer learning, the average ACC improved by 12.1% and the average AUC improved by 7.67%; compared to the model without clinical score features, the average ACC improved by 25.63% and the average AUC improved by 18.93%. These results validated the effectiveness of transfer learning and multi-modal feature fusion.

Table 4
Removing MMSE:

Fold | ACC (%) | PRE (%) | REC (%) | F1 (%) | AUC (%) |
---|---|---|---|---|---|
Fold 0 | 65.00 | 76.67 | 68.39 | 68.67 | 82.00 |
Fold 1 | 65.00 | 59.68 | 58.33 | 58.41 | 72.33 |
Fold 2 | 60.00 | 47.18 | 41.48 | 42.24 | 68.33 |
Fold 3 | 57.89 | 70.00 | 62.04 | 60.00 | 80.67 |
Fold 4 | 63.16 | 54.17 | 46.97 | 46.91 | 56.00 |
Mean | 62.21 | 61.54 | 55.44 | 55.25 | 71.87 |

Removing transfer learning:

Fold | ACC (%) | PRE (%) | REC (%) | F1 (%) | AUC (%) |
---|---|---|---|---|---|
Fold 0 | 70.00 | 69.44 | 63.89 | 62.34 | 78.00 |
Fold 1 | 65.00 | 64.44 | 70.83 | 65.58 | 76.33 |
Fold 2 | 70.00 | 58.33 | 49.63 | 53.59 | 82.00 |
Fold 3 | 84.21 | 82.22 | 79.63 | 80.45 | 84.00 |
Fold 4 | 89.47 | 86.67 | 93.94 | 89.26 | 95.33 |
Mean | 75.74 | 72.22 | 71.58 | 70.25 | 83.13 |
MMSE, mini-mental state examination; ACC, accuracy; PRE, precision; REC, recall; AUC, area under the curve.

Discussion
In this study, we introduced a method based on the multi-modal 3D Inception-v4 model that leverages transfer-learning techniques for AD diagnosis using imaging and clinical score features. Our proposed model had the highest overall classification ACC across diagnostic tasks compared with other CNN models.
To address the issue of limited sample sizes in medical imaging, transfer learning was employed to pre-train the model on a large publicly available dataset, and the acquired knowledge was then applied to small-sample medical data. During training, we compared our method with six other state-of-the-art approaches, and our proposed method significantly outperformed them. The model’s performance was assessed using five-fold cross-validation. We reported three-way classification results (AD vs. MCI vs. NC), achieving state-of-the-art ACC and demonstrating the robustness of the proposed method. This improvement is attributed to the multi-modal 3D Inception-v4 model’s ability to capture multi-scale image features through the parallel use of different-sized convolutional kernels, enhancing classification accuracy. Additionally, the modular design allows for flexible adjustments in depth and width, improving feature representation and reducing the risk of overfitting. The multi-modal feature fusion structure further enriches the features extracted from the various modalities, optimizing the decision-layer outputs of the multi-modal 3D Inception-v4 model. Moreover, the ablation study results confirmed the importance of transfer learning and of including clinical scores in AD diagnosis. The transfer-learning results showed the generalizability of our AD diagnosis model, trained on the ADNI dataset, to a Chinese AD population (C-PAS). Our findings emphasize the significance of taking population-specific factors into account when developing AD diagnostic tools.
This study had several limitations, which suggest directions for future research. First, only MRI data were used for model training and optimization. A previous study (41) showed that incorporating additional imaging modalities, such as positron emission tomography and computed tomography, can further enhance classification accuracy. Future research should consider integrating multiple modalities to enrich the model’s feature set. Second, the robustness of the DL algorithm is inherently limited by the clinical distribution of the ADNI training set. The algorithm performed well on a small independent test set with a population structure significantly different from that of the ADNI test set; however, its performance and robustness need to be assessed in prospective, unselected, real-life patient cohorts. Further validation with larger, prospective external test sets must be performed before any clinical use. Future studies should incorporate additional databases, such as AlzData (42) and the Open Access Series of Imaging Studies (43), for cross-validation, and increase the sample size of the independent validation sets to improve the model’s generalizability. In addition, the identification of individuals with MCI at risk of progressing to AD is of critical importance in clinical settings. Future research should seek to use longitudinal data to better predict the progression of AD (44).
Overall, our study showed that DL algorithms can diagnose AD using MRI and clinical score features with high accuracy and robustness based on publicly available test data. This study proposed an effective transfer-learning method and a CNN model, which was validated on an independent dataset, laying the foundation for further model improvements. The validation on two independent datasets indicates that the model could serve as a valuable decision support tool, aiding radiology researchers and clinicians in the early diagnosis of AD using MRI and clinical score information. In addition, it provides a reference for DL models based on multi-modal data to identify highly sensitive and specific AD biomarkers (45).
Conclusions
This study established a DL model for diagnosing AD. The model can effectively extract critical feature information from multiple modalities, and its use of transfer-learning techniques ensures stable performance on small-sample datasets. Compared with existing methods, the performance of the proposed method was superior in terms of classification accuracy. The method showed outstanding performance in accurately discriminating among various stages of AD, and thus may have potential in clinical applications. Our findings showed that computer-aided diagnosis has significant value in AD. These findings will also advance the application of DL in early AD diagnosis, providing more accurate and reliable tools for clinical practice.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD+AI reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-1577/rc
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-1577/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Institutional Ethics Review Board of Huashan Hospital (No. HS-KY-2017-406). All patients signed informed consent forms.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Alzheimer's Association. 2018 Alzheimer's disease facts and figures. Alzheimers Dement 2018;14:367-429. [Crossref]
- Anderson ND. State of the science on mild cognitive impairment (MCI). CNS Spectr 2019;24:78-87. [Crossref] [PubMed]
- Salvatore C, Battista P, Castiglioni I. Frontiers for the Early Diagnosis of AD by Means of MRI Brain Imaging and Support Vector Machines. Curr Alzheimer Res 2016;13:509-33. [Crossref] [PubMed]
- Basher A, Kim BC, Lee KH, Jung HY. Volumetric feature-based Alzheimer’s disease diagnosis from sMRI data using a convolutional neural network and a deep neural network. IEEE Access 2021;9:29870-82.
- Yin TT, Cao MH, Yu JC, Shi TY, Mao XH, Wei XY, Jia ZZ. T1-Weighted Imaging-Based Hippocampal Radiomics in the Diagnosis of Alzheimer's Disease. Acad Radiol 2024;31:5183-92. [Crossref] [PubMed]
- Végh MJ, Heldring CM, Kamphuis W, Hijazi S, Timmerman AJ, Li KW, van Nierop P, Mansvelder HD, Hol EM, Smit AB, van Kesteren RE. Reducing hippocampal extracellular matrix reverses early memory deficits in a mouse model of Alzheimer's disease. Acta Neuropathol Commun 2014;2:76. [Crossref] [PubMed]
- Lee SH, Bachman AH, Yu D, Lim J, Ardekani BA. Predicting progression from mild cognitive impairment to Alzheimer's disease using longitudinal callosal atrophy. Alzheimers Dement (Amst) 2016;2:68-74. [Crossref] [PubMed]
- Stonnington CM, Chu C, Klöppel S, Jack CR Jr, Ashburner J, Frackowiak RS. Predicting clinical scores from magnetic resonance scans in Alzheimer's disease. Neuroimage 2010;51:1405-13. [Crossref] [PubMed]
- Choe YM, Lee BC, Choi IG, Suh GH, Lee DY, Kim JW. MMSE Subscale Scores as Useful Predictors of AD Conversion in Mild Cognitive Impairment. Neuropsychiatr Dis Treat 2020;16:1767-75. [Crossref] [PubMed]
- Pozueta A, Rodríguez-Rodríguez E, Vazquez-Higuera JL, Mateo I, Sánchez-Juan P, González-Perez S, Berciano J, Combarros O. Detection of early Alzheimer's disease in MCI patients by the combination of MMSE and an episodic memory test. BMC Neurol 2011;11:78. [Crossref] [PubMed]
- Murdaca G, Banchero S, Tonacci A, Nencioni A, Monacelli F, Gangemi S. Vitamin D and Folate as Predictors of MMSE in Alzheimer's Disease: A Machine Learning Analysis. Diagnostics (Basel) 2021;11:940. [Crossref] [PubMed]
- Ebrahimighahnavieh MA, Luo S, Chiong R. Deep learning to detect Alzheimer's disease from neuroimaging: A systematic literature review. Comput Methods Programs Biomed 2020;187:105242. [Crossref] [PubMed]
- Suk HI, Lee SW, Shen D. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. Neuroimage 2014;101:569-82. [Crossref] [PubMed]
- Lin W, Gao Q, Du M, Chen W, Tong T. Multiclass diagnosis of stages of Alzheimer's disease using linear discriminant analysis scoring for multimodal data. Comput Biol Med 2021;134:104478. [Crossref] [PubMed]
- Ding Y, Sohn JH, Kawczynski MG, Trivedi H, Harnish R, Jenkins NW, Lituiev D, Copeland TP, Aboian MS, Mari Aparici C, Behr SC, Flavell RR, Huang SY, Zalocusky KA, Nardo L, Seo Y, Hawkins RA, Hernandez Pampaloni M, Hadley D, Franc BL. A Deep Learning Model to Predict a Diagnosis of Alzheimer Disease by Using (18)F-FDG PET of the Brain. Radiology 2019;290:456-64. [Crossref] [PubMed]
- Mggdadi E, Al-Aiad A, Al-Ayyad MS, Darabseh A. Prediction Alzheimer's disease from MRI images using deep learning. In 2021 12th International Conference on Information and Communication Systems (ICICS); 2021:120-5.
- Farooq A, Anwar S, Awais M, Rehman S. A deep CNN based multi-class classification of Alzheimer's disease using MRI. In 2017 IEEE International Conference on Imaging systems and techniques (IST); 2017:1-6.
- Fan Z, Li J, Zhang L, Zhu G, Li P, Lu X, et al. U-net based analysis of MRI for Alzheimer’s disease diagnosis. Neural Computing and Applications 2021;33:13587-99. [Crossref]
- Ji H, Liu Z, Yan WQ, Klette R. Early diagnosis of Alzheimer's disease using deep learning. In Proceedings of the 2nd International Conference on Control and Computer Vision 2019:87-91.
- Bi X, Li S, Xi B, Li Y, Wang G, Ma X. Computer aided Alzheimer's disease diagnosis by an unsupervised deep learning technology. Neurocomputing 2020;392:296-304. [Crossref]
- Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014:1717-24.
- Sarraf S, Tofighi G. Classification of Alzheimer's disease using fMRI data and deep learning convolutional neural networks. arXiv preprint arXiv:1603.08631, 2016.
- Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision 2015;115:211-52. [Crossref]
- Li S, Yuan G, Dai Y, Zhang Y, Wang Y, Tang X. SmartFRZ: An efficient training framework using attention-based layer freezing. arXiv preprint arXiv:2401.16720, 2024.
- Salman S, Liu X. Overfitting mechanism and avoidance in deep neural networks. arXiv preprint arXiv:1901.06566, 2019.
- Zhao Y, Zhang J, Chen Y, Jiang J. A Novel Deep Learning Radiomics Model to Discriminate AD, MCI and NC: An Exploratory Study Based on Tau PET Scans from ADNI. Brain Sci 2022;12:1067. [Crossref] [PubMed]
- Cui L, Huang L, Pan FF, Wang Y, Huang Q, Guan YH, Lo CZ, Guo YH, Chan AS, Xie F, Guo QH. Chinese Preclinical Alzheimer's Disease Study (C-PAS): Design and Challenge from PET Acceptance. J Prev Alzheimers Dis 2023;10:571-80. [Crossref] [PubMed]
- Huang Y, Pan FF, Huang L, Guo Q. The Value of Clock Drawing Process Assessment in Screening for Mild Cognitive Impairment and Alzheimer's Dementia. Assessment 2023;30:364-74. [Crossref] [PubMed]
- Komatsu J, Matsunari I, Samuraki M, Shima K, Noguchi-Shinohara M, Sakai K, Hamaguchi T, Ono K, Matsuda H, Yamada M. Optimization of DARTEL Settings for the Detection of Alzheimer Disease. AJNR Am J Neuroradiol 2018;39:473-8. [Crossref] [PubMed]
- Goto M, Abe O, Aoki S, Hayashi N, Miyati T, Takao H, Iwatsubo T, Yamashita F, Matsuda H, Mori H, Kunimatsu A, Ino K, Yano K, Ohtomo K; Japanese Alzheimer's Disease Neuroimaging Initiative. Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra provides reduced effect of scanner for cortex volumetry with atlas-based method in healthy subjects. Neuroradiology 2013;55:869-75. [Crossref] [PubMed]
- Klöppel S, Stonnington CM, Chu C, Draganski B, Scahill RI, Rohrer JD, Fox NC, Jack CR Jr, Ashburner J, Frackowiak RS. Automatic classification of MR scans in Alzheimer's disease. Brain 2008;131:681-9. [Crossref] [PubMed]
- Ashburner J. A fast diffeomorphic image registration algorithm. Neuroimage 2007;38:95-113. [Crossref] [PubMed]
- Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the AAAI Conference on Artificial Intelligence 2017;31. doi: 10.1609/aaai.v31i1.11231.
- Xie X, Zhou P, Li H, Lin Z, Yan S. Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models. IEEE Trans Pattern Anal Mach Intell 2024;46:9508-20. [Crossref] [PubMed]
- Techa C, Ridouani M, Hassouni L, Anoun H. Automated Alzheimer’s disease classification from brain MRI scans using ConvNeXt and ensemble of machine learning classifiers. In International Conference on Soft Computing and Pattern Recognition. Cham: Springer Nature Switzerland; 2022:382-91.
- Solano-Rojas B, Villalón-Fonseca R. A Low-Cost Three-Dimensional DenseNet Neural Network for Alzheimer's Disease Early Discovery. Sensors (Basel) 2021;21:1302. [Crossref] [PubMed]
- Zheng B, Gao A, Huang X, Li Y, Liang D, Long X. A modified 3D EfficientNet for the classification of Alzheimer's disease using structural magnetic resonance images. IET Image Processing 2023;17:77-87. [Crossref]
- Bhardwaj S, Kaushik T, Bisht M, Gupta P, Mundra S. Detection of Alzheimer Disease Using Machine Learning. In Smart Systems: Innovations in Computing: Proceedings of SSIC 2021. Springer Singapore; 2022:443-50.
- Fulton LV, Dolezel D, Harrop J, Yan Y, Fulton CP. Classification of Alzheimer's Disease with and without Imagery using Gradient Boosted Machines and ResNet-50. Brain Sci 2019;9:212. [Crossref] [PubMed]
- Li X, Hao Z, Li D, Jin Q, Tang Z, Yao X, Wu T. Brain age prediction via cross-stratified ensemble learning. Neuroimage 2024;299:120825. [Crossref] [PubMed]
- Zhang F, Li Z, Zhang B, Du H, Wang B, Zhang X. Multi-modal deep learning model for auxiliary diagnosis of Alzheimer’s disease. Neurocomputing 2019;361:185-95. [Crossref]
- Zou T, Zhou X, Wang Q, Zhao Y, Zhu M, Zhang L, Chen W, Abuliz P, Miao H, Kabinur K, Alimu K. Associations of serum DNA methylation levels of chemokine signaling pathway genes with mild cognitive impairment (MCI) and Alzheimer's disease (AD). PLoS One 2023;18:e0295320. [Crossref] [PubMed]
- Marcus DS, Wang TH, Parker J, Csernansky JG, Morris JC, Buckner RL. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci 2007;19:1498-507. [Crossref] [PubMed]
- Khvostikov A, Aderghal K, Benois-Pineau J, Krylov A, Catheline G. 3D CNN-based classification using sMRI and MD-DTI images for Alzheimer disease studies. arXiv preprint arXiv:1801.05968, 2018. doi: 10.48550/arXiv.1801.05968.
- Lee G, Nho K, Kang B, Sohn KA, Kim D; for the Alzheimer's Disease Neuroimaging Initiative. Predicting Alzheimer's disease progression using multi-modal deep learning approach. Sci Rep 2019;9:1952. Erratum in: Sci Rep 2023;13:12466. [Crossref] [PubMed]