Predicting malignant cerebral edema after acute ischemic stroke: a machine-learning model with multi-region radiomics

Lingfeng Zhang; Yue Zhang; Chunyan Yang; Yi Zhang; Gang Xie; Danni Wang; Kang Li

doi:10.21037/qims-2024-2751

Original Article

Lingfeng Zhang^1,2#, Yue Zhang^2,3#, Chunyan Yang^3,4, Yi Zhang^3,5, Gang Xie^6, Danni Wang^2, Kang Li^1,2

¹Department of Radiology, North Sichuan Medical College, Nanchong, China; ²Department of Radiology, Chongqing General Hospital, Chongqing University, Chongqing, China; ³Department of Radiology, Chongqing Medical University, Chongqing, China; ⁴Department of Radiology, Chongqing Wulong People’s Hospital, Chongqing, China; ⁵Department of Radiology, Chongqing Jiangjin Second People’s Hospital, Chongqing, China; ⁶Department of Radiology, Chengdu Third People’s Hospital, Chengdu, China

Contributions: (I) Conception and design: L Zhang; (II) Administrative support: D Wang, K Li; (III) Provision of study materials or patients: L Zhang, Yue Zhang, C Yang, Yi Zhang, G Xie; (IV) Collection and assembly of data: L Zhang, Yue Zhang; (V) Data analysis and interpretation: L Zhang, Yue Zhang, K Li; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

^#These authors contributed equally to this work.

Correspondence to: Kang Li, MD, PhD. Department of Radiology, Chongqing General Hospital, Chongqing University, 118 Xingguang Avenue, Yubei District, Chongqing 401121, China; Department of Radiology, North Sichuan Medical College, Nanchong, China. Email: lkrmyydoctor@126.com.

Background: Malignant cerebral edema (MCE) is a severe complication of acute ischemic stroke (AIS) that is associated with poor outcomes or death. The study sought to develop a predictive machine learning (ML)-based model for MCE following AIS using radiomics features from non-contrast computed tomography images of the infarct lesion (IL), the affected hemisphere (AH), and the whole brain (WB).

Methods: A total of 219 AIS patients from four centers were included in this study. Patients from Centers 1, 2, and 3 were allocated to a training cohort and a test cohort by stratified randomization at a ratio of 8:2, while those from Center 4 were allocated to an independent external validation cohort. Radiomics features of the IL, the AH, and the WB were extracted. After the feature selection process, the radiomics features related to MCE were identified. Using seven distinct ML algorithms, an IL model based solely on IL radiomics features and a combined IWA model that incorporated IL, AH, and WB radiomics features were developed. The performance of the models was assessed by calculating the area under the curve (AUC) value.

Results: The IWA model demonstrated effectiveness in predicting MCE risk, with the multilayer perceptron-based model achieving particularly high performance. The IWA model had a higher AUC than the IL model in the training cohort (0.927 vs. 0.865, P<0.05).

Conclusions: This study developed a novel IWA model that was able to effectively predict the risk of MCE following AIS and was superior to the IL model. It is expected that our model will provide more precise guidance recommendations for clinical treatment in the future.

Keywords: Malignant cerebral edema (MCE); affected hemisphere (AH); whole brain (WB); radiomics; machine learning (ML)

Submitted Dec 05, 2024. Accepted for publication Mar 25, 2025. Published online Jun 03, 2025.

Introduction

Acute ischemic stroke (AIS) is characterized by the narrowing or occlusion of cerebral arteries, resulting in insufficient blood supply to brain tissues, which can lead to ischemia, hypoxia, and necrosis, as well as disability and sometimes death. AIS is the leading cause of adult mortality and disability worldwide (1). Cerebral edema frequently complicates AIS, and is often linked to ion channel and transporter deficits (2). Once cerebral edema progresses to a midline shift greater than 5 mm, it is classified as malignant cerebral edema (MCE) (3). Cerebral edema, a severe complication of AIS, is strongly associated with poor outcomes or death, particularly in cases of severe stroke with malignant progression (4,5). Although the incidence of MCE is lower than previously reported, its mortality rate remains as high as 80%, and it is closely correlated with a risk of severe disability; even with the most effective intensive care, the mortality rate remains alarmingly high (6-8). In such cases, neurological function deteriorates rapidly, often becoming life-threatening, requiring prompt decompressive craniectomy (DC) as a critical intervention (9). Thus, the early and accurate identification of MCE patients requiring DC is crucial for survival and prognosis.

Current MCE monitoring methods predominantly rely on clinical symptom observation and imaging analysis (10). However, these methods often lack sufficient timeliness, delaying patient treatment. Thus, accurate and timely prediction methods need to be established. With recent advances, radiomics and machine learning (ML) methodologies have been extensively used in stroke research (11). Numerous studies have integrated AIS images and their derived features through ML or deep learning (DL) methodologies, and ML and DL models have shown superiority over conventional models in predicting MCE (10,12-16). Further, studies extracting radiomics features from non-contrast computed tomography (NCCT), diffusion-weighted imaging (DWI), and fluid-attenuated inversion recovery (FLAIR) images have shown strong predictive performance for MCE (17-21). Nevertheless, most research has primarily focused on the infarct lesion (IL), while neglecting important information about the affected hemisphere (AH) and whole brain (WB). AIS typically results from focal vascular occlusion, which can trigger a WB response (22); thus, further research needs to be conducted into the relationship between MCE and the radiomics features of the AH and WB.

In this study, we derived radiomics features from the IL, AH, and WB to address these issues, and proposed a feature fusion strategy. Our goal was to develop ML models to analyze the predictive performance of radiomics features from IL, AH, and WB in assessing the risk of MCE post-AIS. This approach aimed to enhance early MCE prediction accuracy, ultimately aiding clinicians in making timely therapeutic decisions. We present this article in accordance with the TRIPOD+AI reporting checklist (available at: https://qims.amegroups.com/article/view/10.21037/qims-2024-2751/rc).

Methods

Clinical characteristics

The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Chongqing General Hospital (No. KY S2023-077-01), and the requirement of informed consent was waived due to the retrospective nature of the study. All participating hospitals were informed and agreed with the study.

We collected the clinical data and NCCT images of patients from four hospitals (i.e., Affiliated Hospital of North Sichuan Medical College, Chongqing General Hospital, Chongqing Wulong People’s Hospital, and Chongqing Jiangjin Second People’s Hospital) from July 2016 to June 2024. A total of 554 AIS patients were initially screened. To be eligible for inclusion in the study, the patients had to meet the following inclusion criteria: (I) have been diagnosed with AIS; (II) have undergone a head computed tomography (CT) scan within one day of symptom onset and before treatment initiation; (III) have undergone follow-up imaging within 72 hours of the initial imaging; and (IV) have an IL detectable by NCCT. Patients with head trauma, primary cerebral hemorrhage, a history of brain tumor, lacunar infarction, symptomatic cerebral hemorrhage after admission, insufficient imaging data, or severe artifacts on NCCT images were excluded from the study.

Ultimately, 219 patients (152 without MCE and 67 with MCE) were enrolled in the study. The data were collected from the above-mentioned four hospitals. Patients from Centers 1, 2, and 3 were allocated to the training cohort (n=150) and test cohort (n=38) in a stratified randomized manner at a ratio of 8:2, while patients from Center 4 were allocated to the independent external validation cohort (n=31). Figure 1 shows the patient screening flowchart.
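
The stratified 8:2 allocation described above can be illustrated with a minimal sketch. The authors do not describe their implementation; the function name `stratified_split`, the random seed, and the class counts used here are assumptions chosen only so that the cohort sizes total 188 patients.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/test lists, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))  # per-class 80% cut
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical labels for 188 patients (1 = MCE, 0 = non-MCE); illustrative counts only.
labels = [0] * 130 + [1] * 58
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class, rather than over the pooled sample, keeps the proportion of MCE cases approximately equal in the two cohorts.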

Figure 1 Flow chart showing patient inclusion and exclusion criteria. Center 1, Affiliated Hospital of North Sichuan Medical College; Center 2, Chongqing General Hospital; Center 3, Chongqing Wulong People’s Hospital; Center 4, Chongqing Jiangjin Second People’s Hospital. AIS, acute ischemic stroke; CT, computed tomography.

Clinical data collection

Demographic and clinical laboratory data, such as age, sex, stroke onset time, treatment, hypertension, diabetes, hyperlipidemia, atrial fibrillation, and heart failure, were collected. Patients’ National Institutes of Health Stroke Scale (NIHSS) score, smoking status, and drinking habits were evaluated. The clinical data of the patients are presented in Table 1.

Table 1

Clinical baseline data of the patients

Variables	All (n=219)	Non-MCE (n=152)	MCE (n=67)	P
Sex				0.522
Female	96 (43.84)	62 (40.79)	34 (50.75)
Male	123 (56.16)	90 (59.21)	33 (49.25)
Age (years)				0.655
≤49	5 (2.28)	4 (2.63)	1 (1.49)
50–59	57 (26.03)	36 (23.68)	21 (31.34)
60–69	53 (24.2)	35 (23.03)	18 (26.87)
70–79	58 (26.48)	43 (28.29)	15 (22.39)
≥80	46 (21.01)	34 (22.37)	12 (17.91)
Smoking				>0.999
No	139 (63.47)	93 (61.18)	46 (68.66)
Yes	80 (36.53)	59 (38.82)	21 (31.34)
Alcohol				0.523
No	157 (71.69)	102 (67.11)	55 (82.09)
Yes	62 (28.31)	50 (32.89)	12 (17.91)
Hypertension				0.172
No	89 (40.64)	60 (39.47)	29 (43.28)
Yes	130 (59.36)	92 (60.53)	38 (56.72)
Diabetes				0.692
No	151 (68.95)	108 (71.05)	43 (64.18)
Yes	68 (31.05)	44 (28.95)	24 (35.82)
Hyperlipidemia				0.102
No	159 (72.6)	107 (70.39)	52 (77.61)
Yes	60 (27.4)	45 (29.61)	15 (22.39)
Atrial fibrillation				0.162
No	133 (60.73)	97 (63.82)	36 (53.73)
Yes	86 (39.27)	55 (36.18)	31 (46.27)
Heart failure				0.313
No	98 (44.75)	83 (54.61)	15 (22.39)
Yes	121 (55.25)	69 (45.39)	52 (77.61)
HMCAS				>0.999
No	160 (73.06)	126 (82.89)	34 (50.75)
Yes	59 (26.94)	26 (17.11)	33 (49.25)
Massive stroke				0.823
No	122 (55.71)	109 (71.71)	13 (19.4)
Yes	97 (44.29)	43 (28.29)	54 (80.6)
Treatment				0.243
Non-reperfusion	128 (58.45)	81 (53.29)	47 (70.15)
IVT	39 (17.81)	32 (21.05)	7 (10.45)
MT	36 (16.44)	29 (19.08)	7 (10.45)
IVT with MT	16 (7.30)	10 (6.58)	6 (8.95)
Age, years	69.07 (59.0, 78.0)	69.89 (60.0, 78.25)	67.19 (58.0, 76.0)	0.119
Stroke onset time, h	8.71 (3.0, 12.0)	8.84 (3.0, 12.0)	8.41 (3.0, 11.0)	0.728
NIHSS score	12.97 (8.0, 17.0)	11.84 (7.0, 17.0)	15.54 (12.0, 18.0)	<0.001*
Infarct volume, cm³	85.63 (24.41, 123.84)	56.35 (13.61, 84.75)	152.06 (95.6, 180.6)	<0.001*
ASPECTS	6.71 (5.0, 8.0)	7.38 (6.0, 9.0)	5.21 (4.0, 7.0)	<0.001*

Data are presented as n (%) or median (interquartile range). ASPECTS, Alberta Stroke Program Early Computed Tomography Score; HMCAS, hyperdense middle cerebral artery sign; IVT, intravenous thrombolysis; MCE, patients with malignant cerebral edema; MT, mechanical thrombectomy; NIHSS, National Institutes of Health Stroke Scale; Non-MCE, patients without malignant cerebral edema.

Image acquisition and evaluation

A baseline NCCT scan was performed on admission, and details of the equipment and scanning parameters are outlined in Table S1. The imaging data were acquired from the picture archiving and communication system of the four hospitals. Two experienced neuroradiologists, blinded to the clinical details, analyzed all the imaging data with a focus on the hyperdense middle cerebral artery sign (HMCAS), massive stroke, Alberta Stroke Program Early Computed Tomography Score (ASPECTS), and identifying MCE on follow-up imaging. Subsequently, the evaluations were reviewed and refined as necessary by physicians with 15 years of diagnostic neuroradiology experience. Any inconsistent interpretations between the raters were resolved via discussion. The following criteria were evaluated: (I) the HMCAS, which was defined by an increased density of non-calcified middle cerebral arteries on the infarcted side compared to the contralateral side (23); (II) massive stroke, which was defined as an IL involving more than one-third of the affected cerebral hemisphere or measuring over 80 cm³ in volume (24); (III) the ASPECTS, which consists of 10 anatomical regions, and has a total possible score of 10 points (1 point is deducted for each region showing cerebral infarction, resulting in hypodensity) (25); and (IV) the development of MCE, which was defined as a midline shift of over 5 mm on follow-up imaging (3).
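
The three quantitative criteria above can be written as simple rules. This is only an illustrative sketch of the stated definitions; the function and parameter names are hypothetical and were not part of the study workflow.

```python
def aspects_score(affected_regions):
    """ASPECTS: start at 10 and deduct 1 point per region showing infarct hypodensity."""
    if not 0 <= affected_regions <= 10:
        raise ValueError("ASPECTS covers exactly 10 regions")
    return 10 - affected_regions


def is_massive_stroke(hemisphere_fraction, volume_cm3):
    """Massive stroke: more than one-third of the affected hemisphere or over 80 cm^3."""
    return hemisphere_fraction > 1 / 3 or volume_cm3 > 80


def is_mce(midline_shift_mm):
    """MCE: midline shift of over 5 mm on follow-up imaging."""
    return midline_shift_mm > 5
```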

Image segmentation, and radiomics feature extraction, selection, and combination

Region of interest (ROI) segmentation was performed for the IL, AH, and WB using the three-dimensional (3D)-Slicer (https://www.slicer.org/) software. Baseline NCCT images were imported into 3D-Slicer in Digital Imaging and Communications in Medicine format. The NCCT images were reconstructed with a voxel size of 1 mm × 1 mm × 1 mm, and discretized in grayscale to normalize images from different CT scanners. The bin width of the image grayscale was then fixed to 25. The 3D-Slicer software was used to determine the boundaries of the IL by adjusting the gray values of the NCCT images, followed by semi-automatic segmentation to obtain the 3D ROI of the IL. Subsequently, the 3D ROIs of the AH and WB were manually segmented from the NCCT images. The images were subsequently smoothed using a Gaussian filter.

We strictly adhered to the standardized process of radiomics analysis and employed the Pyradiomics software package (version 3.0.1, https://github.com/AIM-Harvard/pyradiomics/releases/tag/v3.0.1), which complies with Imaging Biomarker Standardization Initiative guidelines, to conduct the comprehensive and systematic extraction of radiomics features. This included the following eight categories of radiomics features: shape features, first-order features, gray-level co-occurrence matrix (GLCM) features, gray-level dependence matrix (GLDM) features, gray-level run length matrix (GLRLM) features, gray-level size zone matrix (GLSZM) features, neighboring gray tone difference matrix (NGTDM) features, and wavelet-based features.

After 2 weeks, the same reader was randomly assigned the NCCT images of 50 patients, and re-identified the ROIs to evaluate intra-reader agreement. An intra-class correlation coefficient (ICC) >0.8 was considered acceptable. Next, the k-nearest neighbor (KNN) method was used to impute the NaN values in the feature values before applying Z-score normalization (Eq. [1]) to standardize the units of measurement across features, thus preventing the model results from being influenced by variations in feature scales. Eq. [1] is expressed as follows:

$z_{i} = \frac{x_{i} - \bar{x}}{s}$ [1]

where $z_{i}$ is the normalized value, $x_{i}$ is the original value, $\bar{x}$ is the mean of the data, s is the standard deviation of the data, and i is the index of the feature value.
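
As a minimal sketch of Z-score normalization for a single feature column, the following subtracts the mean from each value and divides by the standard deviation. The use of the sample standard deviation (n − 1 denominator) and the example values are assumptions; the authors do not state which estimator or software routine they used.

```python
from statistics import mean, stdev


def z_score(values):
    """Standardize one feature column: subtract the mean, divide by the standard deviation."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1); an assumption
    if spread == 0:
        raise ValueError("constant feature cannot be standardized")
    return [(value - center) / spread for value in values]


feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical feature values
normalized = z_score(feature)
```

After normalization, the column has a mean of 0 and a standard deviation of 1, so features measured on different scales contribute comparably to the model.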

To mitigate the potential overfitting of the normalized radiomics features, the following feature screening process was employed: (I) Levene’s test and t-test were first used to select the features with P<0.05; and (II) least absolute shrinkage and selection operator (LASSO) and 10-fold cross-validation were performed on the remaining features to retain the best texture features associated with MCE.
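
For reference, a commonly used least-squares form of the LASSO objective is given below. The authors do not state the exact objective or software settings they used, so this form and its notation (n patients, outcome $y_{j}$, feature vector $\mathbf{x}_{j}$, intercept $\beta_{0}$, coefficients $\beta$, and penalty weight $\lambda$) are assumptions for illustration:

$\hat{\beta} = \arg\min_{\beta_{0},\beta} \frac{1}{2n}\sum_{j=1}^{n}\left( y_{j} - \beta_{0} - \mathbf{x}_{j}^{\top}\beta \right)^{2} + \lambda \left\| \beta \right\|_{1}$

The $\ell_{1}$ penalty shrinks some coefficients exactly to zero, which is what makes LASSO a feature selection method; in cross-validation, $\lambda$ is typically chosen as the value that minimizes the mean validation error across folds.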

To compare the results of this study, the following four models were developed based on different feature combination strategies: (I) the IL model, which contains only the IL radiomics features obtained through the feature screening described earlier; (II) the IA model, which combines the IL and AH radiomics features; (III) the IW model, which combines the IL and WB radiomics features; and (IV) the IWA model, which integrates the IL, AH, and WB radiomics features. Seven ML algorithms were used to build predictive models for the IL and IWA feature groups (see the “Modeling and performance evaluation” section below), and the area under the curve (AUC) of each model was measured. The algorithm with the highest AUC was selected for the final feature subset modeling algorithm. Additionally, to further compare the study results, ML models for IA and IW were constructed using the best-performing ML algorithms identified in the IL and IWA models.
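
The four feature combination strategies can be sketched as follows. The text does not specify how the features were combined, so simple concatenation of the selected features from each region is assumed here; the feature names and values are made up.

```python
REGIONS_BY_MODEL = {
    "IL": ("IL",),
    "IA": ("IL", "AH"),
    "IW": ("IL", "WB"),
    "IWA": ("IL", "AH", "WB"),
}


def fuse_features(selected, model):
    """Concatenate one patient's selected features for the regions used by `model`."""
    fused = {}
    for region in REGIONS_BY_MODEL[model]:
        for name, value in selected[region].items():
            fused[f"{region}_{name}"] = value  # prefix keeps same-named features distinct
    return fused


# Hypothetical selected features for one patient (names and values are made up).
patient = {
    "IL": {"glcm_contrast": 0.42, "firstorder_mean": 31.5},
    "AH": {"glcm_contrast": 0.18},
    "WB": {"glszm_zone_entropy": 5.1},
}
iwa_vector = fuse_features(patient, "IWA")
```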

Modeling and performance evaluation

The following seven ML algorithms were used to construct models for predicting MCE using feature data from the IL and IWA groups using the training cohort: support vector classification, decision tree, random forest, KNN, logistic regression, gradient boosting machine, and multilayer perceptron (MLP). The performance of these models was then compared using the training cohort. All the model training procedures were conducted in Python (version 3.11.9, https://www.python.org/). During the model training phase, we applied a 10-fold cross-validation method. Specifically, we randomly divided the training set into 10 subsets (of which, 9 were used for training and 1 was used for validation). This process was repeated 10 times to ensure the model’s robustness. Additionally, the subsets were analyzed to fine-tune the hyperparameters and build the predictive models. Finally, the final model with the best performance was selected and validated using both a test cohort and an external validation cohort.
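
The 10-fold partitioning described above can be sketched in a few lines. This is an illustrative implementation rather than the authors' code; the function name `k_fold_indices` and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint subsets
    for held_out in range(k):
        validation = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, validation


splits = list(k_fold_indices(150))  # 150 = size of the training cohort
```

Each patient appears in exactly one validation fold, so every sample is used for validation once and for training nine times.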

To evaluate the performance of the ML models, this study employed a receiver operating characteristic (ROC) curve analysis. The AUC, accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) were employed as the performance metrics. Additionally, the consistency between the model’s predicted probabilities and actual outcomes was assessed using calibration curves and the Hosmer-Lemeshow test, and the model’s net clinical benefit was analyzed by a decision curve analysis (DCA). The workflow of the ROI radiomics analysis and model development is illustrated in Figure 2.
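
The AUC can be interpreted as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition; the scores and labels are hypothetical, and the authors' own software routine is not stated.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]  # hypothetical predicted probabilities
labels = [1, 1, 1, 0, 0, 0]               # hypothetical outcomes
auc = roc_auc(scores, labels)
```

In this example, 8 of the 9 positive-negative pairs are ranked correctly, giving an AUC of 8/9.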

Figure 2 Workflow for building a machine-learning model that predicts MCE after AIS from combined IL, AH, and WB radiomics features. AUC, area under the curve; AIS, acute ischemic stroke; GLCM, gray-level co-occurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain; MCE, malignant cerebral edema; NGTDM, neighboring gray tone difference matrix.

Statistical analysis

All the statistical analyses for this study were conducted using Python. The categorical variables are reported as the frequency and percentage (%). The continuous variables with a normal distribution are summarized as the mean ± standard deviation ( $\bar{x} \pm s$ ), while those with a skewed distribution are summarized as the median (interquartile range). Because the MCE and non-MCE groups are independent, the continuous variables were analyzed using either the independent-samples t-test or the Mann-Whitney U test, while the categorical variables were analyzed using the chi-square test or Fisher’s exact test. All the statistical tests were two-sided, and a P value <0.05 was considered statistically significant.

Results

Patient demographics, clinical characteristics, and routine radiologic features

A total of 219 patients (123 male, 96 female) with AIS were enrolled in the study, with a median age of 69 years (interquartile range, 59.0–78.0 years). Table 1 summarizes the baseline demographic, clinical, and routine radiologic features of the patients with and without MCE. Of the 219 patients, 67 experienced MCE. The results of the univariate analysis showed that a higher NIHSS score, larger infarct volume, and lower ASPECTS were significantly correlated with MCE (P<0.001).

Identification of radiomics features

Initially, 1,688 radiomics features were extracted from the reconstructed NCCT images of the IL, AH, and WB experimental groups, resulting in a total of 5,064 features per patient. Intra-class reproducibility was used to assess feature reliability; of the extracted features, 1,555 IL, 1,493 AH, and 1,452 WB features were retained. Levene’s test and the t-test were then performed on the features retained after the ICC analysis to identify the highly correlated features, and mitigate redundancy and multicollinearity. Of the above features, those with a P value <0.05 were retained for subsequent processing, including 363 IL, 19 AH, and 80 WB features. Finally, the LASSO analysis and 10-fold cross-validation were conducted (Figure 3A-3F), and the five most significant features with the largest absolute values of the feature coefficients in the IL, AH, and WB experimental groups were retained. Figure S1A-S1C presents the five selected features and their respective importance in the IL, AH, and WB experimental groups, and Figure 4 presents the combined features of the IWA model and their respective importance.

Figure 3 Radiomics feature selection. (A,B) IL experimental group; (C,D) AH experimental group; (E,F) WB experimental group. AH, affected hemisphere; IL, infarct lesion; LASSO, least absolute shrinkage and selection operator; WB, whole brain.

Figure 4 Importance of radiomics features in the IWA model. IWA, combination of infarct lesion, affected hemisphere, and whole brain; SHAP, SHapley Additive exPlanation.

ML model building and performance evaluation

This study analyzed the performance metrics, including AUC, accuracy, sensitivity, specificity, PPV, and NPV, of two classification (IL and IWA) models to evaluate their efficacy in predicting the risk of MCE. To evaluate the predictive capabilities of the ML models, we primarily focused on the interpretation of the AUC values.

The IL and IWA radiomics models were trained on the training cohort with seven ML algorithms. The specific results are shown in Table 2. The ROC curves of the IL and IWA models for the seven ML algorithms in the training cohort are set out in Figure 5A,5B. The average AUCs for the IL and IWA models were 0.850±0.013 and 0.893±0.034, respectively. The MLP algorithm achieved the highest AUC in both groups (0.865 for the IL model and 0.927 for the IWA model). Therefore, MLP was chosen as the ML algorithm for the final modeling.

Table 2

Diagnostic performance of each machine-learning model in the training cohort for the IL and IWA models

Model	Method	AUC (95% CI)	Accuracy	Sensitivity	Specificity	PPV	NPV
IL	SVC	0.852 (0.786–0.909)	0.700	1.000	0.022	0.698	1.000
	DT	0.847 (0.779–0.899)	0.773	0.788	0.739	0.872	0.607
	RF	0.863 (0.805–0.918)	0.773	0.798	0.717	0.865	0.611
	KNN	0.849 (0.785–0.910)	0.813	0.933	0.543	0.822	0.781
	LR	0.853 (0.786–0.908)	0.753	0.769	0.717	0.860	0.579
	GBM	0.823 (0.714–0.918)	0.933	0.981	0.826	0.927	0.950
	MLP	0.865 (0.804–0.916)	0.807	0.904	0.587	0.832	0.730
IWA	SVC	0.885 (0.819–0.938)	0.800	0.990	0.370	0.780	0.944
	DT	0.875 (0.814–0.925)	0.820	0.856	0.739	0.881	0.694
	RF	0.896 (0.840–0.948)	0.833	0.827	0.848	0.925	0.684
	KNN	0.844 (0.777–0.906)	0.807	0.952	0.478	0.805	0.815
	LR	0.886 (0.830–0.941)	0.800	0.827	0.739	0.878	0.654
	GBM	0.826 (0.738–0.899)	0.880	0.875	0.891	0.948	0.759
	MLP	0.927 (0.875–0.966)	0.847	0.923	0.674	0.865	0.795

AUC, area under the curve; CI, confidence interval; DT, decision tree; GBM, gradient boosting machines; IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain; KNN, k-nearest neighbor; LR, logistic regression; MLP, multilayer perceptron; PPV, positive predictive value; NPV, negative predictive value; RF, random forest; SVC, support vector classification.

Figure 5 ROC curves for each machine-learning model in the training cohort for the IL model (A) and IWA model (B), respectively. AUC, area under the curve; IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain; KNN, k-nearest neighbor; ROC, receiver operating characteristic; SVC, support vector classification.

The IWA prediction model based on the MLP algorithm outperformed the IL model in the training cohort, achieving an accuracy of 0.847, a sensitivity of 0.923, a specificity of 0.674, a PPV of 0.865, and an NPV of 0.795. Additionally, for the same ML model, the AUC of the IL model was generally lower than that of the IWA model, with only the KNN model showing a slight, but not statistically significant, improvement in the IL group (P>0.05) (Table 2). Further, the AUC of the IWA model constructed using the MLP algorithm exceeded that of the IL model (0.872 vs. 0.833) in the test cohort (see Table 3).
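
The threshold-based metrics reported for the MLP-based IWA model in the training cohort are related through the confusion matrix. As a worked check, the sketch below uses one set of integer counts that reproduces the reported values for n=150; these counts are back-calculated from the reported rates and are an assumption, not data published by the authors.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return the threshold-based metrics reported in Tables 2 and 3."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts back-calculated from the reported rates (an assumption, not published data).
metrics = classification_metrics(tp=96, fn=8, tn=31, fp=15)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```

With these counts, the rounded values are an accuracy of 0.847, a sensitivity of 0.923, a specificity of 0.674, a PPV of 0.865, and an NPV of 0.795, matching the values reported for this model.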

Table 3

Diagnostic performance of MLP-based IL and IWA models in the training, test, and external validation cohorts

Model	Method	AUC (95% CI)	Accuracy	Sensitivity	Specificity	PPV	NPV
IL (training)	MLP	0.865 (0.804–0.916)	0.807	0.904	0.587	0.832	0.730
IL (test)	MLP	0.833 (0.687–0.958)	0.816	0.923	0.583	0.828	0.778
IL (external)	MLP	0.848 (0.685–0.965)	0.774	0.773	0.778	0.895	0.583
IWA (training)	MLP	0.927 (0.875–0.966)	0.847	0.923	0.674	0.865	0.795
IWA (test)	MLP	0.872 (0.716–0.992)	0.842	0.885	0.750	0.885	0.750
IWA (external)	MLP	0.859 (0.603–1.000)	0.806	0.864	0.667	0.864	0.667

AUC, area under the curve; CI, confidence interval; IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain; MLP, multilayer perceptron; NPV, negative predictive value; PPV, positive predictive value.

To assess the practical applicability of the models, the MLP models for the IL and IWA groups were validated on an independent external validation cohort. In the external validation cohort, the AUC of the MLP-based IWA model was higher than that of the IL model (0.859 vs. 0.848), and the IWA model had an accuracy, sensitivity, and specificity of 0.806, 0.864, and 0.667, respectively (Table 3).

The AUCs of the two models were compared using the DeLong test. A significant difference in the AUCs of the IL and IWA models was observed in the training cohort (P=0.009), but no significant differences in the models’ predictive performance were observed in the test or external validation cohorts (P>0.05). The effect sizes from the DeLong test for the training set, test set, and external validation set were –0.062, –0.038, and –0.010, respectively. Additionally, the calibration curves showed that the IWA model exhibited a higher agreement between the predicted risk and actual risk than the IL model (Figure 6A-6C), and the IWA model had lower Brier scores across the training, test, and external validation cohorts (0.098 vs. 0.135, 0.137 vs. 0.141, and 0.108 vs. 0.147, respectively). The results of the Hosmer-Lemeshow test showed that the P values for the IL model in the training, test, and external validation sets were 0.135, 0.255, and 0.190, respectively. For the IWA model, the P values in the training, test, and external validation sets were 0.262, 0.190, and 0.001, respectively. These results showed that the IWA model had a higher P value in the training set, suggesting a good fit to the internal dataset. Although the P values for the IL model in both the test and external validation sets were higher than those for the IWA model, the comparison of the Brier scores showed the IWA model’s superior predictive accuracy. The clinical DCA results indicated that the IWA model provided a greater net clinical benefit than the IL model (Figure 7A-7C).
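
The Brier score used above is the mean squared difference between the predicted probability and the observed binary outcome, so lower values indicate better probabilistic predictions. The following is a minimal sketch with hypothetical predictions; it is not the authors' code.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and observed outcome (0 or 1)."""
    if len(probabilities) != len(outcomes):
        raise ValueError("inputs must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical predictions for four patients; lower scores indicate better predictions.
score = brier_score([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0])
```

A model that always predicts 0.5 scores 0.25 regardless of the outcomes, which gives a rough reference point when reading the Brier scores reported for the two models.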

Figure 6 Calibration curves for the IL and IWA models in the training (A), test (B), and external validation (C) cohorts, respectively. IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain.

Figure 7 DCA results for the IL and IWA experimental models in the training (A), test (B), and external validation (C) cohorts, respectively. DCA, decision curve analysis; IL, infarct lesion; IWA, combination of infarct lesion, affected hemisphere, and whole brain.

Finally, to further validate the predictive potential of the AH and WB, we combined the IL with the AH and WB radiomics features in the IA and IW models, respectively, and constructed the MCE prediction models using the optimal MLP algorithm. As Figure 8A-8C shows, the AUCs of the IA model were 0.894 [95% confidence interval (CI): 0.840–0.940], 0.840 (95% CI: 0.680–0.963), and 0.854 (95% CI: 0.603–1.000), while those of the IW model were 0.905 (95% CI: 0.847–0.954), 0.891 (95% CI: 0.774–0.985), and 0.914 (95% CI: 0.795–1.000) across the training, test, and external validation cohorts, respectively, all of which were higher than those of the IL model (AUCs: 0.865, 0.833, and 0.848, respectively).

Figure 8 ROC curves for the IL, IW, and IA models in the training (A), test (B), and external validation (C) cohorts, respectively. AUC, area under the curve; IL, infarct lesion; IW, combination of infarct lesion and whole brain; IA, combination of infarct lesion and affected hemisphere; ROC, receiver operating characteristic.

Performance of the IWA model in different subgroups

This study analyzed the IWA model’s performance in different subgroups. The model’s accuracy in predicting MCE in AIS patients undergoing different treatment modalities ranged from 79.5% to 94.4%. The model had accuracies of 73.2% and 92.6% for patients with and without massive AIS, respectively. In addition, it had accuracies of 82.8% and 85.0% for patients presenting within 4.5 hours and more than 4.5 hours after stroke onset, respectively. The subgroup analysis revealed no statistically significant differences in the model’s predictive performance across these different subgroups (P>0.05). Overall, the results indicate that the model had robust predictive accuracy for MCE, reaching 84.7%, 84.2%, and 80.6% across the training, test, and external validation cohorts, respectively.

Discussion

The accurate assessment of MCE risk in patients with AIS is crucial for developing treatment strategies, predicting disease progression, and optimizing patient prognosis. This study constructed a novel ML prediction model by combining radiomics features from NCCT images of the IL, AH, and WB, which accurately predicted the risk of developing MCE in AIS patients with different treatment modalities, infarct volume, and stroke onset time. Previous studies, such as those that have developed radiomics models based on high-attenuation imaging markers and middle cerebral artery infarct regions (17,21), have primarily focused on assessing the occurrence of MCE after endovascular thrombectomy (EVT). Unlike the models described in these previous studies, our proposed model can also be used to evaluate the risk of MCE after non-reperfusion therapies.

We first conducted a baseline analysis of the demographic, clinical, and routine radiographic characteristics of all the AIS patients, and we found that those who developed MCE typically had higher baseline NIHSS score, larger infarct volume, and lower ASPECTS. Previous studies have also found that higher NIHSS score, larger areas of infarct hypodensity on CT (26), and lower ASPECTS (27) are associated with a higher risk of developing MCE, which aligns with the findings of this study.

Clinical data-based models may not fully capture subtle variations in imaging data. Texture feature analysis, which is widely used in radiomics, enables the quantification of lesion heterogeneity beyond what is visually perceptible, facilitating the identification and classification of different tissue changes. Hu et al. (17) developed a radiomics model based on high-attenuation imaging markers and demonstrated its utility in identifying high-risk MCE patients after EVT (AUCs =0.999 and 0.938 in the training and testing sets, respectively). Jiang et al. (19) incorporated radiomics features from cerebrospinal fluid along with IL, and established a model that had an AUC of 0.86 in predicting MCE risk. Wen et al. (21) constructed a radiomics model based on features from the entire middle cerebral artery region that had AUCs of 0.924 and 0.879 in the training and testing sets, respectively. However, these radiomics models may not fully account for the comprehensive effects of multi-regional joint features.

The present study focused on evaluating the performance of an ML model, which was developed by integrating the radiomics features of the IL, AH, and WB, to predict MCE. The performance of the model was compared with that of a model developed using IL radiomics features only. To mitigate potential bias related to relying on a single ML algorithm, we developed and evaluated seven algorithms, and found that the MLP model outperformed the other models. The superior performance of the MLP model may be attributed to its capacity to efficiently manage complex non-linear relationships in stroke, and its ability to capture interactions among multiple features. Moreover, its feature extraction and combination capabilities enable it to automatically identify key pathological features in images. Additionally, MLP is well suited for handling large-scale, multi-dimensional data, particularly in the analysis of brain images and the evaluation of early neurological deterioration following intravenous thrombolysis (28,29). Consequently, we developed separate IWA and IL models using the MLP algorithm, and the results demonstrated that the IWA model provided superior predictions for the risk of MCE development after AIS, achieving an optimal AUC of 0.927, which represented an improvement of 0.062 (P<0.05) over the IL-only model (AUC: 0.865). Although the differences in predictive performance between the two models were not statistically significant in the test and external validation cohorts (P>0.05), the combined predictive ability of the IWA model consistently surpassed that of the IL model across the test and external validation cohorts. To further substantiate our conclusions, we combined IL with AH and WB to form the IA and IW experimental groups, respectively, and developed IA and IW predictive models using the MLP algorithm. The results indicated that the AUCs of the IA and IW models exceeded those of the IL model. 
Thus, incorporating AH and WB texture information into the IL-based models enhanced the accuracy of predicting MCE risk. This may be because AIS affects not only the cerebral blood flow localized to the IL, but also the blood flow in surrounding regions and the contralateral brain (22,30). This suggests that local vascular obstruction in AIS could induce structural remodeling of the cerebral network, further influencing the textural characteristics of the entire brain tissue (31).
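The AUC comparison between two models scored on the same patients can be illustrated with a minimal sketch. It computes the AUC from the Mann-Whitney rank statistic and uses a paired bootstrap to estimate a confidence interval for the AUC difference. The bootstrap is chosen here for simplicity and is an assumption; it is not necessarily the statistical test used in this study, and the function names and the number of resamples are hypothetical.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count correctly ordered positive/negative pairs; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=2000, seed=0):
    """Paired bootstrap of AUC(a) - AUC(b).

    Returns the observed difference and a 95% percentile interval.
    n_boot and seed are illustrative defaults, not values from the study.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        # Resample patients with replacement, keeping both scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:  # resample lacks one class; draw again
            continue
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)
```

In this sketch, an interval that excludes zero would suggest a difference in discrimination between the two models; with small cohorts the interval is typically wide, which is consistent with a numerically higher AUC that does not reach statistical significance.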

The radiomics features extracted from the AH and WB differed significantly from those extracted from the IL, and were crucial in predicting the risk of MCE. The top five radiomics features from each of the IL, AH, and WB, which were selected via LASSO analysis and 10-fold cross-validation, were retained and used to construct the IWA group features. Specifically, the IL group included two GLRLMs, two GLSZMs, and one NGTDM; the AH group included two first-order features and three GLSZMs; and the WB group included three GLSZMs, one GLRLM, and one first-order feature. In the IWA model, IL’s wavelet-HHH_glrlm_RunLengthNonUniformity, WB’s log-sigma-4-0-mm-3D_glrlm_RunLengthNonUniformity, and AH’s log-sigma-2-0-mm-3D_firstorder_Entropy exhibited the strongest correlation with MCE. The wavelet-HHH_glrlm_RunLengthNonUniformity feature reflected the inhomogeneity in grayscale value variations. Increased values of this feature may indicate the complexity of the grayscale distribution caused by MCE, suggesting a higher degree of edema. The log-sigma-4-0-mm-3D_glrlm_RunLengthNonUniformity feature reflected the grayscale inhomogeneity in the WB tissue across different scales. The effect of AIS on the structure and function of WB tissue may lead to greater grayscale differences between healthy and affected tissue. This feature may indicate the risk of MCE by capturing the inhomogeneity of WB texture resulting from post-stroke edema. The log-sigma-2-0-mm-3D_firstorder_Entropy feature measured the complexity and uncertainty in the grayscale value distribution of the AH. As edema spreads and damages surrounding tissue, its value increases, which may indicate the development or progression of MCE. In summary, the radiomics features of the IL, AH, and WB were all strongly associated with the risk of developing MCE.
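To make the two feature families named above more concrete, the following minimal sketch computes a first-order entropy and a run-length non-uniformity on toy data, using their commonly cited definitions: entropy as the negative sum of p log2 p over the discretized intensity histogram, and run-length non-uniformity as the sum over run lengths of the squared number of runs of that length, divided by the total number of runs. The bin width, the single horizontal run direction, and the function names are assumptions for illustration, and the wavelet and Laplacian-of-Gaussian filtering applied in the study is not reproduced here.

```python
import math
from collections import Counter


def first_order_entropy(values, bin_width=25.0):
    """Shannon entropy (in bits) of the discretized intensity histogram.

    bin_width is an assumed discretization step, not a value from the study.
    """
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = sum(bins.values())
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


def run_lengths(row):
    """Return (gray_level, run_length) pairs for consecutive equal values."""
    runs = []
    start = 0
    for i in range(1, len(row) + 1):
        if i == len(row) or row[i] != row[start]:
            runs.append((row[start], i - start))
            start = i
    return runs


def run_length_non_uniformity(rows):
    """Run-length non-uniformity along the horizontal direction only.

    Sums, over each run length j, the squared number of runs of length j
    (pooled across gray levels), then divides by the total number of runs.
    """
    length_counts = Counter()
    for row in rows:
        for _, length in run_lengths(row):
            length_counts[length] += 1
    n_runs = sum(length_counts.values())
    if n_runs == 0:
        raise ValueError("no runs found")
    return sum(c * c for c in length_counts.values()) / n_runs
```

A full radiomics pipeline would aggregate runs over several directions in 3D and apply the image filters before feature calculation; the sketch only shows the arithmetic behind the feature names.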

Further, a previous study on predicting AIS onset time found that a model that incorporated WB radiomics features and IL features showed the best predictive performance among the combined radiomics models, across the apparent diffusion coefficient, FLAIR, DWI, and combined sequences (32). That study showed that comprehensive analyses integrating radiomics features of both infarcted and non-infarcted brain tissue can enhance prediction model performance and support the demands of clinical precision medicine.

This study represents a novel and significant attempt to develop a combined model of radiomics features (the IWA model). The results showed that this model had greater accuracy in predicting the early risk of MCE in AIS patients than the models based solely on IL features. Compared to magnetic resonance imaging (MRI) and perfusion imaging, NCCT has a number of advantages, such as shorter examination times and fewer contraindications, making it the preferred imaging method for AIS patients. Given that this study analyzed NCCT images, its findings may have broader applicability, and could potentially benefit AIS patients at a higher risk of MCE to a greater extent. However, this study had several limitations. First, differences in parameters across various devices are inevitable. To ensure consistency in clinical imaging data and minimize variations caused by different devices, we employed standardized preprocessing methods (e.g., window width adjustment, resampling, and normalization), used LASSO to reduce feature collinearity, and combined cross-validation to enhance the model’s reproducibility. Second, patients from four hospitals were included in the study; however, as a retrospective study with a limited sample size, there is a risk of selection bias. Additionally, the model may not be applicable to hyper-AIS patients in cases where the infarct boundary cannot be delineated by grayscale adjustment. Therefore, future studies may choose to include larger sample sizes, and use CT- or MRI-based brain perfusion imaging features to provide more meaningful results. We also intend to further optimize the existing MLP model and explore DL approaches, such as convolutional neural networks and transformers, to improve accuracy and robustness in the future. Additionally, we intend to integrate multimodal data (e.g., blood biochemistry and genomic data) to further enhance the predictive performance of the model.
In terms of its clinical application, we intend to incorporate this model into decision support systems, and use interpretable ML techniques to improve physicians’ understanding and trust of the model.
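The preprocessing steps mentioned among the limitations (window adjustment, resampling, and normalization) can be sketched on a one-dimensional intensity profile. This is an illustrative sketch only: the window center and width, the voxel spacings, the use of linear interpolation, and the function names are all assumptions, and the exact parameters used in this study are not restated here.

```python
import math


def apply_window(hu_values, center=40.0, width=80.0):
    """Clip Hounsfield units to [center - width/2, center + width/2].

    The default center and width are assumed values for illustration.
    """
    low, high = center - width / 2.0, center + width / 2.0
    return [min(max(v, low), high) for v in hu_values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def resample_linear(profile, old_spacing, new_spacing):
    """Linearly interpolate a 1D profile (at least 2 samples) onto a new spacing in mm."""
    length = (len(profile) - 1) * old_spacing
    n_new = int(math.floor(length / new_spacing + 1e-9)) + 1
    out = []
    for k in range(n_new):
        pos = k * new_spacing / old_spacing      # position in old index units
        i = min(int(math.floor(pos)), len(profile) - 2)
        frac = pos - i
        out.append(profile[i] * (1.0 - frac) + profile[i + 1] * frac)
    return out
```

Applying the same window, spacing, and intensity range to every scan before feature extraction is one way to reduce scanner-dependent variation, although it cannot remove all differences between devices.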

Conclusions

This study developed a novel ML prediction model by integrating radiomics features from the IL, the AH, and the WB using NCCT images. This model can more precisely predict the risk of MCE after AIS, assist clinicians in identifying patients requiring DC during AIS treatment, and help determine the optimal timing for surgery.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD+AI reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2751/rc

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2751/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Chongqing General Hospital (No. KY S2023-077-01), and the requirement of informed consent was waived due to the retrospective nature of the study. All participating hospitals were informed and agreed with the study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Campbell BCV, Khatri P. Stroke. Lancet 2020;396:129-42.
2. Stokum JA, Gerzanich V, Simard JM. Molecular pathophysiology of cerebral edema. J Cereb Blood Flow Metab 2016;36:513-38.
3. Huang X, Yang Q, Shi X, Xu X, Ge L, Ding X, Zhou Z. Predictors of malignant brain edema after mechanical thrombectomy for acute ischemic stroke. J Neurointerv Surg 2019;11:994-8.
4. Bar B, Biller J. Select hyperacute complications of ischemic stroke: cerebral edema, hemorrhagic transformation, and orolingual angioedema secondary to intravenous Alteplase. Expert Rev Neurother 2018;18:749-59.
5. Kimberly WT, Dutra BG, Boers AMM, Alves HCBR, Berkhemer OA, van den Berg L, Sheth KN, Roos YBWEM, van der Lugt A, Beenen LFM, Dippel DWJ, van Zwam WH, van Oostenbrugge RJ, Lingsma HF, Marquering H, Majoie CBLM. MR CLEAN Investigators. Association of Reperfusion With Brain Edema in Patients With Acute Ischemic Stroke: A Secondary Analysis of the MR CLEAN Trial. JAMA Neurol 2018;75:453-61.
6. Heinsius T, Bogousslavsky J, Van Melle G. Large infarcts in the middle cerebral artery territory. Etiology and outcome patterns. Neurology 1998;50:341-50.
7. Vahedi K, Hofmeijer J, Juettler E, Vicaut E, George B, Algra A, Amelink GJ, Schmiedeck P, Schwab S, Rothwell PM, Bousser MG, van der Worp HB, Hacke W. DECIMAL, DESTINY, and HAMLET investigators. Early decompressive surgery in malignant infarction of the middle cerebral artery: a pooled analysis of three randomised controlled trials. Lancet Neurol 2007;6:215-22.
8. Berrouschot J, Sterker M, Bettin S, Köster J, Schneider D. Mortality of space-occupying ('malignant') middle cerebral artery infarction under conservative intensive care. Intensive Care Med 1998;24:620-3.
9. Minnerup J, Wersching H, Ringelstein EB, Heindel W, Niederstadt T, Schilling M, Schäbitz WR, Kemmling A. Prediction of malignant middle cerebral artery infarction using computed tomography-based intracranial volume reserve measurements. Stroke 2011;42:3403-9.
10. Foroushani HM, Hamzehloo A, Kumar A, Chen Y, Heitsch L, Slowik A, Strbian D, Lee JM, Marcus DS, Dhar R. Accelerating Prediction of Malignant Cerebral Edema After Ischemic Stroke with Automated Image Analysis and Explainable Neural Networks. Neurocrit Care 2022;36:471-82.
11. Chen Q, Xia T, Zhang M, Xia N, Liu J, Yang Y. Radiomics in Stroke Neuroimaging: Techniques, Applications, and Challenges. Aging Dis 2021;12:143-54.
12. Cao L, Ma X, Huang W, Xu G, Wang Y, Liu M, Sheng S, Mao K. An Explainable Artificial Intelligence Model to Predict Malignant Cerebral Edema after Acute Anterior Circulating Large-Hemisphere Infarction. Eur Neurol 2024;87:54-66.
13. Hoffman H, Wood JS, Cote JR, Jalal MS, Masoud HE, Gould GC. Machine learning prediction of malignant middle cerebral artery infarction after mechanical thrombectomy for anterior circulation large vessel occlusion. J Stroke Cerebrovasc Dis 2023;32:106989.
14. Zeng W, Li W, Huang K, Lin Z, Dai H, He Z, Liu R, Zeng Z, Qin G, Chen W, Wu Y. Predicting futile recanalization, malignant cerebral edema, and cerebral herniation using intelligible ensemble machine learning following mechanical thrombectomy for acute ischemic stroke. Front Neurol 2022;13:982783.
15. Foroushani HM, Hamzehloo A, Kumar A, Chen Y, Heitsch L, Slowik A, Strbian D, Lee JM, Marcus DS, Dhar R. Quantitative Serial CT Imaging-Derived Features Improve Prediction of Malignant Cerebral Edema after Ischemic Stroke. Neurocrit Care 2020;33:785-92.
16. Chen R, Deng Z, Song Z. The prediction of malignant middle cerebral artery infarction: a predicting approach using random forest. J Stroke Cerebrovasc Dis 2015;24:958-64.
17. Hu S, Hong J, Liu F, Wang Z, Li N, Wang S, Yang M, Fu J. An integrated nomogram combining clinical and radiomic features of hyperattenuated imaging markers to predict malignant cerebral edema following endovascular thrombectomy. Quant Imaging Med Surg 2024;14:4936-49.
18. Fu B, Qi S, Tao L, Xu H, Kang Y, Yao Y, Yang B, Duan Y, Chen H. Image Patch-Based Net Water Uptake and Radiomics Models Predict Malignant Cerebral Edema After Ischemic Stroke. Front Neurol 2020;11:609747.
19. Jiang L, Zhang C, Wang S, Ai Z, Shen T, Zhang H, Duan S, Yin X, Chen YC. MRI Radiomics Features From Infarction and Cerebrospinal Fluid for Prediction of Cerebral Edema After Acute Ischemic Stroke. Front Aging Neurosci 2022;14:782036.
20. Wen X, Li Y, He X, Xu Y, Shu Z, Hu X, Chen J, Jiang H, Gong X. Prediction of Malignant Acute Middle Cerebral Artery Infarction via Computed Tomography Radiomics. Front Neurosci 2020;14:708.
21. Wen X, Hu X, Xiao Y, Chen J. Radiomics analysis for predicting malignant cerebral edema in patients undergoing endovascular treatment for acute ischemic stroke. Diagn Interv Radiol 2023;29:402-9.
22. Wang C, Miao P, Liu J, Li Z, Wei Y, Wang Y, Zhang Y, Wang K, Cheng J. Validation of cerebral blood flow connectivity as imaging prognostic biomarker on subcortical stroke. J Neurochem 2021;159:172-84.
23. Manelfe C, Larrue V, von Kummer R, Bozzao L, Ringleb P, Bastianello S, Iweins F, Lesaffre E. Association of hyperdense middle cerebral artery sign with clinical outcome in patients treated with tissue plasminogen activator. Stroke 1999;30:769-72.
24. Hua X, Liu M, Wu S. Definition, prediction, prevention and management of patients with severe ischemic stroke and large infarction. Chin Med J (Engl) 2023;136:2912-22.
25. Mokin M, Primiani CT, Siddiqui AH, Turk AS. ASPECTS (Alberta Stroke Program Early CT Score) Measurement Using Hounsfield Unit Values When Selecting Patients for Stroke Thrombectomy. Stroke 2017;48:1574-9.
26. Wu S, Yuan R, Wang Y, Wei C, Zhang S, Yang X, Wu B, Liu M. Early Prediction of Malignant Brain Edema After Ischemic Stroke. Stroke 2018;49:2918-27.
27. MacCallum C, Churilov L, Mitchell P, Dowling R, Yan B. Low Alberta Stroke Program Early CT score (ASPECTS) associated with malignant middle cerebral artery infarction. Cerebrovasc Dis 2014;38:39-45.
28. Sachdeva J, Mittal R, Mehta J, Jain R, Ranjan A. Resolving autism spectrum disorder (ASD) through brain topologies using fMRI dataset with multi-layer perceptron (MLP). Psychiatry Res Neuroimaging 2024;343:111858.
29. Wen R, Wang M, Bian W, Zhu H, Xiao Y, Zeng J, He Q, Wang Y, Liu X, Shi Y, Zhang L, Hong Z, Xu B. Machine learning-based prediction of early neurological deterioration after intravenous thrombolysis for stroke: insights from a large multicenter study. Front Neurol 2024;15:1408457.
30. Hernandez DA, Bokkers RP, Mirasol RV, Luby M, Henning EC, Merino JG, Warach S, Latour LL. Pseudocontinuous arterial spin labeling quantifies relative cerebral blood flow in acute stroke. Stroke 2012;43:753-8.
31. Grefkes C, Fink GR. Connectivity-based approaches in stroke and recovery of function. Lancet Neurol 2014;13:206-16.
32. Lu J, Guo Y, Wang M, Luo Y, Zeng X, Miao X, Zaman A, Yang H, Cao A, Kang Y. Determining acute ischemic stroke onset time using machine learning and radiomics features of infarct lesions and whole brain. Math Biosci Eng 2024;21:34-48.

Cite this article as: Zhang L, Zhang Y, Yang C, Zhang Y, Xie G, Wang D, Li K. Predicting malignant cerebral edema after acute ischemic stroke: a machine-learning model with multi-region radiomics. Quant Imaging Med Surg 2025;15(6):5188-5203. doi: 10.21037/qims-2024-2751
