Development of a machine learning model in prediction of the rapid progression of interstitial lung disease in patients with idiopathic inflammatory myopathy

Yuhui Qiang; Hongyi Wang; Yifei Ni; Jianping Wang; Anqi Liu; Haoyu Yang; Linfeng Xi; Yanhong Ren; Bingbing Xie; Shiyao Wang; Min Liu; Chen Wang; Huaping Dai

doi:10.21037/qims-24-595

Original Article

Yuhui Qiang^1,2#, Hongyi Wang^2,3#, Yifei Ni^3,4, Jianping Wang^1,4, Anqi Liu^3,4, Haoyu Yang⁵, Linfeng Xi^1,2, Yanhong Ren², Bingbing Xie², Shiyao Wang², Min Liu⁴, Chen Wang^1,2,3, Huaping Dai^1,2,3

¹Capital Medical University, Beijing, China; ²National Center for Respiratory Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, National Clinical Research Center for Respiratory Diseases, Institute of Respiratory Medicine, Chinese Academy of Medical Sciences, Department of Pulmonary and Critical Care Medicine, Center of Respiratory Medicine, China-Japan Friendship Hospital, Beijing, China; ³China-Japan Friendship Hospital (Institute of Clinical Medical Sciences), Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China; ⁴Department of Radiology, China-Japan Friendship Hospital, Beijing, China; ⁵Department of Radiology, Peking University China-Japan Friendship School of Clinical Medicine, Beijing, China

Contributions: (I) Conception and design: M Liu, H Dai, C Wang; (II) Administrative support: H Dai; (III) Provision of study materials or patients: Y Ren, B Xie, S Wang, H Dai, C Wang; (IV) Collection and assembly of data: Y Qiang, H Wang, Y Ni, J Wang, A Liu, H Yang, Y Ren, B Xie, S Wang; (V) Data analysis and interpretation: H Wang, Y Qiang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

^#These authors contributed equally to this work.

Correspondence to: Huaping Dai, MD, PhD; Chen Wang, MD, PhD. National Center for Respiratory Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, National Clinical Research Center for Respiratory Diseases, Institute of Respiratory Medicine, Chinese Academy of Medical Sciences, Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital, No. 2 Yinghua Dong Street, Chaoyang District, Beijing 100029, China; Capital Medical University, Beijing, China; China-Japan Friendship Hospital (Institute of Clinical Medical Sciences), Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China. Email: daihuaping@ccmu.edu.cn; cyh-birm@263.net.

Background: Rapidly progressive interstitial lung disease (RP-ILD) significantly impacts the prognosis of patients with idiopathic inflammatory myopathies (IIM). High-resolution computed tomography (HRCT) is a crucial noninvasive technique for evaluating interstitial lung disease (ILD). Utilizing quantitative computed tomography (QCT) enables accurate quantification of disease severity and evaluation of prognosis, thereby serving as a crucial computer-aided diagnostic method. This study aimed to establish and validate a machine learning (ML) model to predict RP-ILD in patients with idiopathic inflammatory myopathy-related interstitial lung disease (IIM-ILD) based on QCT and clinical features.

Methods: A total of 514 patients (367 females, median age 54 years) with IIM-ILD at the China-Japan Friendship Hospital were retrospectively included, of whom 249 (165 females, median age 55 years) were identified as having RP-ILD. Deep learning (DL) methods were used to extract quantitative features from HRCT; these features were integrated with demographic factors, pulmonary function test results, and blood gas analysis results into a final prediction model.

Results: Logistic regression was chosen as the final model due to its superior area under the curve (AUC) and explainability compared to the other seven ML models. In the validation dataset, the combined QCT and clinical features model yielded an AUC of 0.882 [95% confidence interval (CI): 0.797–0.967], outperforming both the QCT-only model and the clinical-only model. In calibration and clinical decision curve analysis, the final model demonstrated minimal prediction bias (concordance index: 0.887, 95% CI: 0.800–0.974, P<0.001) and provided greater net benefit across most thresholds. The nomogram incorporated the following variables: subtype, gender, forced expiratory volume in one second (FEV₁%), diffusing capacity for carbon monoxide (DL_CO%), oxygenation index (OI), and quantitative HRCT features of ground-glass opacities (GGOs), consolidation, pulmonary vasculature, and bronchial branches.

Conclusions: When utilizing ML techniques, the baseline QCT has the potential to predict rapid progression in patients with IIM-ILD. The prediction performance will be further improved by incorporating clinical data alongside HRCT features.

Keywords: Idiopathic inflammatory myopathy (IIM); rapidly progressive interstitial lung disease (RP-ILD); high-resolution computed tomography (HRCT); machine learning (ML); quantitative computed tomography (QCT)

Submitted Mar 24, 2024. Accepted for publication Sep 10, 2024. Published online Nov 08, 2024.

Introduction

Idiopathic inflammatory myopathies (IIM) are a group of uncommon systemic disorders that are prone to pulmonary complications, most commonly interstitial lung disease (ILD), which is present in 20–78% of adult patients with IIM (1). Approximately 38–71% of patients with IIM-ILD develop rapidly progressive interstitial lung disease (RP-ILD) within 3 months after the onset of respiratory symptoms (2,3). Among patients with IIM-ILD who develop RP-ILD, the reported mortality rate is 70–90% (4-6). Data from Chinese adults have demonstrated significant time-dependent variations in RP-ILD and mortality risk in anti-melanoma differentiation-associated protein 5 dermatomyositis (MDA5+ DM), suggesting that patients should be closely monitored for 6 months after diagnosis, which is both the high-risk window for poor prognosis and the optimal window for aggressive treatment (7-10). Therefore, prediction of RP-ILD is important for developing a rational treatment plan, improving patient prognosis, and reducing mortality (11,12).

High-resolution computed tomography (HRCT) has been widely used for the assessment of IIM-ILD. According to Walsh et al. (13), interobserver agreement for the current American Thoracic Society (ATS), European Respiratory Society (ERS), Japanese Respiratory Society (JRS), and Latin American Thoracic Society (LATS) computed tomography (CT) criteria for usual interstitial pneumonia (UIP) among thoracic radiologists is only moderate, regardless of their experience. This variability remains consistent across patient age and multidisciplinary diagnosis, suggesting that differences in perception among individuals are independent of their training and experience and may therefore be difficult to resolve. This presents an opportunity for objective and automated methods to support clinical decision-making from HRCT. In recent years, quantitative computed tomography (QCT) based on artificial intelligence (AI) has emerged rapidly, providing a potential quantitative evaluation of lung diseases (14,15). Initial QCT approaches, for example, visual scores and radiomics, have been used to predict adverse outcomes in patients with IIM-ILD (16-18). Machine learning (ML) models have achieved the segmentation and classification of ILD lesions (19-29). Danieli et al. (30) established an ML prediction model to forecast the prognosis of patients using IIM-ILD clinical scores. Nonetheless, combining image features and data from multiple sources can improve the accuracy of deep learning (DL) algorithms and make the model’s diagnostic performance and prediction ability more convincing (27,31). However, AI-based prediction models for RP-ILD in IIM patients that combine QCT and clinical features have not yet been developed. Therefore, we aimed to develop and validate an ML-based predictive model of RP-ILD using QCT features to help clinicians diagnose the disease early and to provide guidance for clinical decision-making.
We present this article in accordance with the TRIPOD+AI reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-595/rc).

Methods

Study cohort and design

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Board of China-Japan Friendship Hospital (No. 2017-25) and the requirement for informed consent was waived for this retrospective study. This study cohort retrospectively included patients diagnosed with ILD related to antisynthetase syndrome (ASS) or MDA5+ DM between January 2016 and December 2021 in China-Japan Friendship Hospital. IIM-ILD was diagnosed in accordance with the criteria for diagnosis and classification of interstitial pneumonias through multi-disciplinary discussion (1,32,33). The diagnosis of dermatomyositis (DM) was based on the Bohan and Peter criteria, and 239th European Neuro Muscular Centre International Workshop guidelines (34,35). The diagnosis of ASS was confirmed through testing for anti-aminoacyl-tRNA synthetase (ARS) antibodies, accompanied by the presence of at least one finding of the triad of myositis, arthritis, and ILD (36). Anti-ARS antibody and anti-MDA5 antibody were tested using commercially available kits (EUROIMMUN, Lübeck, Germany) according to the manufacturer’s instructions. The inclusion criteria were as follows: age ≥18 years; clinical diagnosis of ASS or MDA5+ DM; baseline HRCT before treatment at China-Japan Friendship Hospital. The exclusion criteria were as follows: missing HRCT or HRCT with poor image quality at the initial visit; incomplete pulmonary function tests (PFTs); loss to follow-up. Figure 1 illustrates a flowchart outlining the participant selection process.

Figure 1 Flow diagram of eligibility criteria. IIM-ILD, idiopathic inflammatory myopathy related interstitial lung disease; HRCT, high-resolution computed tomography; RP-ILD, rapidly progressive interstitial lung disease.

In this study, the criteria for identifying RP-ILD were based on the international consensus modified by the ATS declaration regarding idiopathic pulmonary fibrosis (IPF) (37,38). RP-ILD was determined when at least 2 of the following indicators were present within 3 months of symptom onset (2,3): (I) worsening of symptoms, such as exertional dyspnea; (II) physiological changes, indicated by a decrease of 10% in vital capacity (VC) or a 1.33 kPa decrease in arterial oxygen pressure (PaO₂) in patients with IIM-ILD; and (III) increased pulmonary opacification on HRCT, including ground-glass opacities (GGOs), consolidation, reticular pattern (RE), and honeycombing (HC). Chronic IIM-ILD represents the non-RP-ILD category, in which the disease worsens gradually after 3 months or remains relatively stable for an extended period. Figure 2 shows the progression of the patient’s disease and the model building diagram.
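The "at least 2 of 3 indicators" rule above can be sketched as a small function. This is an illustrative Python sketch only, not the authors' code: the function name and arguments are hypothetical, the 10% VC decrease is assumed to be a relative decline from baseline, and the follow-up measurements are assumed to fall within 3 months of symptom onset.

```python
def is_rp_ild(symptoms_worsened, vc_baseline, vc_followup,
              pao2_baseline_kpa, pao2_followup_kpa, hrct_opacity_increased):
    """Return True when at least 2 of the 3 RP-ILD indicators are present.

    Assumption: all follow-up values were measured within 3 months of
    symptom onset; the caller is responsible for enforcing that window.
    """
    # (II) physiological change: >=10% relative VC decline (assumed reading)
    # or a PaO2 decrease of at least 1.33 kPa
    vc_decline = (vc_baseline - vc_followup) / vc_baseline >= 0.10
    pao2_drop = (pao2_baseline_kpa - pao2_followup_kpa) >= 1.33
    physiological = vc_decline or pao2_drop

    # (I) symptoms, (II) physiology, (III) HRCT opacification
    indicators = [symptoms_worsened, physiological, hrct_opacity_increased]
    return sum(bool(flag) for flag in indicators) >= 2
```

A patient with worsening dyspnea and a VC fall from 3.0 L to 2.6 L would be labeled RP-ILD under these assumptions, whereas an isolated PaO₂ decrease would not.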

Figure 2 The progression of IIM-ILD and the model building diagram. HRCT and clinical features of patients after diagnosis of IIM-ILD and before disease progression were used to establish predictive models. This figure was created on www.figdraw.com and has been authorized for use. IIM, idiopathic inflammatory myopathy; IIM-ILD, idiopathic inflammatory myopathy related interstitial lung disease; HRCT, high-resolution computed tomography; RP-ILD, rapidly progressive interstitial lung disease.

PFTs

Pulmonary function was assessed within 1 week before or after HRCT. All participants underwent PFTs (MasterScreen; Vyaire Medical GmbH, Hoechberg, Germany) and the collected measurements included the percentage of predicted forced vital capacity (FVC%), percentage of forced expiratory volume in one second (FEV₁%), FEV₁/FVC%, percentage of predicted total lung capacity (TLC%), and percentage of predicted diffusing capacity for carbon monoxide (DL_CO%). The tests were performed according to the standards of ERS/ATS (39,40). We derived the PaO₂ and arterial carbon dioxide pressure (PaCO₂) based on a different fraction of inhaled oxygen (FiO₂) from the blood gas analysis results, utilizing the instruments and procedures previously specified (38).

HRCT

All patients underwent HRCT on multi-detector CT systems [LightSpeed VCT/64, GE Healthcare (Chicago, IL, USA); Aquilion ONE TSX-301C/320, Toshiba (Tokyo, Japan); iCT/256, Philips (Amsterdam, Netherlands); FLASH Dual Source CT, Siemens (Erlangen, Germany)] in a single, breath-hold scan from the supine position. Acquisition and reconstruction parameterization was in adherence with the prescribed CT criteria, comprising tube voltage within the range of 100–120 kV, tube current within 100–300 mAs, and slice thickness from 0.625–1 mm. Scanning table movement was measured at 39.37 mm/s, with a gantry revolution rate of 0.8 seconds. Reconstruction was carried out progressively with up to 1–1.25 mm slice thicknesses.

Quantitative CT analysis

The FACT Medical Imaging System (http://www.dexhin.com), approved by the Food and Drug Administration (FDA) and the China Food and Drug Administration (CFDA), was used for quantitative analysis of HRCT (41,42). HRCT images in Digital Imaging and Communications in Medicine (DICOM) format were transferred to a 3-dimensional (3D) in-house AI workstation (FACT AI + digitalLung V1.0; Shenzhou Dexin Medical Imaging Technology Co., Ltd., Shanxi, China), and the lungs, pulmonary segments, pulmonary vasculature, broncho-vascular structures, GGOs, and consolidation were then automatically segmented with a DL-based algorithm. The accuracy of the segmentation was confirmed by 2 radiologists, with 9 and 15 years of experience, respectively, who were blinded to the actual diagnosis of the patients.

The lungs were divided into 3 equal vertical parts (right upper, middle, and lower parts, and left upper, and lower parts), and the lesion volume was automatically computed in each location. The percentage of the lesion based on the total lung volume was also derived. Pulmonary vasculature was determined automatically using integrated and automated techniques similar to previous protocols (43). Figure 3 shows the process of QCT analysis.

Figure 3 Quantitative CT analysis diagram. From left to right, the segmentation of lung lobe, lung interstitial lesions, and pulmonary vasculature were observed. CT, computed tomography.

Establishment and validation of ML models

The first step: To identify the features most closely related to RP-ILD diagnosis in patients with IIM-ILD for subsequent ML modeling, the least absolute shrinkage and selection operator (LASSO) was used to select variables (44). We first examined the entire dataset of QCT attributes for missing values (table available at https://cdn.amegroups.cn/static/public/qims-24-595-1.xlsx). Features with more than 5% missing values were then eliminated (table available at https://cdn.amegroups.cn/static/public/qims-24-595-2.xlsx). Missing values in the retained QCT features were replaced by mean interpolation. To ensure that the data distribution remained consistent, we compared the QCT feature distributions before and after interpolation (table available at https://cdn.amegroups.cn/static/public/qims-24-595-3.xlsx). We then retained all features with a variance greater than zero (table available at https://cdn.amegroups.cn/static/public/qims-24-595-4.xlsx). After normalizing the data, a LASSO regression model was fitted on a random 70% of the data as the training set, and the mean squared error was calculated in the test set (30%) for optimization. The lambda parameter was optimized using 5-fold cross-validation in the training set (Figure S1) and the least mean squared error in the test set. We retained the features whose LASSO coefficients were not shrunk to zero (Table S1).
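The feature-selection steps of the first step can be sketched as follows. The authors performed their analysis in R; this Python sketch uses only the standard library, and every function name is hypothetical. It assumes missing values are encoded as None, that "normalizing" means z-scoring, and that LASSO minimizes (1/2n)·||y − Xb||² + λ·||b||₁ by coordinate descent. The choice of λ by 5-fold cross-validation is not shown.

```python
import math

def preprocess(columns, max_missing=0.05):
    """Drop features with >5% missing values, mean-impute the rest,
    drop zero-variance features, and z-score what remains."""
    kept = {}
    for name, values in columns.items():
        if sum(v is None for v in values) / len(values) > max_missing:
            continue  # too many missing values
        observed = [v for v in values if v is not None]
        fill = sum(observed) / len(observed)
        filled = [fill if v is None else v for v in values]
        mean = sum(filled) / len(filled)
        var = sum((v - mean) ** 2 for v in filled) / len(filled)
        if var == 0:
            continue  # constant feature carries no information
        sd = math.sqrt(var)
        kept[name] = [(v - mean) / sd for v in filled]
    return kept

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """X is a list of standardized feature columns; y is a centered outcome."""
    n, p = len(y), len(X)
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = X[j]
            z = sum(v * v for v in col) / n
            # correlation of feature j with the partial residual
            rho = sum(col[i] * (residual[i] + col[i] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta
```

Features whose coefficient is returned as exactly 0.0 correspond to those "shrunk to zero" and would be discarded; a larger λ discards more features.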

The second step: To identify the most explainable ML model that predicts the RP-ILD diagnosis based on QCT features, 8 ML algorithms, including naïve Bayes (45), logistic regression (46), k-nearest neighbors (47), random forests (48), decision trees (49), gradient-boosting trees (50), support vector machines (51), and multilayer perceptron (specific parameters in Appendix 1) (52), were used to establish the models. For each model, the overall dataset was randomly partitioned into training and test sets in a 7:3 ratio. Next, we trained the models on the training set and utilized 5-fold cross-validation with grid search to identify the most favorable parameters. Since the test set was not involved in the training, we used the test set for further validation.
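The partitioning and tuning scheme described above (a 7:3 random split, then 5-fold cross-validation with grid search inside the training set) can be sketched in plain Python. This is a schematic illustration with hypothetical function names, not the authors' R workflow; the scoring function is supplied by the caller, and rounding the test-set size up is an assumption.

```python
import math
import random

def split_indices(n, test_fraction=0.3, seed=0):
    """Randomly partition range(n) into (train, test) index lists.
    The test-set size is rounded up (an assumption of this sketch)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = math.ceil(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold(train_idx, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

def grid_search(train_idx, param_grid, score_fn, k=5):
    """Return (best_params, best_mean_score) over the candidate settings.
    score_fn(params, fit_idx, val_idx) -> float, higher is better."""
    best_params, best_score = None, float("-inf")
    for params in param_grid:
        scores = [score_fn(params, fit, val)
                  for fit, val in k_fold(train_idx, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

The held-out test indices never enter `grid_search`, which mirrors the statement that the test set was not involved in training.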

The third step: With the selected ML model, we included both clinical data and QCT features to build the final model. We only included cases with complete clinical and HRCT information and adopted a 7:3 scheme to randomly divide the training and test sets. As both GGOs and consolidation on HRCT are often considered related to RP-ILD outcomes (53,54), we specified that the model must incorporate QCT features related to GGOs and consolidation. Similarly, DL_CO%, FEV₁%, and oxygenation index (OI) are clinical indicators known to be related to RP-ILD. The validation process was the same as that of the second step.

Statistical analysis

Our analysis was conducted using R (version 4.2.2; R Foundation for Statistical Computing, Vienna, Austria) with the default settings. For continuous characteristics, the Kolmogorov-Smirnov test was used to test the normality of the distribution. T-tests were used to compare normally distributed variables, and these data were expressed as the mean ± standard deviation (SD). Otherwise, the Mann-Whitney U test was used, and the data were expressed as the median (interquartile range). For categorical characteristics, the Chi-squared test or Fisher’s exact test was used, and these data were expressed as counts (%). To evaluate the efficiency and performance of the models, we derived confusion matrices to calculate accuracy, sensitivity, specificity, precision, F1 score, positive predictive value, negative predictive value, Jacobian index, net reclassification improvement (NRI), integrated discrimination improvement (IDI), and the area under the curve (AUC) for the receiver operating characteristic (ROC) curve in the test set. Furthermore, we plotted the ROC curve for each model. For the final model evaluation, we also used the accuracy, sensitivity, specificity, precision, F1 score, positive predictive value, negative predictive value, Jacobian index, and the AUC based on the prediction value generated from the logistic regression model. Furthermore, we compared the performance of the clinical-only model, the HRCT-only model, and the final combined model using ROC plots, NRI, and IDI. Additionally, we drew calibration curves, nomograms, and clinical decision-making curves to further evaluate the application potential of the model. A P value <0.05 was considered statistically significant.
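As an illustration of how the confusion-matrix metrics listed above relate to one another, the following Python sketch computes them from the four cell counts. The function name and the example counts are hypothetical. The values reported under "Jacobian" in Tables 2 and 4 are numerically consistent with sensitivity + specificity − 1 (Youden's J statistic), so the sketch computes that quantity; this reading is an inference from the reported numbers, not a statement by the authors.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common performance metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)           # true positive rate (recall)
    specificity = tn / (tn + fp)           # true negative rate
    ppv = tp / (tp + fp)                   # positive predictive value = precision
    npv = tn / (tn + fn)                   # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    youden_j = sensitivity + specificity - 1   # assumed meaning of "Jacobian"
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "youden_j": youden_j,
    }

# Hypothetical counts chosen only for illustration (not study data)
example = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

Because precision and positive predictive value are the same quantity, the two columns carry identical values in the performance tables.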

Results

Clinical characteristics

A total of 514 patients (367 females, median age 54 years) were included in this study, of which 249 patients were diagnosed with RP-ILD (165 females, median age 55 years). The training group consisted of 359 patients, whereas the test group consisted of 155 patients (Table S2). The final optimized model comprised 270 cases, with 189 patients in the training set and 91 patients in the test set (Figure 1). There was no evidence of selection bias regarding variables between the training and test set in terms of clinical features (Table S3).

Table 1 presents demographic data, antibody profiles, results of PFTs, and arterial blood gas analysis. Significant differences were observed in antibody profiles, pulmonary function, and arterial blood gas analysis between RP-ILD and non-RP-ILD patients. The proportion of RP-ILD patients was notably higher in the subgroup of MDA5+ DM. RP-ILD patients demonstrated impaired lung ventilation, volume, and diffusion function, as well as compromised oxygenation. These trends were consistent in both the training set (Table S5) and the test set (Table S6).

Table 1

Characteristics of included patients with idiopathic inflammatory myopathy

Characteristics	Overall	Non-RP-ILD group	RP-ILD group	P value
n	514	265	249
Subtype, n (%)				<0.001
ASS	357 (69.46)	211 (41.05)	146 (28.40)
MDA5+ DM	157 (30.54)	54 (10.51)	103 (20.04)
Female, n (%)	367 (71.40)	202 (39.30)	165 (32.10)	0.013
Age (years), median (IQR)	54.00 (46.00, 61.00)	54.00 (46.00, 60.00)	55.00 (47.50, 63.00)	0.043
BMI (kg/m²), median (IQR)	24.03 (22.16, 26.23)	23.75 (21.50, 26.25)	24.37 (22.59, 26.19)	0.102
Smoke, n (%)				0.005
Current	9 (5.08)	3 (2.63)	6 (9.52)
Former	10 (5.65)	3 (2.63)	7 (11.11)
Never	158 (89.27)	108 (94.74)	50 (79.37)
Duration of the disease (month), median (IQR)	3.67 (1.37, 14.27)	4.50 (1.17, 17.53)	3.13 (1.38, 11.73)	0.286
VC%, median (IQR)	73.20 (61.23, 85.05)	76.65 (64.775, 89.45)	68.40 (57.10, 79.00)	0.003
FVC%, median (IQR)	74.1 (62.33, 87.18)	77.95 (65.40, 91.08)	69.65 (56.33, 81.55)	<0.001
FEV₁%, median (IQR)	73.70 (60.33, 82.50)	76.10 (63.10, 86.88)	67.45 (55.925, 79.75)	0.002
FEV₁%/FVC%, median (IQR)	80.84 (76.48, 85.058)	80.55 (76.368, 84.86)	81.28 (77.02, 86.15)	<0.001
TLC%, median (IQR)	70.40 (59.00, 81.20)	73.3 (62.95, 83.25)	65.35 (54.83, 77.65)	0.002
DL_CO%, median (IQR)	57.20 (45.90, 69.60)	61.70 (51.05, 71.65)	50.55 (43.15, 60.28)	<0.001
FiO₂, median (IQR)	0.21 (0.21, 0.21)	0.21 (0.21, 0.21)	0.21 (0.21, 0.21)	0.53
PaO₂ (mmHg), median (IQR)	84.00 (74.80, 93.00)	87.60 (80.00, 95.00)	80.25 (71.43, 91.65)	0.006
PaCO₂ (mmHg), median (IQR)	37.10 (34.35, 40.30)	37.90 (35.20, 40.80)	36.45 (33.70, 39.78)	<0.001
The time from respiratory symptom onset to treatment initiation (month), median (IQR)	0.83 (0.00, 3.12)	0.82 (0.00, 4.04)	0.88 (0.00, 2.36)	0.971
Initial treatment strategy, n (%)				<0.001
Glucocorticoid	161 (38.42)	85 (40.09)	76 (36.71)
Glucocorticoid + immunosuppressant (dual)	174 (41.53)	100 (47.17)	74 (35.75)
Glucocorticoid + immunosuppressants (triple combination)	10 (2.39)	6 (2.83)	4 (1.93)
Glucocorticoid + immunoglobulin	17 (4.06)	3 (1.42)	14 (6.76)
Glucocorticoid + immunosuppressant(s) + immunoglobulin	55 (13.13)	18 (8.49)	37 (17.87)
Glucocorticoid + biologics	2 (0.48)	0 (0.00)	2 (0.97)
Initial antifibrotic therapy, n (%)	26 (6.18)	16 (7.55)	10 (4.78)	0.239

The differences between the ASS and MDA5+ DM groups are compared in Table S4. RP-ILD, rapidly progressive interstitial lung disease; ASS, antisynthetase syndrome; MDA5+ DM, anti-melanoma differentiation-associated protein 5 dermatomyositis; IQR, interquartile range; BMI, body mass index; VC, vital capacity; VC%, the proportion of actual value to the expected value for vital capacity; FVC, forced vital capacity; FVC%, the proportion of actual value to the expected value for forced vital capacity; FEV₁, forced expiratory volume in the first second; FEV₁%, the proportion of actual value to the expected value for forced expiratory volume in the first second; FEV₁/FVC, the proportion of forced expiratory volume in the first second to the forced vital capacity; TLC, total lung capacity; TLC%, the proportion of actual value to the expected value for total lung capacity; DL_CO, diffusing capacity for carbon monoxide; DL_CO%, the proportion of actual value to the expected value for diffusing capacity for carbon monoxide; FiO₂, fraction of inhaled oxygen; PaO₂, arterial oxygen pressure; PaCO₂, arterial carbon dioxide pressure.

QCT feature extraction and selection

After applying LASSO and standard deviation filtering (Figures S1,S2), 14 of the initial 972 QCT features (table available at https://cdn.amegroups.cn/static/public/qims-24-595-4.xlsx, Table S1) were selected. Table S1 provides specific coefficients for these selected features. Importantly, there was no indication of selection bias regarding variables in QCT features between the training and test sets (Tables S2,S3).

The performance of different ML models

As shown in Table 2, the logistic regression model performed comparably to most of the other models, including the random forest model, while being more interpretable (55), which makes it easier for clinicians to understand and to calculate scores for clinical application. The coefficients in a logistic regression model directly reflect the influence of each variable (e.g., DL_CO%) on the RP-ILD risk, allowing every step of the model’s decision-making process to be traced and explained. Therefore, we selected logistic regression as the final model; in the test set, it yielded an AUC of 0.752 [95% confidence interval (CI): 0.670–0.834]. This result was obtained by utilizing all 14 QCT features and employing 5-fold cross-validation (Table 2, Figure 4A). Pairwise comparisons with the alternative models are given in Tables 2,3. DeLong’s test confirmed that the logistic regression model performed better than the decision tree model (Z=2.688, P=0.007), whereas its AUC was comparable to that of the other 6 models (P>0.05). The NRI also indicated that the logistic regression model outperformed the support vector machine model (NRI =0.874, Z=5.893, P<0.001) and the multilayer perceptron model (NRI =0.246, Z=2.775, P=0.006); for the multilayer perceptron, the IDI supported this trend (IDI =0.160, Z=2.101, P=0.036).

Table 2

Performance of eight machine learning models predicting RP-ILD in the same test dataset (n=155)

Model	Sensitivity	Specificity	Accuracy	Positive predictive value	Negative predictive value	Jacobian	Precision	F1 score	AUC (95% CI)
Naïve Bayes	0.716	0.693	0.703	0.640	0.763	0.410	0.640	0.710	0.722 (0.638–0.807)
Logistic regression	0.567	0.875	0.742	0.776	0.726	0.442	0.776	0.643	0.752 (0.670–0.834)
K-nearest neighbor	0.821	0.591	0.690	0.604	0.813	0.412	0.604	0.750	0.749 (0.669–0.828)
Random forest	0.567	0.864	0.735	0.760	0.724	0.431	0.760	0.640	0.752 (0.671–0.833)
Decision tree	0.776	0.511	0.626	0.547	0.750	0.287	0.547	0.693	0.639 (0.564–0.713)
Gradient boosting tree	0.612	0.739	0.684	0.641	0.714	0.351	0.641	0.646	0.695 (0.609–0.780)
Support vector machine	0.716	0.716	0.716	0.658	0.768	0.432	0.658	0.716	0.742 (0.660–0.824)
Multilayer perceptron	0.299	0.898	0.639	0.690	0.627	0.196	0.690	0.407	0.562 (0.467–0.657)

RP-ILD, rapidly progressive interstitial lung disease; AUC, area under the curve; CI, confidence interval.

Figure 4 Performance of a series of models for predicting RP-ILD. (A) ROC of eight machine learning models; (B) ROC of logistic regression; (C-E) calibration curve of logistic regression; (F) clinical decision curve of logistic regression; (G) prognostic nomogram. TPR, true positive rate; AUC, area under the ROC curve; KNN, k-nearest neighbor; SVM, support vector machine; MLP, multilayer perceptron; FPR, false positive rate; HRCT, high-resolution computed tomography; ASS, antisynthetase syndrome; MDA5, melanoma differentiation-associated protein 5; FEV₁%, forced expiratory volume in 1 second as a percentage of the predicted value; DL_CO%, diffusing capacity of the lung for carbon monoxide as a percentage of the predicted value; OI, oxygenation index; RP-ILD, rapidly progressive interstitial lung disease; ROC, receiver operating characteristic.

Table 3

Delong test, NRI, and IDI results of eight machine learning models based on the same test dataset (n=155)

Model1	Model2	Z for Delong	P for Delong	NRI	Z for NRI	P for NRI	IDI	Z for IDI	P for IDI
Naive Bayes	Logistic regression	−0.751	0.453	−0.033	−0.467	0.640	−0.057	−0.816	0.415
Naive Bayes	K-nearest neighbor	−0.777	0.437	−0.002	−0.034	0.973	−0.007	−0.114	0.909
Naive Bayes	Random forest	−0.386	0.699	−0.021	−0.316	0.752	−0.044	−0.658	0.511
Naive Bayes	Decision tree	2.114	0.035	0.122	1.606	0.108	0.079	1.203	0.229
Naive Bayes	Gradient boosting tree	0.800	0.424	0.059	0.789	0.430	0.040	0.594	0.553
Naive Bayes	Support vector machine	−0.675	0.500	0.842	5.496	<0.001	−0.019	−0.318	0.750
Naive Bayes	Multilayer perceptron	0.911	0.362	0.213	2.244	0.025	0.103	1.505	0.132
Logistic regression	K-nearest neighbor	−0.108	0.914	0.030	0.346	0.729	0.050	0.663	0.507
Logistic regression	Random forest	0.507	0.612	0.011	0.200	0.841	0.014	0.211	0.833
Logistic regression	Decision tree	2.688	0.007	0.155	1.666	0.096	0.136	1.832	0.067
Logistic regression	Gradient boosting tree	1.601	0.109	0.092	1.282	0.200	0.098	1.395	0.163
Logistic regression	Support vector machine	0.162	0.871	0.874	5.893	<0.001	0.038	0.559	0.576
Logistic regression	Multilayer perceptron	1.444	0.149	0.246	2.775	0.006	0.160	2.101	0.036
K-nearest neighbor	Random forest	0.606	0.545	−0.019	−0.225	0.822	−0.037	−0.503	0.615
K-nearest neighbor	Decision tree	2.944	0.003	0.124	1.842	0.065	0.086	1.377	0.169
K-nearest neighbor	Gradient boosting tree	1.583	0.113	0.061	0.766	0.444	0.047	0.694	0.488
K-nearest neighbor	Support vector machine	0.260	0.795	0.844	5.559	<0.001	−0.012	−0.197	0.844
K-nearest neighbor	Multilayer perceptron	1.755	0.079	0.216	2.033	0.042	0.110	1.579	0.114
Random forest	Decision tree	2.756	0.006	0.143	1.548	0.122	0.123	1.680	0.093
Random forest	Gradient boosting tree	1.628	0.104	0.080	1.121	0.262	0.084	1.206	0.228
Random forest	Support vector machine	−0.425	0.671	0.863	5.802	<0.001	0.024	0.359	0.720
Random forest	Multilayer perceptron	1.148	0.251	0.235	2.673	0.008	0.146	1.952	0.051
Decision tree	Gradient boosting tree	−1.388	0.165	−0.063	−0.708	0.479	−0.039	−0.583	0.560
Decision tree	Support vector machine	−2.696	0.007	0.720	4.981	<0.001	−0.099	−1.516	0.130
Decision tree	Multilayer perceptron	−0.946	0.344	0.091	0.829	0.407	0.023	0.357	0.721
Gradient boosting tree	Support vector machine	−1.554	0.120	0.783	5.338	<0.001	−0.060	−0.910	0.363
Gradient boosting tree	Multilayer perceptron	0.249	0.804	0.154	1.525	0.127	0.062	0.872	0.383
Support vector machine	Multilayer perceptron	1.458	0.145	−0.629	−4.759	<0.001	0.122	1.748	0.080

NRI, net reclassification improvement; IDI, integrated discrimination improvement.
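For readers unfamiliar with the two reclassification measures, the sketch below shows one common way to compute them from the predicted probabilities of two models. The article does not state which NRI variant was used or which model of each pair was treated as the reference, so the categorical NRI with a 0.5 threshold and the old/new labeling here are assumptions; the function names are hypothetical.

```python
def idi(p_old, p_new, y):
    """Integrated discrimination improvement: the gain in discrimination
    slope (mean risk in events minus mean risk in non-events)."""
    def slope(p):
        events = [pi for pi, yi in zip(p, y) if yi == 1]
        nonevents = [pi for pi, yi in zip(p, y) if yi == 0]
        return sum(events) / len(events) - sum(nonevents) / len(nonevents)
    return slope(p_new) - slope(p_old)

def categorical_nri(p_old, p_new, y, threshold=0.5):
    """Two-category net reclassification improvement at one risk threshold."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = sum(1 for yi in y if yi == 1)
    n_nonevent = len(y) - n_event
    for po, pn, yi in zip(p_old, p_new, y):
        old_high, new_high = po >= threshold, pn >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if yi == 1:
            up_event += moved_up
            down_event += moved_down
        else:
            up_nonevent += moved_up
            down_nonevent += moved_down
    # events should move up, non-events should move down
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)
```

Positive values of either measure indicate that the "new" model assigns risks that separate events from non-events better than the "old" model.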

The final model and nomogram for clinical application

The final model was established using stepwise logistic regression, with selection based on the minimum Akaike Information Criterion (AIC). It included a series of QCT features and clinical factors (Figure 4B); its standardized coefficients and intercept are given in the following formula, and the odds ratio of each variable is shown in Figure S3:

$\begin{array}{l} \text{Score} = 3.01 + 1.57 \cdot X_{\text{subtype}} + 0.57 \cdot Y_{\text{gender}} \\ - 0.02 \cdot \text{FEV}_{1}\% - 0.01 \cdot \text{DL}_{\text{CO}}\% \\ - 0.49 \cdot \text{PulmonaryVascularLowerLeftLobeMaximumDensity} \\ + 0.81 \cdot \text{GGOsLeftUpperLobeVolume} \\ + 0.16 \cdot \text{ConsolidationRightUpperLobeMeanDensity} \\ + 0.71 \cdot \text{BranchesVolume} - 0.01 \cdot \text{OI} \end{array}$ [1]

where X_subtype is 0 for ASS and 1 for MDA5+ DM, and Y_gender is 0 for female and 1 for male.
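Equation [1] can be evaluated with a few lines of Python, shown here as a hedged sketch. The dictionary keys are hypothetical names for the variables in the formula. Because the coefficients are described as standardized, the inputs are assumed to be on the same scale the authors used when fitting; the article does not give the scaling constants. Converting the score to a probability with the logistic function is the usual reading of a logistic regression score and is likewise an assumption here.

```python
import math

INTERCEPT = 3.01
COEFFICIENTS = {
    "subtype": 1.57,                          # 0 = ASS, 1 = MDA5+ DM
    "gender": 0.57,                           # 0 = female, 1 = male
    "fev1_pct": -0.02,                        # FEV1%
    "dlco_pct": -0.01,                        # DL_CO%
    "pv_lll_max_density": -0.49,              # pulmonary vascular, lower left lobe
    "ggo_lul_volume": 0.81,                   # GGOs, left upper lobe
    "consolidation_rul_mean_density": 0.16,   # consolidation, right upper lobe
    "branches_volume": 0.71,
    "oi": -0.01,                              # oxygenation index
}

def linear_score(features):
    """Evaluate Equation [1] for one patient (inputs on the model's scale)."""
    return INTERCEPT + sum(COEFFICIENTS[name] * features[name]
                           for name in COEFFICIENTS)

def logistic(score):
    """Map a score to a value in (0, 1); assumes the score is a logit."""
    return 1.0 / (1.0 + math.exp(-score))
```

The signs show the direction of each association: MDA5+ DM subtype and male gender raise the score, whereas higher FEV₁%, DL_CO%, and OI lower it.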

The “pulmonary vascular lower left lobe maximum density” refers to the maximum density of the pulmonary vessels in the lower left lobe (Hounsfield units; HU). “GGOs” can be either localized or diffuse and have a hazy appearance resembling ground glass; despite this opacity, the internal blood vessels and bronchi remain visible. The “GGOs left upper lobe volume” refers to the volume of the GGOs in the left upper lobe (mL). In “consolidation”, the density of the lesion is higher and the lung texture in the lesion area is obscured. The “consolidation right upper lobe mean density” refers to the mean density of the consolidation in the right upper lobe (HU). The “branches volume” refers to the volume of the bronchus in the whole lung (mL).

For the final model that included clinical information (Tables 4,5, Figure 4B), the test set had an AUC of 0.882 (95% CI: 0.797–0.967), which was superior to both the HRCT-based model (AUC: 0.658, 95% CI: 0.536–0.781) and the clinic model (AUC: 0.797, 95% CI: 0.698–0.896). DeLong’s test verified that the integrated model was better than the HRCT (Z=3.225, P=0.001) and clinic models (Z=2.241, P=0.025). The NRI and IDI indexes were consistent with the AUC and DeLong’s test results. The NRI indexes for the final model versus the QCT-only model and versus the clinical-only model were 0.384 (Z=3.171, P=0.002) and 0.212 (Z=2.090, P=0.037), respectively. Similarly, the IDI indexes were 0.380 (Z=3.234, P=0.001) and 0.252 (Z=2.200, P=0.028), respectively. In calibration and clinical decision curve analysis, the final model showed high prediction performance (concordance index: 0.887, 95% CI: 0.800–0.974, P<0.001) and more net benefit than the HRCT-only or clinical-only models at most thresholds (Figure 4C-4F, table available at https://cdn.amegroups.cn/static/public/qims-24-595-5.xlsx).
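The net benefit plotted in a decision curve can be illustrated with its standard definition: the true-positive rate among all patients minus the false-positive rate weighted by the odds of the threshold probability. The sketch below is a hedged illustration only; the function names are hypothetical and the counts are hypothetical values for a test set of 80 patients, not figures reported in the article.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(n_events, n, threshold):
    """Reference strategy: treat every patient as high risk."""
    return net_benefit(n_events, n - n_events, n, threshold)


# Hypothetical counts (not reported in the article): 80 patients, 36 events,
# a model that flags 27 true positives and 2 false positives.
for pt in (0.2, 0.5):
    model_nb = net_benefit(27, 2, 80, pt)
    all_nb = net_benefit_treat_all(36, 80, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.4f}  treat-all={all_nb:.4f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference).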

Table 4

Performance of final models predicting RP-ILD in the same test dataset (n=80)

Model	Sensitivity	Specificity	Accuracy	Positive predictive value	Negative predictive value	Youden index	Precision	F1 score	AUC (95% CI)
All	0.750	0.955	0.863	0.931	0.824	0.705	0.931	0.802	0.882 (0.797–0.967)
HRCT	0.389	0.932	0.688	0.824	0.651	0.321	0.824	0.497	0.658 (0.536–0.781)
Clinic	0.583	0.909	0.763	0.840	0.727	0.492	0.840	0.661	0.797 (0.698–0.896)

RP-ILD, rapid progressive interstitial lung disease; AUC, area under the receiver operating characteristic curve; CI, confidence interval; HRCT, high-resolution computed tomography.
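In Table 4, the column that follows the negative predictive value matches sensitivity + specificity − 1 (Youden's J statistic) for all three models. The short check below recomputes it from the reported sensitivity and specificity; the variable and function names are hypothetical.

```python
# Sensitivity and specificity as reported in Table 4.
table4 = {
    "All": (0.750, 0.955),
    "HRCT": (0.389, 0.932),
    "Clinic": (0.583, 0.909),
}


def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


youden = {model: round(youden_index(se, sp), 3) for model, (se, sp) in table4.items()}
print(youden)
```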

Table 5

Delong test, NRI, and IDI results of final models based on the same test dataset (n=80)

Model1	Model2	Z for Delong	P for Delong	NRI	Z for NRI	P for NRI	IDI	Z for IDI	P for IDI
All	HRCT	3.225	0.001	0.384	3.171	0.002	0.380	3.234	0.001
All	Clinic	2.241	0.025	0.212	2.090	0.037	0.252	2.200	0.028
HRCT	Clinic	1.549	0.122	−0.172	−1.124	0.261	−0.127	−0.988	0.323

NRI, net reclassification improvement; IDI, integrated discrimination improvement; HRCT, high-resolution computed tomography.
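The P values in Table 5 appear consistent with two-sided tests of the Z statistics against the standard normal distribution. As a minimal sketch under that assumption (the function name is hypothetical), a Z value can be converted to a two-sided P value with the complementary error function:

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics for the comparison of the combined model with the clinic model (Table 5).
for label, z in (("DeLong", 2.241), ("NRI", 2.090), ("IDI", 2.200)):
    print(f"{label}: Z={z:.3f}, P={two_sided_p(z):.3f}")
```

Small differences from the tabulated P values can arise because the Z statistics are themselves rounded to three decimals.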

To use this final model, the subtype of IIM, gender, FEV₁%, DL_CO%, and OI of a specific patient are required, together with the patient’s scaled QCT features. To make the computational process of the ML model intuitive, the subtype, gender, FEV₁%, DL_CO%, OI, and QCT features including pulmonary vascular, GGOs, consolidation, and branches volume were incorporated into the nomogram (Figure 4G). A score can be obtained either from the formula provided or from the nomogram in Figure 4G to predict the probability of RP-ILD. In general, a higher score from the final model was associated with a higher probability of RP-ILD.

Here is an example of using the nomogram. A female patient was admitted to hospital; laboratory tests confirmed a positive anti-MDA5 antibody, and PFTs showed decreased FEV₁% (56.0%) with impaired DL_CO% (45.0%) but relatively normal OI (373.8). Using the commercial platform mentioned before, followed by normalization, the maximum density of the pulmonary vascular in the lower left lobe was −0.17, the GGOs volume in the left upper lobe was −0.03, the consolidation density in the right upper lobe was 0.57, and the branches volume was −0.25. With the nomogram, summing the points derived from each of the variables above gives a total of about 120 points and a corresponding RP-ILD risk of 80%. In reality, this patient was confirmed to have RP-ILD within 3 months.

Discussion

Early prediction of RP-ILD is of great significance for reducing adverse prognosis and better weighing the benefits and risks of treatment. In this study, we firstly developed a logistic regression model to predict RP in patients with IIM-ILD predicated upon QCT and clinical features. Our research showed that pulmonary vasculature, pulmonary segmental volume, broncho-vascular structures, GGOs, and consolidation image features could predict RP-ILD at an early stage, which was consistent with previous research (7,56-58).

Previous studies have shown that vascular-related structures (VRS) are the most significant independent predictor of mortality in various connective tissue disease-related interstitial lung diseases (CTD-ILD) (59-61). Neovascularization, fine perivascular pulmonary fibrosis, and pulmonary hypertension in ILD may contribute to the increased pulmonary vascular-related structures volume (62). It has also been suggested that abnormal angiogenesis may be one of the initiating factors of ILD by triggering fibrotic repair in the lung. The existing literature has demonstrated a causal pathophysiological mechanism between pulmonary interstitial fibrosis and pulmonary vessels (62-68).

Diffuse multiple GGOs and consolidation can be seen in RP-ILD, often distributed around the vessels of the inferior lobular bronchi (8,69). Superimposed GGOs or frank consolidations may be observed during acute presentations or exacerbations of pre-existing disease (9,10,70,71). Based on QCT, Xu et al. discovered a significant association between GGOs and consolidation with 6-month mortality in MDA5+ DM patients (18). Ungprasert et al. discovered that there were notable inverse associations between GGOs and DL_CO as well as TLC when using CALIPER software (72). This suggested that the GGOs on HRCT could be linked to restricted ventilation function and diffusion function in patients.

With the development of AI in the field of medical imaging, DL algorithms can now be used to automatically identify and segment various structures and abnormalities in CT. By analyzing a large number of medical image data with AI, we can discover many features that are difficult to be identified by the naked eye or directly evaluated by clinicians. The lungs are partitioned into small subregions, often represented as voxels, which enable the identification of diagnostic patterns within these regions. However, there are some differences between this voxel-based analysis and the clinician’s evaluation method for CT, which makes our results seem uninterpretable from a clinical perspective. Therefore, on the basis of fully exploring a large number of quantitative CT parameters, we further included clinical features to establish a comprehensive and objective prediction model.

PFTs in patients with IIM-ILD mainly show ventilation dysfunction and decreased diffusion function (8,73,74), and our study found that FEV₁% was a predictor of RP-ILD. Baseline data indicated that most individuals did not have an obstructive FEV₁/FVC ratio, so the reduced FEV₁% may simply be a consequence of restriction. Wells et al. (75) presented their composite physiologic index (CPI) as a tool to assess the morphologic extent of pulmonary fibrosis in IPF on CT scans. CPI, which also incorporates FEV₁, has emerged as a robust prognostic indicator for mortality. FVC is a well-known important predictor, but it was not selected by our modeling process, and adding FVC to the model did not improve the predictive accuracy. Arterial blood gas analysis is a good indicator of the physiological condition of the lung; it can complement PFTs and is especially suitable for patients whose disease has worsened and who cannot cooperate with PFTs.
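How FEV₁ enters the CPI can be made concrete with the formula as it is commonly quoted from Wells et al. (75): CPI = 91.0 − 0.65 × DL_CO% − 0.53 × FVC% + 0.34 × FEV₁%, with all three inputs expressed as percent predicted. The sketch below is a hedged illustration; the function name and the input values are hypothetical, and the coefficients should be checked against the original publication before any use.

```python
def composite_physiologic_index(dlco_pct, fvc_pct, fev1_pct):
    """CPI as commonly quoted from Wells et al.; inputs are percent-predicted values."""
    return 91.0 - 0.65 * dlco_pct - 0.53 * fvc_pct + 0.34 * fev1_pct


# Hypothetical inputs, for illustration only.
cpi = composite_physiologic_index(dlco_pct=50.0, fvc_pct=70.0, fev1_pct=60.0)
print(f"CPI = {cpi:.1f}")
```

Note that FEV₁ carries a positive coefficient while DL_CO and FVC carry negative ones, so for fixed DL_CO and FVC a lower FEV₁ lowers the index.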

Recent research has demonstrated that individuals diagnosed with MDA5+ DM respond poorly to treatment and have a poor prognosis (74,76). The incidence of ILD in patients with MDA5+ DM exceeded 90% (76), and these patients were prone to RP-ILD. Nakashima et al. (77) found that 46% of MDA5+ DM patients died of respiratory failure within 6 months of symptom onset. Patients with MDA5+ DM associated with RP-ILD exhibited poor responses to combined glucocorticoid-immunosuppressive therapy and a high mortality rate (78,79).

To the best of our knowledge, this was the first ML model with nomograms to explore the diagnosis of RP-ILD in patients with ASS and MDA5+ DM. Previous investigations have found that ML prediction models for IIM disease diagnosis, treatment response, and complications can be established based on clinical manifestations and features (80-83); they can also classify IIM by identifying the relationship between different clinical features and clinical subtypes (83-85). However, no studies have established a predictive model for IIM-ILD from the radiological perspective, and most previous studies have ignored the feasibility of applying the model to clinical practice. We quantified the image lesions using the QCT technique, combined the interpretable ML model with a nomogram, revealed the underlying correlation between features and disease behavior, and established a comprehensive evaluation model that can be applied to clinical practice. Compared with previous studies, our nomogram was established based on a larger patient population and includes demographic, physiological, antibody, and imaging information, which makes it suitable for patients with ASS and MDA5+ DM. These advantages provide a novel insight into RP-ILD among patients with ASS or MDA5+ DM.

Despite these advantages, our study is subject to certain limitations. First, the overall disease activity of IIM in clinical practice is mainly assessed using the Disease Activity Score (DAS) scale (86), which is primarily used to evaluate muscle and skin involvement, and the Myositis Disease Activity Assessment Tool (MDAAT) scale, which is mainly used to assess involvement of organs other than muscles (87). There is currently no separate evaluation system for IIM-ILD. In terms of ILD prognosis assessment, most focus has been on IPF, such as the previously mentioned CPI (75), but it is similar to the gender-age-physiology (GAP) models (31) and mainly focuses on lung function. Therefore, there is currently no comparable gold standard scoring system in this field. Second, our study focused on the evaluation of IIM-ILD; disease activity, the severity of the affected organs, and additional risk factors such as hypolymphocytemia, elevated ferritin, KL-6, and C-reactive protein were not included, so the overall state of IIM is not fully presented. Third, our study primarily focused on IIM-ILD images, so the patients we included were predominantly of subtypes closely associated with ILD. In addition to ASS and MDA5+ DM, other types of DM, immune-mediated necrotizing myopathy (IMNM), and isolated Ro52 antibody-positive IIM were also significantly linked to ILD. However, because retrospective screening identified only limited numbers (ranging from 0 to single digits) of patients with ILD in these other clinical subtypes, potential bias could easily arise; hence they were not included. Fourth, patients with incomplete outcome data or clinical details were not included in our study, which may have led to selection bias and inadequate representation of severe cases. Also, we only included inpatients to train the final model, so whether the model is applicable to outpatients remains to be studied. Last but not least, the sample size of the internal test cohort in our single-center study is limited; thus, the model requires further validation through larger multi-center studies and external patient cohorts.

Conclusions

The baseline QCT features based on the DL algorithm can predict rapid progression in patients with IIM-ILD. The logistic regression model built with combined clinical and QCT features is superior to the model based on clinical features alone or on HRCT features alone.

Acknowledgments

Funding: This study was supported by the National Key Technologies R & D Program Precision Medicine Research (Nos. 2021YFC2500700 and 2016YFC0901101), and the National Natural Science Foundation of China (No. 81870056).

Footnote

Reporting Checklist: The authors have completed the TRIPOD+AI reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-595/rc

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-595/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Board of China-Japan Friendship Hospital (No. 2017-25) and the requirement for informed consent was waived for this retrospective study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

Mathai SC, Danoff SK. Management of interstitial lung disease associated with connective tissue disease. BMJ 2016;352:h6819. [Crossref] [PubMed]
Ye S, Chen XX, Lu XY, Wu MF, Deng Y, Huang WQ, Guo Q, Yang CD, Gu YY, Bao CD, Chen SL. Adult clinically amyopathic dermatomyositis with rapid progressive interstitial lung disease: a retrospective cohort study. Clin Rheumatol 2007;26:1647-54. [Crossref] [PubMed]
Xu L, You H, Wang L, Lv C, Yuan F, Li J, et al. Identification of Three Different Phenotypes in Anti-Melanoma Differentiation-Associated Gene 5 Antibody-Positive Dermatomyositis Patients: Implications for Prediction of Rapidly Progressive Interstitial Lung Disease. Arthritis Rheumatol 2023;75:609-19. [Crossref] [PubMed]
Kobayashi N, Takezaki S, Kobayashi I, Iwata N, Mori M, Nagai K, et al. Clinical and laboratory features of fatal rapidly progressive interstitial lung disease associated with juvenile dermatomyositis. Rheumatology (Oxford) 2015;54:784-91. [Crossref] [PubMed]
Xu Y, Yang CS, Li YJ, Liu XD, Wang JN, Zhao Q, Xiao WG, Yang PT. Predictive factors of rapidly progressive-interstitial lung disease in patients with clinically amyopathic dermatomyositis. Clin Rheumatol 2016;35:113-6. [Crossref] [PubMed]
Lega JC, Reynaud Q, Belot A, Fabien N, Durieu I, Cottin V. Idiopathic inflammatory myopathies and the lung. Eur Respir Rev 2015;24:216-38. [Crossref] [PubMed]
Morisset J, Johnson C, Rich E, Collard HR, Lee JS. Management of Myositis-Related Interstitial Lung Disease. Chest 2016;150:1118-28. [Crossref] [PubMed]
Tillie-Leblond I, Wislez M, Valeyre D, Crestani B, Rabbat A, Israel-Biet D, Humbert M, Couderc LJ, Wallaert B, Cadranel J. Interstitial lung disease and anti-Jo-1 antibodies: difference between acute and gradual onset. Thorax 2008;63:53-9. [Crossref] [PubMed]
Won Huh J, Soon Kim D, Keun Lee C, Yoo B, Bum Seo J, Kitaichi M, Colby TV. Two distinct clinical types of interstitial lung disease associated with polymyositis-dermatomyositis. Respir Med 2007;101:1761-9. [Crossref] [PubMed]
Koreeda Y, Higashimoto I, Yamamoto M, Takahashi M, Kaji K, Fujimoto M, Kuwana M, Fukuda Y. Clinical and pathological findings of interstitial lung disease patients with anti-aminoacyl-tRNA synthetase autoantibodies. Intern Med 2010;49:361-9. [Crossref] [PubMed]
Barba T, Fort R, Cottin V, Provencher S, Durieu I, Jardel S, Hot A, Reynaud Q, Lega JC. Treatment of idiopathic inflammatory myositis associated interstitial lung disease: A systematic review and meta-analysis. Autoimmun Rev 2019;18:113-22. [Crossref] [PubMed]
Fujisawa T. Management of Myositis-Associated Interstitial Lung Disease. Medicina (Kaunas) 2021;57:347. [Crossref] [PubMed]
Walsh SL, Calandriello L, Sverzellati N, Wells AU, Hansell DM; UIP Observer Consort. Interobserver agreement for the ATS/ERS/JRS/ALAT criteria for a UIP pattern on CT. Thorax 2016;71:45-51. [Crossref] [PubMed]
Jacob J, Bartholmai BJ, Rajagopalan S, Kokosi M, Nair A, Karwoski R, Walsh SL, Wells AU, Hansell DM. Mortality prediction in idiopathic pulmonary fibrosis: evaluation of computer-based CT analysis with conventional severity measures. Eur Respir J 2017;49:1601011. [Crossref] [PubMed]
Clukers J, Lanclus M, Mignot B, Van Holsbeke C, Roseman J, Porter S, Gorina E, Kouchakji E, Lipson KE, De Backer W, De Backer J. Quantitative CT analysis using functional imaging is superior in describing disease progression in idiopathic pulmonary fibrosis compared to forced vital capacity. Respir Res 2018;19:213. [Crossref] [PubMed]
Xu W, Wu W, Zheng Y, Chen Z, Tao X, Zhang D, Zhao J, Wang K, Guo B, Luo Q, Han Q, Zhou Y, Ye S. A Computed Tomography Radiomics-Based Prediction Model on Interstitial Lung Disease in Anti-MDA5-Positive Dermatomyositis. Front Med (Lausanne) 2021;8:768052. [Crossref] [PubMed]
Zhang Y, Chen Z, Long Y, Zhang B, He Q, Tang K, Zhang X. 18F-FDG PET/CT and HRCT: a combined tool for risk stratification in idiopathic inflammatory myopathy-associated interstitial lung disease. Clin Rheumatol 2022;41:3095-105. [Crossref] [PubMed]
Xu W, Wu W, Zhang D, Chen Z, Tao X, Zhao J, Wang K, Wang X, Zheng Y, Ye S. A novel CT scoring method predicts the prognosis of interstitial lung disease associated with anti-MDA5 positive dermatomyositis. Sci Rep 2021;11:17070. [Crossref] [PubMed]
Haishuang S, Xiaoyan Y, Min L, Huaping D, Chen W. Advances in the Application of Artificial intelligence Technology in the Evaluation of Interstitial Lung Disease. Chinese Journal of Medical Imaging 2022;30:509-13.
Walsh SLF, Humphries SM, Wells AU, Brown KK. Imaging research in fibrotic lung disease; applying deep learning to unsolved problems. Lancet Respir Med 2020;8:1144-53. [Crossref] [PubMed]
Park B, Park H, Lee SM, Seo JB, Kim N. Lung Segmentation on HRCT and Volumetric CT for Diffuse Interstitial Lung Disease Using Deep Convolutional Neural Networks. J Digit Imaging 2019;32:1019-26. [Crossref] [PubMed]
Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S. Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network. IEEE Trans Med Imaging 2016;35:1207-16. [Crossref] [PubMed]
Agarwala S, Kale M, Kumar D, Swaroop R, Kumar A, Kumar Dhara A, Basu Thakur S, Sadhu A, Nandi D. Deep learning for screening of interstitial lung disease patterns in high-resolution CT images. Clin Radiol 2020;75:481.e1-8. [Crossref] [PubMed]
Kim GB, Jung KH, Lee Y, Kim HJ, Kim N, Jun S, Seo JB, Lynch DA. Comparison of Shallow and Deep Learning Methods on Classifying the Regional Pattern of Diffuse Lung Disease. J Digit Imaging 2018;31:415-24. [Crossref] [PubMed]
Wang Q, Zheng Y, Yang G, Jin W, Chen X, Yin Y. Multiscale Rotation-Invariant Convolutional Neural Networks for Lung Texture Classification. IEEE J Biomed Health Inform 2018;22:184-95. [Crossref] [PubMed]
Bermejo-Peláez D, Ash SY, Washko GR, San José Estépar R, Ledesma-Carbayo MJ. Classification of Interstitial Lung Abnormality Patterns with an Ensemble of Deep Convolutional Neural Networks. Sci Rep 2020;10:338. [Crossref] [PubMed]
Walsh SLF, Calandriello L, Silva M, Sverzellati N. Deep learning for classifying fibrotic lung disease on high-resolution computed tomography: a case-cohort study. Lancet Respir Med 2018;6:837-45. [Crossref] [PubMed]
Zhang L, Rong R, Li Q, Yang DM, Yao B, Luo D, Zhang X, Zhu X, Luo J, Liu Y, Yang X, Ji X, Liu Z, Xie Y, Sha Y, Li Z, Xiao G. A deep learning-based model for screening and staging pneumoconiosis. Sci Rep 2021;11:2201. [Crossref] [PubMed]
Aliboni L, Dias OM, Pennati F, Baldi BG, Sawamura MVY, Chate RC, Carvalho CRR, de Albuquerque ALP, Aliverti A. Quantitative CT Analysis in Chronic Hypersensitivity Pneumonitis: A Convolutional Neural Network Approach. Acad Radiol 2022;29:S31-40. [Crossref] [PubMed]
Danieli MG, Paladini A, Longhi E, Tonacci A, Gangemi S. A machine learning analysis to evaluate the outcome measures in inflammatory myopathies. Autoimmun Rev 2023;22:103353. [Crossref] [PubMed]
Ley B, Ryerson CJ, Vittinghoff E, Ryu JH, Tomassetti S, Lee JS, Poletti V, Buccioli M, Elicker BM, Jones KD, King TE Jr, Collard HR. A multidimensional index and staging system for idiopathic pulmonary fibrosis. Ann Intern Med 2012;156:684-91. [Crossref] [PubMed]
Travis WD, Costabel U, Hansell DM, King TE Jr, Lynch DA, Nicholson AG, et al. An official American Thoracic Society/European Respiratory Society statement: Update of the international multidisciplinary classification of the idiopathic interstitial pneumonias. Am J Respir Crit Care Med 2013;188:733-48. [Crossref] [PubMed]
Raghu G, Remy-Jardin M, Richeldi L, Thomson CC, Inoue Y, Johkoh T, et al. Idiopathic Pulmonary Fibrosis (an Update) and Progressive Pulmonary Fibrosis in Adults: An Official ATS/ERS/JRS/ALAT Clinical Practice Guideline. Am J Respir Crit Care Med 2022;205:e18-47. [Crossref] [PubMed]
Mammen AL, Allenbach Y, Stenzel W, Benveniste O; ENMC 239th Workshop Study Group. 239th ENMC International Workshop: Classification of dermatomyositis, Amsterdam, the Netherlands, 14-16 December 2018. Neuromuscul Disord 2020;30:70-92. [Crossref] [PubMed]
Bohan A, Peter JB. Polymyositis and dermatomyositis (first of two parts). N Engl J Med 1975;292:344-7. [Crossref] [PubMed]
Cavagna L, Trallero-Araguás E, Meloni F, Cavazzana I, Rojas-Serrano J, Feist E, et al. Influence of Antisynthetase Antibodies Specificities on Antisynthetase Syndrome Clinical Spectrum Time Course. J Clin Med 2019;8:2013. [Crossref] [PubMed]
Raghu G, Collard HR, Egan JJ, Martinez FJ, Behr J, Brown KK, et al. An official ATS/ERS/JRS/ALAT statement: idiopathic pulmonary fibrosis: evidence-based guidelines for diagnosis and management. Am J Respir Crit Care Med 2011;183:788-824. [Crossref] [PubMed]
Zhang XL, Zhang L, Li YM, Xiang BY, Han T, Wang Y, Wang C. Multidimensional assessment and cluster analysis for OSA phenotyping. J Clin Sleep Med 2022;18:1779-88. [Crossref] [PubMed]
Stanojevic S, Kaminsky DA, Miller MR, Thompson B, Aliverti A, Barjaktarevic I, Cooper BG, Culver B, Derom E, Hall GL, Hallstrand TS, Leuppi JD, MacIntyre N, McCormack M, Rosenfeld M, Swenson ER. ERS/ATS technical standard on interpretive strategies for routine lung function tests. Eur Respir J 2022;60:2101499. [Crossref] [PubMed]
Miller MR, Hankinson J, Brusasco V, Burgos F, Casaburi R, Coates A, Crapo R, Enright P, van der Grinten CP, Gustafsson P, Jensen R, Johnson DC, MacIntyre N, McKay R, Navajas D, Pedersen OF, Pellegrino R, Viegi G, Wanger J; ATS/ERS Task Force. Standardisation of spirometry. Eur Respir J 2005;26:319-38. [Crossref] [PubMed]
Yu H, Yang Z, Wei Y, Shi W, Zhu M, Liu L, Wang M, Wang Y, Zhu Q, Liang Z, Zhao W, Chen LA. Computed tomography-based radiomics improves non-invasive diagnosis of Pneumocystis jirovecii pneumonia in non-HIV patients: a retrospective study. BMC Pulm Med 2024;24:11. [Crossref] [PubMed]
Yu F, Peng M, Bai J, Zhu X, Zhang B, Tang J, et al. Comprehensive characterization of genomic and radiologic features reveals distinct driver patterns of RTK/RAS pathway in ground-glass opacity pulmonary nodules. Int J Cancer 2022;151:2020-30. [Crossref] [PubMed]
Sun X, Meng X, Zhang P, Wang L, Ren Y, Xu G, Yang T, Liu M. Quantification of pulmonary vessel volumes on low-dose computed tomography in a healthy male Chinese population: the effects of aging and smoking. Quant Imaging Med Surg 2022;12:406-16. [Crossref] [PubMed]
Li L, Wang L, Zeng F, Peng G, Ke Z, Liu H, Zha Y. Development and multicenter validation of a CT-based radiomics signature for predicting severe COVID-19 pneumonia. Eur Radiol 2021;31:7901-12. [Crossref] [PubMed]
Rish I. An empirical study of the naive Bayes classifier. IJCAI 2001 workshop on empirical methods in artificial intelligence, 2001;3:41-6.
Gierada DS, Guniganti P, Newman BJ, Dransfield MT, Kvale PA, Lynch DA, Pilgram TK. Quantitative CT assessment of emphysema and airways in relation to lung cancer risk. Radiology 2011;261:950-9. [Crossref] [PubMed]
Gawlitza J, Sturm T, Spohrer K, Henzler T, Akin I, Schönberg S, Borggrefe M, Haubenreisser H, Trinkmann F. Predicting Pulmonary Function Testing from Quantified Computed Tomography Using Machine Learning Algorithms in Patients with COPD. Diagnostics (Basel) 2019;9:33. [Crossref] [PubMed]
Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 2014;15:3133-81.
Chen W, Xie X, Wang J, Pradhan B, Hong H, Bui DT, et al. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 2017;151:147-60.
Babajide Mustapha I, Saeed F. Bioactive Molecule Prediction Using Extreme Gradient Boosting. Molecules 2016;21:983. [Crossref] [PubMed]
Moslemi A, Kontogianni K, Brock J, Wood S, Herth F, Kirby M. Differentiating COPD and asthma using quantitative CT imaging and machine learning. Eur Respir J 2022;60:2103078. [Crossref] [PubMed]
Nagaraj Y, de Jonge G, Andreychenko A, Presti G, Fink MA, Pavlov N, Quattrocchi CC, Morozov S, Veldhuis R, Oudkerk M, van Ooijen PMA. Facilitating standardized COVID-19 suspicion prediction based on computed tomography radiomics in a multi-demographic setting. Eur Radiol 2022;32:6384-96. [Crossref] [PubMed]
Chino H, Sekine A, Baba T, Kitamura H, Iwasawa T, Okudela K, Takemura T, Itoh H, Sato S, Suzuki Y, Ogura T. Interstitial Lung Disease with Anti-melanoma Differentiation-associated Protein 5 Antibody: Rapidly Progressive Perilobular Opacity. Intern Med 2019;58:2605-13. [Crossref] [PubMed]
Zuo Y, Ye L, Liu M, Li S, Liu W, Chen F, Lu X, Gordon P, Wang G, Shu X. Clinical significance of radiological patterns of HRCT and their association with macrophage activation in dermatomyositis. Rheumatology (Oxford) 2020;59:2829-37. [Crossref] [PubMed]
Shmueli G. To Explain or to Predict? Statistical Science 2010;25:289-310.
Yu KH, Wu YJ, Kuo CF, See LC, Shen YM, Chang HC, Luo SF, Ho HH, Chen IJ. Survival analysis of patients with dermatomyositis and polymyositis: analysis of 192 Chinese cases. Clin Rheumatol 2011;30:1595-601. [Crossref] [PubMed]
Sugiyama Y, Yoshimi R, Tamura M, Takeno M, Kunishita Y, Kishimoto D, et al. The predictive prognostic factors for polymyositis/dermatomyositis-associated interstitial lung disease. Arthritis Res Ther 2018;20:7. [Crossref] [PubMed]
Fujisawa T, Hozumi H, Kono M, Enomoto N, Hashimoto D, Nakamura Y, Inui N, Yokomura K, Koshimizu N, Toyoshima M, Shirai T, Yasuda K, Hayakawa H, Suda T. Prognostic factors for myositis-associated interstitial lung disease. PLoS One 2014;9:e98824. [Crossref] [PubMed]
Chung JH, Adegunsoye A, Oldham JM, Vij R, Husain A, Montner SM, Karwoski RA, Bartholmai BJ, Strek ME. Vessel-related structures predict UIP pathology in those with a non-IPF pattern on CT. Eur Radiol 2021;31:7295-302. [Crossref] [PubMed]
Sadeghi S, Granton JT, Akhavan P, Pasarikovski CR, Roos AM, Thenganatt J, Moric J, Johnson SR. Survival in rheumatoid arthritis-associated pulmonary arterial hypertension compared with idiopathic pulmonary arterial hypertension. Respirology 2015;20:481-7. [Crossref] [PubMed]
Takahashi K, Taniguchi H, Ando M, Sakamoto K, Kondoh Y, Watanabe N, Kimura T, Kataoka K, Suzuki A, Ito S, Hasegawa Y. Mean pulmonary arterial pressure as a prognostic indicator in connective tissue disease associated with interstitial lung disease: a retrospective cohort study. BMC Pulm Med 2016;16:55. [Crossref] [PubMed]
Chung JH, Adegunsoye A, Cannon B, Vij R, Oldham JM, King C, Montner SM, Thirkateh P, Barnett S, Karwoski R, Bartholmai BJ, Strek M, Nathan SD. Differentiation of Idiopathic Pulmonary Fibrosis from Connective Tissue Disease-Related Interstitial Lung Disease Using Quantitative Imaging. J Clin Med 2021;10:2663. [Crossref] [PubMed]
Renzoni EA, Walsh DA, Salmon M, Wells AU, Sestini P, Nicholson AG, Veeraraghavan S, Bishop AE, Romanska HM, Pantelidis P, Black CM, Du Bois RM. Interstitial vascularity in fibrosing alveolitis. Am J Respir Crit Care Med 2003;167:438-43. [Crossref] [PubMed]
Leach HG, Chrobak I, Han R, Trojanowska M. Endothelial cells recruit macrophages and contribute to a fibrotic milieu in bleomycin lung injury. Am J Respir Cell Mol Biol 2013;49:1093-101. [Crossref] [PubMed]
Baruah J, Wary KK. Exosomes in the Regulation of Vascular Endothelial Cell Regeneration. Front Cell Dev Biol 2019;7:353. [Crossref] [PubMed]
Wu X, Gao Y, Xu L, Dang W, Yan H, Zou D, Zhu Z, Luo L, Tian N, Wang X, Tong Y, Han Z. Exosomes from high glucose-treated glomerular endothelial cells trigger the epithelial-mesenchymal transition and dysfunction of podocytes. Sci Rep 2017;7:9371. [Crossref] [PubMed]
Muñoz-Espín D, Serrano M. Cellular senescence: from physiology to pathology. Nat Rev Mol Cell Biol 2014;15:482-96. [Crossref] [PubMed]
Xiong J, Kawagishi H, Yan Y, Liu J, Wells QS, Edmunds LR, Fergusson MM, Yu ZX, Rovira II, Brittain EL, Wolfgang MJ, Jurczak MJ, Fessel JP, Finkel T. A Metabolic Basis for Endothelial-to-Mesenchymal Transition. Mol Cell 2018;69:689-698.e7. [Crossref] [PubMed]
Tanizawa K, Handa T, Nakashima R, Kubo T, Hosono Y, Watanabe K, Aihara K, Oga T, Chin K, Nagai S, Mimori T, Mishima M. HRCT features of interstitial lung disease in dermatomyositis with anti-CADM-140 antibody. Respir Med 2011;105:1380-7. [Crossref] [PubMed]
Hayashi S, Tanaka M, Kobayashi H, Nakazono T, Satoh T, Fukuno Y, Aragane N, Tada Y, Koarada S, Ohta A, Nagasawa K. High-resolution computed tomography characterization of interstitial lung diseases in polymyositis/dermatomyositis. J Rheumatol 2008;35:260-9.
Tanizawa K, Handa T, Nakashima R, Kubo T, Hosono Y, Aihara K, Ikezoe K, Watanabe K, Taguchi Y, Hatta K, Oga T, Chin K, Nagai S, Mimori T, Mishima M. The prognostic value of HRCT in myositis-associated interstitial lung disease. Respir Med 2013;107:745-52. [Crossref] [PubMed]
Ungprasert P, Wilton KM, Ernste FC, Kalra S, Crowson CS, Rajagopalan S, Bartholmai BJ. Novel Assessment of Interstitial Lung Disease Using the "Computer-Aided Lung Informatics for Pathology Evaluation and Rating" (CALIPER) Software System in Idiopathic Inflammatory Myopathies. Lung 2017;195:545-52.
Marie I, Josse S, Hatron PY, Dominique S, Hachulla E, Janvresse A, Cherin P, Mouthon L, Vittecoq O, Menard JF, Jouen F. Interstitial lung disease in anti-Jo-1 patients with antisynthetase syndrome. Arthritis Care Res (Hoboken) 2013;65:800-8.
Fathi M, Vikgren J, Boijsen M, Tylen U, Jorfeldt L, Tornling G, Lundberg IE. Interstitial lung disease in polymyositis and dermatomyositis: longitudinal evaluation by pulmonary function and radiology. Arthritis Rheum 2008;59:677-85.
Wells AU, Desai SR, Rubens MB, Goh NS, Cramer D, Nicholson AG, Colby TV, du Bois RM, Hansell DM. Idiopathic pulmonary fibrosis: a composite physiologic index derived from disease extent observed by computed tomography. Am J Respir Crit Care Med 2003;167:962-9.
Li S, Sun Y, Shao C, Huang H, Wang Q, Xu K, Zhang X, Liu P, Zeng X, Xu Z. Prognosis of adult idiopathic inflammatory myopathy-associated interstitial lung disease: a retrospective study of 679 adult cases. Rheumatology (Oxford) 2021;60:1195-204.
Nakashima R, Imura Y, Kobayashi S, Yukawa N, Yoshifuji H, Nojima T, Kawabata D, Ohmura K, Usui T, Fujii T, Okawa K, Mimori T. The RIG-I-like receptor IFIH1/MDA5 is a dermatomyositis-specific autoantigen identified by the anti-CADM-140 antibody. Rheumatology (Oxford) 2010;49:433-40.
Chen Z, Cao M, Plana MN, Liang J, Cai H, Kuwana M, Sun L. Utility of anti-melanoma differentiation-associated gene 5 antibody measurement in identifying patients with dermatomyositis and a high risk for developing rapidly progressive interstitial lung disease: a review of the literature and a meta-analysis. Arthritis Care Res (Hoboken) 2013;65:1316-24.
Vuillard C, Pineton de Chambrun M, de Prost N, Guérin C, Schmidt M, Dargent A, et al. Clinical features and outcome of patients with acute respiratory failure revealing anti-synthetase or anti-MDA-5 dermato-pulmonary syndrome: a French multicenter retrospective study. Ann Intensive Care 2018;8:87.
Zhao L, Xie S, Zhou B, Shen C, Li L, Pi W, Gong Z, Zhao J, Peng Q, Zhou J, Peng J, Zhou Y, Zou L, Song L, Zhu H, Luo H. Machine Learning Algorithms Identify Clinical Subtypes and Cancer in Anti-TIF1γ+ Myositis: A Longitudinal Study of 87 Patients. Front Immunol 2022;13:802499.
Danieli MG, Tonacci A, Paladini A, Longhi E, Moroncini G, Allegra A, Sansone F, Gangemi S. A machine learning analysis to predict the response to intravenous and subcutaneous immunoglobulin in inflammatory myopathies. A proposal for a future multi-omics approach in autoimmune diseases. Autoimmun Rev 2022;21:103105.
Xue Y, Zhang J, Li C, Liu X, Kuang W, Deng J, Wang J, Tan X, Li S, Li C. Machine learning for screening and predicting the risk of anti-MDA5 antibody in juvenile dermatomyositis children. Front Immunol 2022;13:940802.
Zhu J, Wu L, Zhou Y, Wang R, Chen S, Zhao J, Yu S, Zheng S, Xiao F, Ren H, Yang M, Li J. A retrospective cohort study in Chinese patients with adult polymyositis and dermatomyositis: risk of comorbidities and subclassification using machine learning. Clin Exp Rheumatol 2022;40:224-36.
Liu D, Zhao L, Jiang Y, Li L, Guo M, Mu Y, Zhu H. Integrated analysis of plasma and urine reveals unique metabolomic profiles in idiopathic inflammatory myopathies subtypes. J Cachexia Sarcopenia Muscle 2022;13:2456-72.
Pinal-Fernandez I, Casal-Dominguez M, Derfoul A, Pak K, Miller FW, Milisenda JC, Grau-Junyent JM, Selva-O'Callaghan A, Carrion-Ribas C, Paik JJ, Albayda J, Christopher-Stine L, Lloyd TE, Corse AM, Mammen AL. Machine learning algorithms reveal unique gene expression profiles in muscle biopsies from patients with different types of myositis. Ann Rheum Dis 2020;79:1234-42.
Rider LG, Werth VP, Huber AM, Alexanderson H, Rao AP, Ruperto N, Herbelin L, Barohn R, Isenberg D, Miller FW. Measures of adult and juvenile dermatomyositis, polymyositis, and inclusion body myositis: Physician and Patient/Parent Global Activity, Manual Muscle Testing (MMT), Health Assessment Questionnaire (HAQ)/Childhood Health Assessment Questionnaire (C-HAQ), Childhood Myositis Assessment Scale (CMAS), Myositis Disease Activity Assessment Tool (MDAAT), Disease Activity Score (DAS), Short Form 36 (SF-36), Child Health Questionnaire (CHQ), physician global damage, Myositis Damage Index (MDI), Quantitative Muscle Testing (QMT), Myositis Functional Index-2 (FI-2), Myositis Activities Profile (MAP), Inclusion Body Myositis Functional Rating Scale (IBMFRS), Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI), Cutaneous Assessment Tool (CAT), Dermatomyositis Skin Severity Index (DSSI), Skindex, and Dermatology Life Quality Index (DLQI). Arthritis Care Res (Hoboken) 2011;63:S118-57.
Nicholson LT, Haemel A. Outcome measures in dermatomyositis: quality of life and the patient perspective. Br J Dermatol 2020;182:830-1.

Cite this article as: Qiang Y, Wang H, Ni Y, Wang J, Liu A, Yang H, Xi L, Ren Y, Xie B, Wang S, Liu M, Wang C, Dai H. Development of a machine learning model in prediction of the rapid progression of interstitial lung disease in patients with idiopathic inflammatory myopathy. Quant Imaging Med Surg 2024;14(12):9258-9275. doi: 10.21037/qims-24-595