A novel computed tomography-based multi-parameter decision tree algorithm model for preoperatively predicting the risk of lymph node metastasis in surgically resectable synchronous multiple primary lung cancer
Introduction
Rationale
Non-small cell lung cancer (NSCLC) is one of the most prevalent malignancies and the leading cause of cancer-related deaths worldwide (1). With advances in multi-slice spiral computed tomography (CT) and other medical imaging techniques, multiple primary lung cancer (MPLC), defined as primary lung cancer in which ≥2 lesions occur simultaneously or successively in different locations of the lung within the same individual, has been increasingly diagnosed over recent decades (2). Thoracic surgeons have recently focused on both the benefits of surgical resection for synchronous MPLC (SMPLC) and the need to minimize injury from overly extensive surgery. Accumulating evidence has revealed that lymph node metastasis (LNM) is among the principal reasons for treatment failure and unfavorable prognosis following lung cancer surgery (3). Moreover, the scope of lymphadenectomy usually depends on the probability of LNM (4). Therefore, to improve the surgical outcomes of SMPLC, precise preoperative risk prediction of LNM is crucial: it can not only spare low-risk patients aggressive lymph node dissection but also allow postoperative care and adjuvant therapy to be tailored to high-risk patients.
As the most common imaging test in daily practice, chest CT has been widely shown to offer a series of qualitative and quantitative features that are clinically significant for evaluating the risk of poor prognosis in NSCLC (5,6). However, to our knowledge, no study has yet investigated the key chest CT-derived indicators of LNM in SMPLC. Accordingly, an easy-to-use and practicable CT-based multi-parameter scoring system that accurately stratifies the risk of LNM in surgically resectable SMPLC would be particularly valuable for decision-making.
Objectives
The decision tree algorithm (DTA) is a non-parametric supervised learning method that aims to create a model predicting the value of a target endpoint by learning simple decision rules inferred from the data features (7,8). DTA models have been reported to provide user-friendly information that can aid treatment decision-making in clinical settings (7,8). Thus, the purpose of this multi-center study was to develop a novel DTA model based on chest CT imaging features of pulmonary nodules for precise risk evaluation of LNM in patients who were to undergo surgical treatment for SMPLC. We present this article in accordance with the TRIPOD+AI reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-2440/rc).
Methods
Study design, study protocol and settings
This retrospective cohort study was performed on the independent datasets of surgical patients with SMPLC prospectively collected from Sun Yat-Sen University Cancer Center (SYSUCC), the First Affiliated Hospital of Sun Yat-Sen University (FAH-SYSU) and Sichuan Provincial People’s Hospital (SPPH) between December 2011 and June 2020.
This study was approved by the Institutional Review Board of Sun Yat-sen University Cancer Center (No. B2022-293-Y01). All participating institutions were informed and agreed to the study. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The participants were required to give informed consent before taking part.
Participants
Recruitment and study groups
We initially reviewed the clinicopathological characteristics and CT imaging records of 317 consecutive patients pathologically diagnosed with SMPLC following surgery at SYSUCC, FAH-SYSU and SPPH. Of these, 235 patients met the eligibility criteria, as shown in Figure 1. We assigned the 139 patients from SYSUCC to the training cohort and the remaining 96 patients from FAH-SYSU and SPPH to the validation cohort. A CT-based multi-parameter DTA model (CT-DTA) was built on the training cohort, and its predictive capacity for the risk of LNM was then evaluated in the external validation cohort.
Eligibility criteria
The following criteria were utilized to judge the candidate’s suitability for inclusion or exclusion:
- Patients who were postoperatively and pathologically confirmed with SMPLC according to the 2013 American College of Chest Physicians (ACCP) criteria and the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) tumor-node-metastasis (TNM) classification system (8th edition) were included (9,10). Patients whose complete details of pathological reports were unavailable would not be considered in order to guarantee the accuracy and objectivity of the data analyzed;
- Curative-intent anatomical or wedge resections of the lung with systemic mediastinal lymph node dissection were eligible;
- The clinicopathological and chest thin-section CT (TS-CT, slice thickness ≤1 mm) imaging details must be completely obtained within 3 months before surgery;
- Patients who had received neoadjuvant therapy would be excluded to avoid potential confounding effects on metastatic cells within mediastinal or hilar lymph nodes;
- Patients with concurrent malignancies would be excluded to avoid potential selection bias caused by mediastinal LNM originated from additional primary tumors out of the lungs.
Measurement and definitions of outcome data
Clinicopathological characteristics
We recorded the following clinicopathological characteristics in compliance with the joint standardization of variable definitions and terminology from the Society of Thoracic Surgeons and the European Society of Thoracic Surgeons (11):
- Baseline information estimated: age, gender, and smoking status;
- Preoperative comorbidities estimated: respiratory comorbidities (chronic obstructive pulmonary disease, emphysema, tuberculosis, pneumonia, asthma, bronchiectasis and interstitial lung diseases), cardiovascular comorbidities (hypertension, coronary artery disease, valvular heart disease, cerebrovascular disease, cardiac arrhythmias, chronic heart failure and peripheral artery disease) and diabetes mellitus;
- Pathological features estimated: pleural invasion, lympho-vascular invasion, T-stage, N-stage and TNM stage. All of above results were estimated by our experienced pathologists according to the ACCP 2013 and the 8th edition AJCC/UICC criteria (9,10).
Chest TS-CT imaging parameters
Radiological assessment of the chest TS-CT images of each included patient, with both lung window (width: 1,500 HU; level: −450 HU) and mediastinal window (width: 350 HU; level: 40 HU) settings, was performed independently and in a blinded manner by two radiologists, each with more than 10 years of experience. In case of disagreement between the two primary radiologists, a third radiologist with 20 years of experience was invited to adjudicate the final decision.
The following TS-CT imaging features of each SMPLC nodule were carefully measured and recorded: the type of nodule (12), long-axis diameter of the lesion, long-axis diameter of the solid portion in the lesion, consolidation tumor ratio (CTR), spiculation, lobulation, pleural indentation, bubble-like vacuole and air bronchogram. We then combined the results for the same kind of TS-CT sign across all existing nodules into one covariable for incorporation into the CT-DTA model (Table S1). Qualitative and quantitative evaluation criteria for all the above parameters are detailed in Table S1 and described in our previous studies (7,13). Representative TS-CT images are displayed in Figure S1.
Outcome of interest
The primary endpoint in our study was the pathologically diagnosed LNM, which refers to cancer cells spread from SMPLC to mediastinal and/or hilar lymph nodes (11). The occurrence of LNM would be judged by an inter-institutional telepathology consultation if necessary.
Cutoffs of imaging parameters
The value maximizing the Youden index (sensitivity + specificity − 1) was taken as the optimal cut-point for each continuous variable (CTR, long-axis diameter of the maximal lesion, and long-axis diameter of the solid portion in the maximal lesion) with respect to risk prediction of LNM in the training cohort. These threshold values were then used as the grouping criteria for each lesion of SMPLC (Table S1).
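The cut-point search can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `youden_cutoff` is hypothetical, the rule "positive when value ≥ cutoff" is an assumption, and the CTR values below are synthetic rather than study data.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    Assumption: a case is test-positive when its value is >= cutoff.
    labels: 1 = LNM present, 0 = LNM absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best_cutoff, best_j = None, -1.0
    # Every observed value is a candidate threshold.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # the first maximum is kept when several cutoffs tie
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Synthetic CTR values and LNM labels (NOT study data).
ctr = [0.20, 0.35, 0.50, 0.60, 0.85, 0.90, 0.92, 0.95, 1.00, 1.00]
lnm = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(ctr, lnm)
print(f"optimal cutoff = {cutoff}, Youden index = {j:.3f}")
```

Scanning only the observed values is sufficient, because sensitivity and specificity change only when the threshold crosses a data point.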
Statistical analysis
Statistical differences between groups
We used Pearson’s Chi-squared test, Yates’ correction test or Fisher exact test to compare categorical variables and Mann-Whitney U test to compare continuous data [mean ± standard deviation (SD); median and interquartile range (IQR)], respectively. Statistical significance was suggested by P<0.050 in a two-sided test.
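As a worked illustration of the categorical comparisons, the sketch below computes the Chi-squared statistic for a 2×2 table with and without Yates' continuity correction. It is a Python illustration with a hypothetical function name, not the SPSS procedure used in the study. The counts are the smoking-status row of Table 2 (never smokers: 90 without and 7 with LNM; current/former smokers: 33 without and 9 with LNM), for which the continuity-corrected P value comes out close to the 0.034 reported there.

```python
import math


def chi2_2x2(a, b, c, d, yates=False):
    """Chi-squared test for the 2x2 table [[a, b], [c, d]] (1 degree of freedom).

    Returns (statistic, two-sided P value). With yates=True the
    continuity correction |ad - bc| - n/2 is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Smoking status vs. LNM in the training cohort (counts from Table 2).
pearson_stat, pearson_p = chi2_2x2(90, 7, 33, 9)
yates_stat, yates_p = chi2_2x2(90, 7, 33, 9, yates=True)
print(f"Pearson: chi2 = {pearson_stat:.2f}, P = {pearson_p:.3f}")
print(f"Yates:   chi2 = {yates_stat:.2f}, P = {yates_p:.3f}")
```

The smallest expected count in this table is 42 × 16 / 139 ≈ 4.8, below the conventional threshold of 5, which is the usual reason for preferring a corrected or exact test.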
Predictive factor analysis
The correlations between all the evaluated characteristics and risk of LNM were initially investigated through univariable logistic regression analysis. Thereafter, all the clinicopathological and imaging covariables with P<0.20 were included in multivariable logistic regression models, and then, odds ratios (ORs) with 95% confidence intervals (CIs) were generated to determine which factors could play significantly predictive roles for the development of LNM. Ultimately, we utilized the Hosmer-Lemeshow test to measure the goodness-of-fit of each multivariable logistic regression model.
All the above statistical methods were accomplished using the IBM SPSS 27.0 software (IBM SPSS Statistics, Version 22.0., IBM Corp, Armonk, NY, USA).
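For a single binary predictor, the OR from univariable logistic regression equals the cross-product ratio of the corresponding 2×2 table, and a Wald 95% CI follows from the cell counts. The sketch below is a Python illustration of that relationship, not the SPSS routine used in the study, and the function name is hypothetical. With the smoking-status counts from Table 2, it gives values close to the OR of 3.51 (95% CI: 1.21–10.17) listed in Table 3.

```python
import math


def odds_ratio_wald(exposed_events, exposed_nonevents,
                    unexposed_events, unexposed_nonevents, z=1.96):
    """Odds ratio with a Wald confidence interval from 2x2 cell counts.

    z = 1.96 corresponds to a two-sided 95% interval.
    All four cells must be non-zero.
    """
    cells = (exposed_events, exposed_nonevents,
             unexposed_events, unexposed_nonevents)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")

    odds_ratio = (exposed_events * unexposed_nonevents) / (
        exposed_nonevents * unexposed_events)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Current/former smokers: 9 with LNM, 33 without; never smokers: 7 with, 90 without.
or_, lo, hi = odds_ratio_wald(9, 33, 7, 90)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

The zero-cell check matters here: several categories in Table 2 contain no LNM events, and for those the cross-product ratio is undefined.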
Establishment and validation of DTA model
The DTA model was composed of qualitative and quantitative TS-CT parameters from the training cohort, all of which showed statistical significance in both group comparisons and logistic regression analyses. The eligible imaging parameters, expressed as categories or dichotomized at their optimal cutoff values, were used as inputs to a DTA model with the following tree-growing settings: classification and regression tree (CART) algorithm; Gini impurity criterion; a minimum of 5 samples required to split an internal node; a minimum of 2 samples required at a leaf node; and a maximum tree depth of 30. Tree growth was terminated when a further split made no substantial contribution (7).
The predictive performance of the CT-DTA model for the risk of LNM was externally validated in the validation cohort. We conducted receiver operating characteristic (ROC) analysis to estimate the capacity of the CT-DTA model and the main clinicopathological characteristics to discriminate between patients with and without LNM. Their areas under the curve (AUCs) were calculated and compared using the DeLong test. A risk prediction model with an AUC >0.80 and P<0.001 was considered clinically useful (14). In addition, we plotted calibration curves with bootstrap repetitions to assess the agreement between the probability of LNM predicted by the CT-DTA model and the observed probability. Finally, we used the Shapley Additive Explanation (SHAP) method to explain the importance of each covariable to the occurrence of LNM (15).
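The Gini criterion scores a candidate split by how much it lowers node impurity. As a minimal Python sketch (an illustration only, with hypothetical function names, not the software used in the study), the following computes the impurity decrease for two candidate first splits of the training cohort, using only the marginal counts reported in Table 2: spiculation absent versus present in ≥1 lesion, and all lesions with CTR <0.90 versus ≥1 lesion with CTR ≥0.90.

```python
def gini(events, total):
    """Gini impurity of a node with a binary outcome: 1 - p^2 - (1 - p)^2."""
    p = events / total
    return 2 * p * (1 - p)


def gini_decrease(parent, children):
    """Impurity decrease of a split.

    parent and each child are (events, total) tuples; the children's
    impurities are weighted by their share of the parent's samples.
    """
    parent_total = parent[1]
    weighted = sum(total / parent_total * gini(events, total)
                   for events, total in children)
    return gini(*parent) - weighted


# Training cohort: 16 patients with LNM among 139 (Table 2).
root = (16, 139)
# Spiculation absent: 0/79 with LNM; present in >=1 lesion: 16/60.
spiculation_split = [(0, 79), (16, 60)]
# All lesions CTR <0.90: 0/55 with LNM; >=1 lesion CTR >=0.90: 16/84.
ctr_split = [(0, 55), (16, 84)]

print(f"root impurity        = {gini(*root):.4f}")
print(f"decrease spiculation = {gini_decrease(root, spiculation_split):.4f}")
print(f"decrease CTR         = {gini_decrease(root, ctr_split):.4f}")
```

On these marginal counts the spiculation split yields the larger decrease. This comparison covers only the root node; the fitted tree also depends on the stopping rules and on splits further down.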
All the above statistical methods involved in CT-DTA modeling and validation were accomplished using the JMP Pro 16.0 software (SAS Institute, Cary, NC, USA) and R Studio 4.2.3 (R Foundation for Statistical Computing, Vienna, Austria).
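The AUC reported by ROC analysis equals the probability that a randomly chosen patient with LNM receives a higher predicted risk than a randomly chosen patient without LNM, with ties counted as one half. Ties are common for a decision tree because every patient in the same leaf shares one predicted probability. The sketch below is a Python illustration with a hypothetical function name and synthetic leaf probabilities, not study data. It does not implement the DeLong comparison.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    case has the higher score; tied pairs contribute 0.5.
    labels: 1 = LNM present, 0 = LNM absent.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic leaf-level predicted probabilities and outcomes (NOT study data).
predicted = [0.001, 0.001, 0.05, 0.05, 0.20, 0.454, 0.454, 0.20]
observed = [0, 0, 0, 1, 0, 1, 1, 0]
print(f"AUC = {auc(predicted, observed):.3f}")
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred patients; rank-based formulas give the same value more efficiently.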
Subgroup analyses
In addition, the efficiency of CT-DTA model was also assessed across all of the subgroups stratified by clinicopathological characteristics. A ROC analysis on each subgroup of the entire cohort was employed to measure the predictive accuracy of CT-DTA model for the emergence of LNM. The predictive independence of CT-DTA model in each subgroup of the entire cohort was further confirmed by a multivariable logistic regression analysis.
Results
Clinicopathological and imaging characteristics
The clinicopathological parameters of all the included patients are listed in Table 1. The majority of SMPLC cases involved two nodules (84.3%) and developed on the ipsilateral side (66.0%). Adenocarcinoma was the most common histological subtype of the lesions in SMPLC (92.3%), and the maximal lesion was staged as Tis–1 in 60.0% of patients. LNM (N1–2 stage) was diagnosed in 29 patients (12.3%). No significant difference was found in the incidence of LNM between the training cohort and the validation cohort (P=0.64).
Table 1
| Estimated characteristics | Entire cohort (n=235) | Training cohort (n=139) | Validation cohort (n=96) | P value |
|---|---|---|---|---|
| Clinicopathological parameters | ||||
| Age (years) | ||||
| Mean ± SD | 61.5±10.0 | 62.7±9.9 | 59.9±10.0 | 0.051 |
| Median [IQR] | 63 [55–68] | 65 [56–69] | 61 [52–68] | |
| Gender | 0.35 | |||
| Female | 131 (55.7) | 74 (53.2) | 57 (59.4) | |
| Male | 104 (44.3) | 65 (46.8) | 39 (40.6) | |
| Smoking status | 0.73 | |||
| Never | 166 (70.6) | 97 (69.8) | 69 (71.9) | |
| Current/former | 69 (29.4) | 42 (30.2) | 27 (28.1) | |
| Respiratory comorbidity | 0.45 | |||
| Absent | 222 (94.5) | 130 (93.5) | 92 (95.8) | |
| Present | 13 (5.5) | 9 (6.5) | 4 (4.2) | |
| Cardio-cerebrovascular comorbidity | 0.074 | |||
| Absent | 193 (82.1) | 109 (78.4) | 84 (87.5) | |
| Present | 42 (17.9) | 30 (21.6) | 12 (12.5) | |
| Diabetes mellitus | 0.069 | |||
| Absent | 217 (92.3) | 132 (95.0) | 85 (88.5) | |
| Present | 18 (7.7) | 7 (5.0) | 11 (11.5) | |
| Number of the lesions | 0.75 | |||
| 2 | 198 (84.3) | 118 (84.9) | 80 (83.3) | |
| ≥3 | 37 (15.7) | 21 (15.1) | 16 (16.7) | |
| Location of the lesions | <0.001 | |||
| Ipsilateral side: right & same lobe | 24 (10.2) | 5 (3.6) | 19 (19.8) | |
| Ipsilateral side: left & same lobe | 10 (4.3) | 3 (2.2) | 7 (7.3) | |
| Ipsilateral side: right & different lobes | 87 (37.0) | 56 (40.3) | 31 (32.3) | |
| Ipsilateral side: left & different lobes | 34 (14.5) | 23 (16.5) | 11 (11.5) | |
| Contralateral side | 80 (34.0) | 52 (37.4) | 28 (29.2) | |
| Surgical procedure | 0.068 | |||
| Sub-lobar resections | 55 (23.4) | 34 (24.5) | 21 (21.9) | |
| Lobectomy | 36 (15.3) | 15 (10.8) | 21 (21.9) | |
| Lobectomy + sub-lobar resection | 125 (53.2) | 83 (59.7) | 42 (43.8) | |
| Bi-lobectomy & pneumonectomy | 19 (8.1) | 7 (5.0) | 12 (12.5) | |
| Histology | 0.27 | |||
| AC + AC | 217 (92.3) | 126 (90.6) | 91 (94.8) | |
| AC + SCC | 14 (6.0) | 11 (7.9) | 3 (3.1) | |
| SCC + SCC | 4 (1.7) | 2 (1.4) | 2 (2.1) | |
| Pleural invasion | 0.79 | |||
| Absent | 179 (76.2) | 105 (75.5) | 74 (77.1) | |
| Present | 56 (23.8) | 34 (24.5) | 22 (22.9) | |
| Lymphovascular invasion | 0.56 | |||
| Absent | 209 (88.9) | 125 (89.9) | 84 (87.5) | |
| Present | 26 (11.1) | 14 (10.1) | 12 (12.5) | |
| T stage of the maximal lesion | <0.001 | |||
| Tis–1 | 141 (60.0) | 81 (58.3) | 60 (62.5) | |
| T2 | 67 (28.5) | 52 (37.4) | 15 (15.6) | |
| T3–4 | 27 (11.5) | 6 (4.3) | 21 (21.9) | |
| T stages of the lesions | 0.67 | |||
| Tis–1 + Tis–1 | 141 (60.0) | 81 (58.3) | 60 (62.5) | |
| Tis–1 + T2–4 | 81 (34.5) | 51 (36.7) | 30 (31.3) | |
| T2–4 + T2–4 | 13 (5.5) | 7 (5.0) | 6 (6.3) | |
| Lymph node metastasis | 0.64 | |||
| No (N0) | 206 (87.7) | 123 (88.5) | 83 (86.5) | |
| Yes (N1–2) | 29 (12.3) | 16 (11.5) | 13 (13.5) | |
| TNM stage | 0.29 | |||
| 0–I | 182 (77.4) | 111 (79.9) | 71 (74.0) | |
| II–IV | 53 (22.6) | 28 (20.1) | 25 (26.0) | |
| Imaging parameters on chest computed tomography | ||||
| Type of nodule | 0.022 | |||
| Pure GGN + pure GGN | 25 (10.6) | 12 (8.6) | 13 (13.5) | |
| Pure GGN + GGO-predominant nodule | 13 (5.5) | 9 (6.5) | 4 (4.2) | |
| Pure GGN + solid-predominant nodule | 37 (15.7) | 16 (11.5) | 21 (21.9) | |
| Pure GGN + pure solid nodule | 20 (8.5) | 10 (7.2) | 10 (10.4) | |
| GGO-predominant nodule + GGO-predominant nodule | 2 (0.9) | 2 (1.4) | 0 | |
| GGO-predominant nodule + solid-predominant nodule | 17 (7.2) | 15 (10.8) | 2 (2.1) | |
| GGO-predominant nodule + pure solid nodule | 10 (4.3) | 8 (5.8) | 2 (2.1) | |
| Solid-predominant nodule + solid-predominant nodule | 26 (11.1) | 13 (9.4) | 13 (13.5) | |
| Solid-predominant nodule + pure solid nodule | 48 (20.4) | 29 (20.9) | 19 (19.8) | |
| Pure solid nodule + pure solid nodule | 37 (15.7) | 25 (18.0) | 12 (12.5) | |
| Consolidation tumor ratio | 0.043 | |||
| All lesions <0.90 | 107 (45.5) | 55 (39.6) | 52 (54.2) | |
| 1 lesion ≥0.90 | 84 (35.7) | 52 (37.4) | 32 (33.3) | |
| ≥2 lesions ≥0.90 | 44 (18.7) | 32 (23.0) | 12 (12.5) | |
| Presence of spiculation | <0.001 | |||
| Absent | 133 (56.6) | 79 (56.8) | 54 (56.3) | |
| 1 lesion present | 56 (23.8) | 44 (31.7) | 12 (12.5) | |
| ≥2 lesions present | 46 (19.6) | 16 (11.5) | 30 (31.3) | |
| Presence of lobulation | 0.87 | |||
| Absent | 14 (6.0) | 9 (6.5) | 5 (5.2) | |
| 1 lesion present | 70 (29.8) | 40 (28.8) | 30 (31.3) | |
| ≥2 lesions present | 151 (64.3) | 90 (64.7) | 61 (63.5) | |
| Presence of bubble-like vacuole | 0.013 | |||
| Absent | 144 (61.3) | 95 (68.3) | 49 (51.0) | |
| 1 lesion present | 74 (31.5) | 38 (27.3) | 36 (37.5) | |
| ≥2 lesions present | 17 (7.2) | 6 (4.3) | 11 (11.5) | |
| Presence of air bronchogram | 0.018 | |||
| Absent | 121 (51.5) | 63 (45.3) | 58 (60.4) | |
| Normally present | 69 (29.4) | 43 (30.9) | 26 (27.1) | |
| 1 lesion pathologically present | 40 (17.0) | 28 (20.1) | 12 (12.5) | |
| ≥2 lesions pathologically present | 5 (2.1) | 5 (3.6) | 0 | |
| Presence of pleural indentation | 0.030 | |||
| Absent | 73 (31.1) | 41 (29.5) | 32 (33.3) | |
| 1 lesion present | 117 (49.8) | 78 (56.1) | 39 (40.6) | |
| ≥2 lesions present | 45 (19.1) | 20 (14.4) | 25 (26.0) | |
| Long-axis diameter of the maximal lesion (mm) | 0.37 | |||
| Mean ± SD | 25.5±13.7 | 26.0±14.0 | 24.8±13.3 | |
| Median [IQR] | 23 [16–31] | 24 [17–31] | 22 [15–31] | |
| Long-axis diameter of solid portion in the maximal lesion (mm) | ||||
| Mean ± SD | 19.0±16.9 | 20.8±17.0 | 16.4±16.6 | 0.020 |
| Median [IQR] | 17 [5–28] | 19 [9–29] | 13 [4–26] | |
Data were presented as n (%) if not otherwise specified. AC, adenocarcinoma; GGN, ground-glass nodule; GGO, ground-glass opacity; IQR, interquartile range; SCC, squamous cell carcinoma; SD, standard deviation.
The details of the imaging parameters are listed in Table 1. In the training cohort, most patients had no TS-CT sign of spiculation (56.8%), bubble-like vacuole (68.3%) or abnormal air bronchogram (76.2%) in any lesion. In contrast, pleural indentation (70.5%) and lobulation (93.5%) were each present in ≥1 lesion in most patients. The mean long-axis diameters of the maximal lesion and of the solid portion in the maximal lesion were 26.0±14.0 mm and 20.8±17.0 mm, respectively. Differences in TS-CT imaging features between the training cohort and the validation cohort are detailed in Table 1.
Derivation of CT-DTA model
Prediction of LNM by TS-CT imaging parameters
With respect to the occurrence of LNM, the optimal cutoff points suggested by the maximum Youden indices were 0.90 for CTR, 30 mm for the long-axis diameter of the maximal lesion and 27 mm for the long-axis diameter of its solid portion (Table S1). As shown in Table 2, we found significant differences in the type of nodule (P=0.021), CTR (P<0.001), presence of spiculation (P<0.001) and lobulation (P=0.012), and long-axis diameters of the lesion (P<0.001) and the solid portion (P<0.001) between patients with and without LNM in the training cohort. Moreover, in the univariable logistic regression analysis based on the training cohort, the number of pure solid nodules (PSNs; P=0.001), CTR (P=0.001), presence of spiculation (P<0.001) and lobulation (P=0.039), and long-axis diameters of the lesion (P=0.011) and the solid portion (P=0.001) were found to be significantly associated with an increased risk of LNM (Table 3). After adjustment for all covariables with P<0.20, a multivariable logistic regression analysis demonstrated that none of the above six imaging parameters was independently predictive of LNM in the training cohort, as detailed in Table 3 (model A).
Table 2
| Estimated characteristics | Training cohort (n=139) | No lymph node metastasis (N0: n=123) | Lymph node metastasis (N1–2: n=16) | P value |
|---|---|---|---|---|
| Clinicopathological parameters | ||||
| Age (years) | ||||
| Mean ± SD | 62.7±9.9 | 62.9±10.2 | 61.1±7.7 | 0.49 |
| Median [IQR] | 65 [56–69] | 65 [56–69] | 63 [57–66] | |
| Gender | 0.18 | |||
| Female | 74 (53.2) | 68 (55.3) | 6 (37.5) | |
| Male | 65 (46.8) | 55 (44.7) | 10 (62.5) | |
| Smoking status | 0.034 | |||
| Never | 97 (69.8) | 90 (73.2) | 7 (43.8) | |
| Current/former | 42 (30.2) | 33 (26.8) | 9 (56.3) | |
| Respiratory comorbidity | 1.0 | |||
| Absent | 130 (93.5) | 115 (93.5) | 15 (93.8) | |
| Present | 9 (6.5) | 8 (6.5) | 1 (6.3) | |
| Cardio-cerebrovascular comorbidity | 0.21 | |||
| Absent | 109 (78.4) | 94 (76.4) | 15 (93.8) | |
| Present | 30 (21.6) | 29 (23.6) | 1 (6.3) | |
| Diabetes mellitus | 0.71 | |||
| Absent | 132 (95.0) | 116 (94.3) | 16 (100) | |
| Present | 7 (5.0) | 7 (5.7) | 0 | |
| Number of the lesions | 0.95 | |||
| 2 | 118 (84.9) | 105 (85.4) | 13 (81.3) | |
| ≥3 | 21 (15.1) | 18 (14.6) | 3 (18.8) | |
| Location of the lesions | 0.68 | |||
| Ipsilateral side: right & same lobe | 5 (3.6) | 5 (4.1) | 0 | |
| Ipsilateral side: left & same lobe | 3 (2.2) | 2 (1.6) | 1 (6.3) | |
| Ipsilateral side: right & different lobes | 56 (40.3) | 50 (40.7) | 6 (37.5) | |
| Ipsilateral side: left & different lobes | 23 (16.5) | 20 (16.3) | 3 (18.8) | |
| Contralateral side | 52 (37.4) | 46 (37.4) | 6 (37.5) | |
| Surgical procedure | 0.45 | |||
| Sub-lobar resections | 34 (24.5) | 30 (24.4) | 4 (25.0) | |
| Lobectomy | 15 (10.8) | 12 (9.8) | 3 (18.8) | |
| Lobectomy + sub-lobar resection | 83 (59.7) | 74 (60.2) | 9 (56.3) | |
| Bi-lobectomy & pneumonectomy | 7 (5.0) | 7 (5.7) | 0 | |
| Histology | 0.63 | |||
| AC + AC | 126 (90.6) | 112 (91.1) | 14 (87.5) | |
| AC + SCC | 11 (7.9) | 9 (7.3) | 2 (12.5) | |
| SCC + SCC | 2 (1.4) | 2 (1.6) | 0 | |
| Pleural invasion | 0.027 | |||
| Absent | 105 (75.5) | 97 (78.9) | 8 (50.0) | |
| Present | 34 (24.5) | 26 (21.1) | 8 (50.0) | |
| Lymphovascular invasion | <0.001 | |||
| Absent | 125 (89.9) | 116 (94.3) | 9 (56.3) | |
| Present | 14 (10.1) | 7 (5.7) | 7 (43.8) | |
| T stage of the maximal lesion | <0.001 | |||
| Tis–1 | 81 (58.3) | 79 (64.2) | 2 (12.5) | |
| T2 | 52 (37.4) | 38 (30.9) | 14 (87.5) | |
| T3–4 | 6 (4.3) | 6 (4.9) | 0 | |
| T stages of the lesions | <0.001 | |||
| Tis–1 + Tis–1 | 81 (58.3) | 79 (64.2) | 2 (12.5) | |
| Tis–1 + T2–4 | 51 (36.7) | 41 (33.3) | 10 (62.5) | |
| T2–4 + T2–4 | 7 (5.0) | 3 (2.4) | 4 (25.0) | |
| Imaging parameters on chest computed tomography | ||||
| Type of nodule | 0.021 | |||
| Pure GGN + pure GGN | 12 (8.6) | 12 (9.8) | 0 | |
| Pure GGN + GGO-predominant nodule | 9 (6.5) | 9 (7.3) | 0 | |
| Pure GGN + solid-predominant nodule | 16 (11.5) | 16 (13.0) | 0 | |
| Pure GGN + pure solid nodule | 10 (7.2) | 9 (7.3) | 1 (6.3) | |
| GGO-predominant nodule + GGO-predominant nodule | 2 (1.4) | 2 (1.6) | 0 | |
| GGO-predominant nodule + solid-predominant nodule | 15 (10.8) | 15 (12.2) | 0 | |
| GGO-predominant nodule + pure solid nodule | 8 (5.8) | 6 (4.9) | 2 (12.5) | |
| Solid-predominant nodule + solid-predominant nodule | 13 (9.4) | 12 (9.8) | 1 (6.3) | |
| Solid-predominant nodule + pure solid nodule | 29 (20.9) | 23 (18.7) | 6 (37.5) | |
| Pure solid nodule + pure solid nodule | 25 (18.0) | 19 (15.4) | 6 (37.5) | |
| Consolidation tumor ratio | <0.001 | |||
| All lesions <0.90 | 55 (39.6) | 55 (44.7) | 0 | |
| 1 lesion ≥0.90 | 52 (37.4) | 44 (35.8) | 8 (50.0) | |
| ≥2 lesions ≥0.90 | 32 (23.0) | 24 (19.5) | 8 (50.0) | |
| Presence of spiculation | <0.001 | |||
| Absent | 79 (56.8) | 79 (64.2) | 0 | |
| 1 lesion present | 44 (31.7) | 32 (26.0) | 12 (75.0) | |
| ≥2 lesions present | 16 (11.5) | 12 (9.8) | 4 (25.0) | |
| Presence of lobulation | 0.012 | |||
| Absent | 9 (6.5) | 9 (7.3) | 0 | |
| 1 lesion present | 40 (28.8) | 39 (31.7) | 1 (6.3) | |
| ≥2 lesions present | 90 (64.7) | 75 (61.0) | 15 (93.8) | |
| Presence of bubble-like vacuole | 0.46 | |||
| Absent | 95 (68.3) | 84 (68.3) | 11 (68.8) | |
| 1 lesion present | 38 (27.3) | 33 (26.8) | 5 (31.3) | |
| ≥2 lesions present | 6 (4.3) | 6 (4.9) | 0 | |
| Presence of air bronchogram | 0.083 | |||
| Absent | 63 (45.3) | 53 (43.1) | 10 (62.5) | |
| Normally present | 43 (30.9) | 42 (34.1) | 1 (6.3) | |
| 1 lesion pathologically present | 28 (20.1) | 24 (19.5) | 4 (25.0) | |
| ≥2 lesions pathologically present | 5 (3.6) | 4 (3.3) | 1 (6.3) | |
| Presence of pleural indentation | 0.97 | |||
| Absent | 41 (29.5) | 36 (29.3) | 5 (31.3) | |
| 1 lesion present | 78 (56.1) | 69 (56.1) | 9 (56.3) | |
| ≥2 lesions present | 20 (14.4) | 18 (14.6) | 2 (12.5) | |
| Long-axis diameter of the maximal lesion (mm) | <0.001 | |||
| Mean ± SD | 26.0±14.0 | 24.8±13.7 | 35.1±12.8 | |
| Median [IQR] | 24 [17–31] | 23 [16–29] | 33 [27–40] | |
| Long-axis diameter of solid portion in the maximal lesion (mm) | ||||
| Mean ± SD | 20.8±17.0 | 18.9±16.6 | 35.1±12.8 | <0.001 |
| Median [IQR] | 19 [9–29] | 17 [7–26] | 33 [27–40] | |
Data were presented as n (%) if not otherwise specified. AC, adenocarcinoma; GGN, ground-glass nodule; GGO, ground-glass opacity; IQR, interquartile range; SCC, squamous cell carcinoma; SD, standard deviation.
Table 3
| Estimated characteristics | Univariable analysis: OR (95% CI) | Univariable analysis: P value | Multivariable analysis†: OR (95% CI) | Multivariable analysis†: P value | Multivariable analysis‡: OR (95% CI) | Multivariable analysis‡: P value | | |
|---|---|---|---|---|---|---|---|---|
| Age (years) (per 1 year increased) | 0.982 (0.932–1.034) | 0.49 | – | – | – | – | ||
| Gender (male vs. female) | 2.06 (0.71–6.02) | 0.19 | 1.78 (0.10–31.05) | 0.69 | 2.36 (0.13–44.21) | 0.57 | ||
| Smoking status (current/former vs. never) | 3.51 (1.21–10.17) | 0.021 | 1.13 (0.078–16.32) | 0.93 | 0.82 (0.043–15.33) | 0.89 | ||
| Preoperative comorbidity (present vs. absent) | 0.60 (0.18–1.97) | 0.40 | – | – | – | – | ||
| Number of the lesions (≥3 vs. 2) | 1.35 (0.35–5.20) | 0.67 | – | – | – | – | ||
| Location of the lesions (contralateral vs. ipsilateral) | 1.00 (0.34–2.95) | 0.99 | – | – | – | – | ||
| Surgical procedure (≥1 lobectomy vs. lobectomy vs. sub-lobar resections) | 0.87 (0.49–1.58) | 0.69 | – | – | – | – | ||
| Histology (AC + AC vs. AC + SCC vs. SCC + SCC) | 1.16 (0.29–4.59) | 0.84 | – | – | – | – | ||
| Pleural invasion (present vs. absent) | 3.73 (1.28–10.89) | 0.016 | 1.25 (0.25–6.19) | 0.79 | 1.18 (0.25–5.72) | 0.83 | ||
| Lymphovascular invasion (present vs. absent) | 12.89 (3.70–44.90) | <0.001 | 7.04 (1.26–39.19) | 0.026 | 7.70 (1.35–43.87) | 0.021 | ||
| T stage of the maximal lesion (T3–4 vs. T2 vs. Tis–1) | 3.51 (1.48–8.33) | 0.004 | 0.22 (0.013–3.50) | 0.28 | 0.25 (0.017–3.67) | 0.31 | ||
| T stages of the lesions (T2–4 + T2–4 vs. Tis–1 + T2–4 vs. Tis–1 + Tis–1) | 7.42 (2.75–20.02) | <0.001 | 12.67 (1.13–142.14) | 0.040 | 13.08 (1.45–118.37) | 0.022 | ||
| Pure solid nodules (≥2 lesions vs. 1 lesion vs. absent) | 3.26 (1.58–6.73) | 0.001 | 1.43 (0.22–9.13) | 0.71 | – | – | ||
| CTR (≥2 lesions ≥0.90 vs. 1 lesion ≥0.90 vs. all lesions <0.90) | 3.91 (1.75–8.73) | 0.001 | 0.43 (0.041–4.46) | 0.48 | – | – | ||
| Presence of spiculation (≥2 lesions vs. 1 lesion vs. absent) | 4.47 (2.08–9.62) | <0.001 | 1.73 (0.45–6.71) | 0.43 | – | – | ||
| Presence of lobulation (≥2 lesions vs. 1 lesion vs. absent) | 8.26 (1.12–61.14) | 0.039 | 4.77 (0.29–77.61) | 0.27 | 2.31 (0.18–30.11) | 0.52 | ||
| Presence of bubble-like vacuole (≥2 lesions vs. 1 lesion vs. absent) | 0.84 (0.32–2.22) | 0.72 | – | – | – | – | ||
| Presence of air bronchogram (pathologically vs. normally vs. absent) | 0.90 (0.49–1.66) | 0.73 | – | – | – | – | ||
| Presence of pleural indentation (≥2 lesions vs. 1 lesion vs. absent) | 0.91 (0.40–2.04) | 0.81 | – | – | – | – | ||
| Long-axis diameter of the maximal lesion (mm) (per 1 mm increased) | 1.041 (1.009–1.074) | 0.011 | Insufficient data | 0.99 | – | – | ||
| Long-axis diameter of solid portion in the maximal lesion (per 1 mm increased) | 1.048 (1.018–1.078) | 0.001 | Insufficient data | 0.99 | – | – | ||
| CT-based multi-parameter decision tree algorithm model (per split proceed) | 2.21 (1.59–3.08) | <0.001 | – | – | 2.09 (1.29–3.37) | 0.003 | ||
†, the multivariable binary logistic regression model (model A) was established on the original parameters estimated on chest CT images and other clinicopathological characteristics with P<0.20 in the univariable analysis (Hosmer-Lemeshow test P=0.79); ‡, the multivariable binary logistic regression model (model B) was established on the novel CT-based multi-parameter decision tree algorithm model and other clinicopathological characteristics with P<0.20 in the univariable analysis (Hosmer-Lemeshow test P=0.88). AC, adenocarcinoma; CI, confidence interval; CT, computed tomography; CTR, consolidation tumor ratio; OR, odds ratio; SCC, squamous cell carcinoma; SMPLC, synchronous multiple primary lung cancer.
Construction of CT-DTA model
We used the categorical data of all six imaging parameters with univariable P<0.050 to train a DTA model. As illustrated in Figure 2, the resulting model, named the CT-DTA model, was generated from the training cohort and consisted of the presence of spiculation, the long-axis diameters of the lesions and of the solid portion in the lesions, CTR, and the number of PSNs.
This CT-DTA model contains seven leaf nodes, with predicted probabilities of LNM ranging from 0.1% to 45.4%. The importance of the CT-based covariables contributing to the risk of LNM, as estimated by SHAP values, is visualized in Figure 3. The long-axis diameter of the solid portion was the most important risk factor for LNM, followed by the presence of spiculation, CTR, the number of PSNs, and the long-axis diameter of the lesions.
Validation of CT-DTA model
Predictive performance of CT-DTA model
Figure 4 shows the AUCs of the CT-DTA model for predicting the risk of LNM in both the training cohort and the validation cohort. The CT-DTA model showed excellent accuracy in identifying patients with LNM in the training cohort, with a clinically meaningful AUC of 0.905 (95% CI: 0.851–0.958; P<0.001). Its discriminative power was further externally validated in the validation cohort, where the CT-DTA model remained substantially predictive of the risk of LNM, with an AUC of 0.812 (95% CI: 0.699–0.926; P<0.001). There was no significant difference in the predictive accuracy of the CT-DTA model between the training cohort and the validation cohort (DeLong test P=0.14).
By plotting calibration curves with bootstrap repetitions in both the training cohort and the validation cohort, we also confirmed good agreement between the incidences of LNM predicted by the CT-DTA model and the observed incidences, as shown in Figure 5.
Predictive significance of CT-DTA model
In the training cohort, we ran the multivariable logistic regression analysis again, replacing the raw TS-CT signs with our CT-DTA model (model B in Table 3). In this analysis, the CT-DTA model was an independent risk factor for LNM (OR: 2.09; 95% CI: 1.29–3.37; P=0.003; Table 3).
The predictive independence of the CT-DTA model was further externally validated by multivariable logistic regression analyses in the validation cohort. As detailed in Table 4, none of the individual imaging characteristics was significantly associated with the occurrence of LNM in the multivariable analysis of the validation cohort (model A). In contrast, when the CT-DTA model replaced the individual imaging variables (model B in Table 4), it independently predicted the development of LNM (OR: 1.53; 95% CI: 1.02–2.31; P=0.041), indicating that a DTA model integrating these CT imaging variables can serve as a grading system for the risk of LNM.
Table 4
| Estimated characteristics | Univariable analysis: OR (95% CI) | Univariable analysis: P value | Multivariable analysis†: OR (95% CI) | Multivariable analysis†: P value | Multivariable analysis‡: OR (95% CI) | Multivariable analysis‡: P value |
|---|---|---|---|---|---|---|
| Age (years) (per 1 year increased) | 1.015 (0.955–1.078) | 0.64 | – | – | – | – |
| Gender (male vs. female) | 1.86 (0.57–6.03) | 0.30 | – | – | – | – |
| Smoking status (current/former vs. never) | 0.74 (0.19–2.92) | 0.66 | – | – | – | – |
| Preoperative comorbidity (present vs. absent) | 0.95 (0.24–3.77) | 0.94 | – | – | – | – |
| Number of the lesions (≥3 vs. 2) | 1.62 (0.39–6.68) | 0.51 | – | – | – | – |
| Location of the lesions (contralateral vs. ipsilateral) | 1.09 (0.31–3.89) | 0.89 | – | – | – | – |
| Surgical procedure (≥1 lobectomy vs. lobectomy vs. sub-lobar resections) | 1.45 (0.65–3.23) | 0.36 | – | – | – | – |
| Histology (AC + AC vs. AC + SCC vs. SCC + SCC) | 1.04 (0.19–5.86) | 0.96 | – | – | – | – |
| Pleural invasion (present vs. absent) | 12.12 (3.24–45.27) | <0.001 | Insufficient data | 1.0 | 5.55 (0.051–607.82) | 0.78 |
| Lymphovascular invasion (present vs. absent) | 31.60 (7.03–141.97) | <0.001 | 27.93 (1.85–421.25) | 0.016 | 41.12 (3.26–519.21) | 0.004 |
| T stage of the maximal lesion (T3–4 vs. T2 vs. Tis–1) | 3.99 (1.86–8.57) | <0.001 | Insufficient data | 1.0 | 1.47 (0.092–23.40) | 0.79 |
| T stages of the lesions (T2–4 + T2–4 vs. Tis–1 + T2–4 vs. Tis–1 + Tis–1) | 3.69 (1.49–9.10) | 0.005 | 0.45 (0.020–10.36) | 0.62 | 0.24 (0.020–2.93) | 0.26 |
| Pure solid nodules (≥2 lesions vs. 1 lesion vs. absent) | 4.15 (1.76–9.76) | 0.001 | Insufficient data | 1.0 | – | – |
| CTR (≥2 lesions ≥0.90 vs. 1 lesion ≥0.90 vs. all lesions <0.90) | 2.87 (1.28–6.44) | 0.010 | Insufficient data | 1.0 | – | – |
| Presence of spiculation (≥2 lesions vs. 1 lesion vs. absent) | 3.74 (1.67–8.39) | 0.001 | 2.70 (0.66–11.07) | 0.17 | – | – |
| Presence of lobulation (≥2 lesions vs. 1 lesion vs. absent) | 2.14 (0.62–7.43) | 0.23 | – | – | – | – |
| Presence of bubble-like vacuole (≥2 lesions vs. 1 lesion vs. absent) | 1.24 (0.54–2.83) | 0.62 | – | – | – | – |
| Presence of air bronchogram (pathologically vs. normally vs. absent) | 1.23 (0.56–2.72) | 0.61 | – | – | – | – |
| Presence of pleural indentation (≥2 lesions vs. 1 lesion vs. absent) | 3.16 (1.30–7.68) | 0.011 | 1.97 (0.39–9.85) | 0.41 | 3.02 (0.68–13.42) | 0.15 |
| Long-axis diameter of the maximal lesion (mm) (per 1 mm increased) | 1.044 (1.005–1.085) | 0.027 | 1.000 (0.84–1.19) | 1.0 | – | – |
| Long-axis diameter of solid portion in the maximal lesion (per 1 mm increased) | 1.036 (1.004–1.068) | 0.026 | 0.97 (0.84–1.13) | 0.72 | – | – |
| CT-based multi-parameter decision tree algorithm model (per split proceed) | 1.71 (1.25–2.36) | 0.001 | – | – | 1.53 (1.02–2.31) | 0.041 |
†, the multivariable binary logistic regression model (model A) was established on the original parameters estimated on chest CT images and other clinicopathological characteristics with P<0.20 in the univariable analysis (Hosmer-Lemeshow test P=0.91); ‡, the multivariable binary logistic regression model (model B) was established on the novel CT-based multi-parameter decision tree algorithm model and other clinicopathological characteristics with P<0.20 in the univariable analysis (Hosmer-Lemeshow test P=0.21). AC, adenocarcinoma; CI, confidence interval; CT, computed tomography; CTR, consolidation tumor ratio; OR, odds ratio; SCC, squamous cell carcinoma; SMPLC, synchronous multiple primary lung cancer.
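The tabulated estimates can be checked for internal consistency. Assuming the confidence intervals are Wald-type intervals, symmetric on the log-odds scale (the usual output of logistic regression software, although the text does not state this explicitly), the standard error, z statistic and two-sided P value can be recovered from an OR and its 95% CI, and the OR should lie close to the geometric mean of its two bounds. The sketch below applies this to the CT-DTA model in model B; the function names are illustrative only.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_ci(odds_ratio, lower, upper):
    """Recover the log-odds SE, z statistic and two-sided P value from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail probability
    return se, z, p

def ci_midpoint(lower, upper):
    """Geometric mean of the bounds; equals the OR when the CI is symmetric on the log scale."""
    return math.sqrt(lower * upper)

# CT-DTA model, model B in the validation cohort: OR 1.53 (95% CI: 1.02-2.31), P=0.041.
se, z, p = wald_from_ci(1.53, 1.02, 2.31)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}, midpoint = {ci_midpoint(1.02, 2.31):.3f}")
```

Small discrepancies are expected because the published values are rounded; a large gap between an OR and the geometric mean of its bounds would suggest a transcription error in the interval.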
Risk stratification according to CT-DTA model
To support therapeutic decision-making through accurate preoperative risk stratification of LNM, we classified the patients in the two independent cohorts into low-risk (predicted probability 0.14–24.40%) and high-risk (predicted probability 31.96–45.40%) populations, using as the cutoff the leaf node with the maximum Youden index (0.71) in the training cohort (Figure 6A). By these criteria, there were 110 (79.1%) low-risk and 29 (20.9%) high-risk patients in the training cohort, and 80 (83.3%) low-risk and 16 (16.7%) high-risk patients in the validation cohort.
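The cutoff selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, and each leaf's predicted probability is tried in turn as the threshold for the high-risk group. The leaf counts below are hypothetical and are not taken from the study; the helper `youden_by_cutoff` is likewise an illustrative name.

```python
def youden_by_cutoff(leaves):
    """For each leaf probability used as the high-risk threshold, return (cutoff, sens, spec, J)."""
    total_events = sum(e for _, e, _ in leaves)
    total_nonevents = sum(n for _, _, n in leaves)
    results = []
    for cutoff, _, _ in sorted(leaves):
        tp = sum(e for p, e, _ in leaves if p >= cutoff)  # events classified as high-risk
        fp = sum(n for p, _, n in leaves if p >= cutoff)  # non-events classified as high-risk
        sens = tp / total_events
        spec = 1 - fp / total_nonevents
        results.append((cutoff, sens, spec, sens + spec - 1))
    return results

# Hypothetical leaves: (predicted probability, patients with LNM, patients without LNM).
leaves = [(0.02, 1, 49), (0.10, 1, 29), (0.20, 2, 18), (0.35, 6, 9), (0.45, 5, 5)]
best = max(youden_by_cutoff(leaves), key=lambda row: row[3])
print(f"best cutoff = {best[0]:.2f}, sensitivity = {best[1]:.2f}, "
      f"specificity = {best[2]:.2f}, J = {best[3]:.2f}")
```

Patients in leaves at or above the selected probability form the high-risk group, and all remaining patients form the low-risk group.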
The incidence of LNM differed significantly between low-risk (3.6%) and high-risk (41.4%) patients in the training cohort (P<0.001; Figure 6B). A subsequent multivariable logistic regression analysis showed that the CT-DTA model, when analyzed as a risk stratification tool, ranked among the strongest risk factors for LNM (OR: 12.01; 95% CI: 2.32–62.32; P=0.003; Table 5). Risk stratification according to the CT-DTA model was further externally validated, since high-risk patients (43.8%) had a significantly higher incidence of LNM than low-risk patients (7.5%) in the validation cohort (P<0.001; Figure 6B). Finally, when the CT-DTA model was evaluated as a risk stratification tool, its independent predictive value for the risk of LNM remained stable in the validation cohort, as demonstrated by a multivariable logistic regression analysis (OR: 8.11; 95% CI: 1.19–55.30; P=0.033; Table 5).
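For readers who wish to reproduce the unadjusted comparison, the sketch below rebuilds the 2×2 tables from the reported group sizes and incidences and computes a crude odds ratio with a two-sided Fisher's exact test. The patient counts are back-calculated from the rounded percentages (for example, 41.4% of 29 patients corresponds to 12 patients), which is an assumption, and Fisher's exact test is chosen here for illustration; it is not necessarily the test behind the P values in Figure 6B.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def pmf(x):
        # Hypergeometric probability of x events in the first row with fixed margins.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    observed = pmf(a)
    lo, hi = max(0, row1 - (total - col1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= observed * (1 + 1e-9))

def odds_ratio(a, b, c, d):
    """Unadjusted (cross-product) odds ratio."""
    return (a * d) / (b * c)

# Rows: high-risk, low-risk; columns: LNM, no LNM (counts inferred from percentages).
cohorts = {"training": (12, 17, 4, 106), "validation": (7, 9, 6, 74)}
for name, table in cohorts.items():
    print(f"{name}: crude OR = {odds_ratio(*table):.2f}, "
          f"Fisher P = {fisher_exact_two_sided(*table):.2g}")
```

The crude odds ratios obtained this way are not expected to match the adjusted ORs in Table 5, because the latter come from multivariable models that include other clinicopathological covariables.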
Table 5
| Estimated characteristics | OR (95% CI) (multivariable analysis)† | P value |
|---|---|---|
| Training cohort | | |
| Risk stratification by CT-DTA model (high-risk vs. low-risk) | 12.01 (2.32–62.32) | 0.003 |
| Gender (male vs. female) | 1.91 (0.13–29.04) | 0.64 |
| Smoking status (current/former vs. never) | 1.08 (0.064–18.40) | 0.96 |
| Pleural invasion (present vs. absent) | 1.34 (0.30–5.96) | 0.70 |
| Lymphovascular invasion (present vs. absent) | 7.48 (1.45–38.65) | 0.016 |
| T stage of the maximal lesion (T3–4 vs. T2 vs. Tis–1) | 5.35 (0.38–76.92) | 0.21 |
| T stages of the lesions (T2–4 + T2–4 vs. Tis–1 + T2–4 vs. Tis–1 + Tis–1) | 15.74 (1.88–131.46) | 0.011 |
| Presence of lobulation (≥2 lesions vs. 1 lesion vs. absent) | 2.73 (0.26–28.17) | 0.40 |
| Validation cohort | | |
| Risk stratification by CT-DTA model (high-risk vs. low-risk) | 8.11 (1.19–55.30) | 0.033 |
| Pleural invasion (present vs. absent) | 13.28 (0.092–1,924.86) | 0.31 |
| Lymphovascular invasion (present vs. absent) | 21.88 (3.45–138.67) | 0.001 |
| T stage of the maximal lesion (T3–4 vs. T2 vs. Tis–1) | 1.31 (0.070–24.41) | 0.86 |
| T stages of the lesions (T2–4 + T2–4 vs. Tis–1 + T2–4 vs. Tis–1 + Tis–1) | 6.94 (0.43–111.11) | 0.17 |
| Presence of pleural indentation (≥2 lesions vs. 1 lesion vs. absent) | 3.01 (0.68–13.36) | 0.15 |
†, the multivariable binary logistic regression model based on the training cohort: Hosmer-Lemeshow test P=0.84; the multivariable binary logistic regression model based on the validation cohort: Hosmer-Lemeshow test P=0.75. CI, confidence interval; CT-DTA, computed tomography-based multi-parametric decision tree algorithm; OR, odds ratio; SMPLC, synchronous multiple primary lung cancer.
Subgroup analyses on the entire cohort
As indicated by the AUC values from subgroup ROC analyses, the predictive accuracy of the CT-DTA model for the risk of LNM remained reliable across all subgroups defined by age, gender, smoking status, preoperative comorbidity, location of lesions, histology, pleural invasion, lympho-vascular invasion, and T-stages of the lesions (Figure 7).
A forest plot of the ORs of the CT-DTA model from subgroup multivariable logistic regression analyses is shown in Figure 8. After controlling for confounding by other clinicopathological covariables, the CT-DTA model remained a significant independent risk factor for LNM across all subgroups stratified by gender, smoking status, histology, pleural invasion, lympho-vascular invasion, and T-stages of the lesions. Furthermore, the CT-DTA model independently predicted the risk of LNM among elderly patients, patients without any underlying comorbidity, and those with lesions on the ipsilateral side of the lung (Figure 8).
Discussion
Key results and interpretations
To our knowledge, this is the first study to employ a DTA modeling technique for multi-parametric risk assessment based on imaging features measured on chest TS-CT specifically in surgically resectable SMPLC. In this multi-center study, we established a novel, non-invasive CT-DTA model that integrates five key preoperative determinants from chest TS-CT, namely CTR, presence of spiculation, number of PSNs, and the long-axis diameters of the lesions and of the solid portion in the lesions, in order to predict the incidence of LNM before surgery for SMPLC. After external validation by a series of analyses, the CT-DTA model performed well enough that it may alert thoracic surgeons to a high risk of LNM in advance.
One of our aims was to introduce a novel DTA model into conventional radiological evaluation in the current clinical practice of surgically resectable SMPLC. For the first time, this study reported multivariable results on CT-derived features possibly related to the risk of LNM in surgical patients with SMPLC, but none of these features showed independent predictive value. We therefore developed a DTA model incorporating five critical imaging parameters that were statistically significant in the univariable analyses, and validated it by ROC analysis, calibration curves and multivariable logistic regression analysis as a clinically useful risk assessment tool for identifying patients who are likely to develop N1–2 stage LNM. The seven leaf nodes of the CT-DTA model were generated by binary splits of five pivotal CT-based imaging features of pulmonary nodules: CTR, presence of spiculation, number of PSNs, and the long-axis diameters of the lesions and of the solid portion in the lesions. The following evidence from recent investigations may help to explain the predictive strength of our CT-DTA model.
First, a marginal spiculation sign on chest CT may indicate fibrotic stroma or desmoplasia, which is characterized by remodeling of the tumor microenvironment through desmoplastic reaction, proliferation of fibroblasts and dense deposition of extracellular matrix, all of which help convert normal stroma into tumor stroma and thereby enhance the growth and viability of cancer cells (16). During oncogenesis, cancer-associated fibroblasts produce a variety of tumor growth factors, cytokines, chemokines and immune modulators and play an essential role in tissue fibrosis and desmoplasia (17). Therefore, the presence of spiculation is generally associated with LNM or distant metastasis from primary NSCLC, even at an early stage when the disease is newly diagnosed. Second, it has been well demonstrated, including by a meta-analysis, that a high proportion of solid component in a nodule on chest CT is significantly associated with unfavorable pathological characteristics and poorer overall survival in NSCLC (18,19). A CTR ≥0.90, especially when present in all existing nodules, also proved to be a key contributor to the occurrence of LNM in this CT-DTA model. Moreover, we have previously reported the substantial prognostic roles of the long-axis diameter of the lesion and of PSNs for spread through air spaces and worse overall survival in single or multiple lung adenocarcinoma (13,20). Choi et al. (21) also emphasized the importance of lymph node evaluation in pure-solid NSCLC no larger than 2 cm in long-axis diameter, since such PSNs were almost always strongly linked to invasive pathological features.
Another highlight of our multi-center study was that both ROC analysis and multivariable logistic regression analysis were performed to assess the clinical significance of the CT-DTA model within each patient subgroup. The CT-DTA model was validated for predicting the risk of LNM across nearly all subgroups defined by clinicopathological characteristics, especially in those traditionally regarded as low-risk populations, such as female non-smokers, patients without underlying comorbidity or pleural or lympho-vascular invasion, and patients diagnosed with early-stage tumors. For the subgroups that failed to yield significant ORs, we speculate that the limited sample size within those subgroups reduced the statistical power of the risk evaluation.
Clinical implications
Our findings support the use of a novel, easy-to-use and externally validated CT-DTA model for risk stratification prior to curative-intent resection for SMPLC, in order to identify more accurately the patients who have a higher risk of LNM. The predicted probability can be read directly from the CT-DTA model, whose imaging features can be measured conveniently and non-invasively on chest TS-CT in routine practice. Moreover, the preoperative predictive accuracy for LNM in SMPLC may be substantially improved with the assistance of the CT-DTA model. This CT-DTA model is therefore proposed to aid thoracic surgeons in more accurate risk evaluation and to facilitate decision-making on more individualized treatment plans following surgery, so as to limit potential adverse events.
Limitations
Despite these findings, several limitations of this study should be acknowledged. First, it was designed as a retrospective cohort study based on three prospectively maintained datasets, with external validation and internal subgroup analyses. Owing to the intrinsic limitations of its retrospective nature, potential selection bias, such as variations in clinical pathways and treatment options across centers, may have weakened the evidence for the CT-DTA model as a reliable risk prediction tool. Second, because SMPLC is a rare subtype of NSCLC, the sample size was relatively small even though patients were enrolled from three high-volume tertiary centers, which may have reduced the statistical power. Thus, a prospective study covering many more tertiary centers is urgently needed, and our research team plans such a project on the basis of the present study. Third, our CT-DTA model cannot predict which nodal stations are involved or how many lymph nodes contain metastatic cancer cells. Finally, qualitative evaluation of radiological features sometimes depends on the expertise of radiologists, and this subjective factor may have introduced confounding.
Conclusions
In conclusion, this study proposes a novel, user-friendly and non-invasive DTA model based on multiple radiological features easily obtained on chest TS-CT for accurate risk prediction of LNM before surgical resection for SMPLC. The CT-DTA model can serve as a practical tool to improve the predictive performance of traditional risk assessment and to aid thoracic surgeons in decisions on necessary lymphadenectomy and adjunctive therapies for potential high-risk patients with SMPLC who intend to undergo radical surgery. Larger-scale multi-center prospective studies are warranted to further validate the clinical utility of the CT-DTA model.
Acknowledgments
We give special thanks to Dr. Chuanmiao Xie, from the Department of Radiology, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, for his great assistance in this study. We also give special thanks to Mrs. Hong Xie and Mrs. Peng Wang, from the Department of Medical English, West China School of Medicine, Sichuan University, for their English language editing of this manuscript.
Footnote
Reporting Checklist: The authors have completed the TRIPOD+AI reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-2440/rc
Funding: This work was supported by grants from
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-2440/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was approved by the Institutional Review Board of Sun Yat-sen University Cancer Center (No. B2022-293-Y01). All participating institutions were informed of and agreed to the study. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The participants were required to give informed consent before taking part.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Jia X, Wang Y, Zhang H, Sun D. Current status and quality of prognosis prediction models of non-small cell lung cancer constructed using computed tomography (CT)-based radiomics: a systematic review and radiomics quality score 2.0 assessment. Quant Imaging Med Surg 2024;14:6978-89.
2. Liu Z, Wang L, Gao S, Xue Q, Tan F, Li Z, Gao Y. Plasma metabolomics study in screening and differential diagnosis of multiple primary lung cancer. Int J Surg 2023;109:297-312.
3. Jiang C, Zhang Y, Fu F, Deng P, Chen H. A Shift in Paradigm: Selective Lymph Node Dissection for Minimizing Oversurgery in Early Stage Lung Cancer. J Thorac Oncol 2024;19:25-35.
4. Zhang R, Wang G, Lin Y, Wen Y, Huang Z, Zhang X, Yu X, Wang W, Xi K, Cerfolio RJ, D'Journo XB, Ruetzler K, Depypere L, Filosso PL, Zhang L; written on behalf of AME Thoracic Surgery Collaborative Group. Extent of resection and lymph node evaluation in early stage metachronous second primary lung cancer: a population-based study. Transl Lung Cancer Res 2020;9:33-44.
5. Xie Z, Yang Y, Niu Z, Mao G, Zhu X, Xu Z, Yang D, Wang H, Wang J. Preoperative computed tomography semantic features in predicting lymph node metastasis of part-solid nodules in non-small cell lung cancer: a multicenter retrospective study. Quant Imaging Med Surg 2024;14:5151-63.
6. Xie X, Yan H, Liu K, Guan W, Luo K, Ma Y, Xu Y, Zhu Y, Wang M, Shen W. Value of dual-layer spectral detector CT in predicting lymph node metastasis of non-small cell lung cancer. Quant Imaging Med Surg 2024;14:749-64.
7. Luo Y, Li S, Ma H, Zhang W, Liu B, Xie C, Li Q. CT-based decision tree model for predicting EGFR mutation status in synchronous multiple primary lung cancers. J Thorac Dis 2023;15:1196-209.
8. Wan F, He W, Zhang W, Zhang H, Zhang Y, Guang Y. Application of decision tree algorithms to predict central lymph node metastasis in well-differentiated papillary thyroid carcinoma based on multimodal ultrasound parameters: a retrospective study. Quant Imaging Med Surg 2023;13:2081-97.
9. Kozower BD, Larner JM, Detterbeck FC, Jones DR. Special treatment issues in non-small cell lung cancer: Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest 2013;143:e369S-99S.
10. Detterbeck FC, Boffa DJ, Kim AW, Tanoue LT. The Eighth Edition Lung Cancer Stage Classification. Chest 2017;151:193-203.
11. Fernandez FG, Falcoz PE, Kozower BD, Salati M, Wright CD, Brunelli A. The Society of Thoracic Surgeons and the European Society of Thoracic Surgeons general thoracic surgery databases: joint standardization of variable definitions and terminology. Ann Thorac Surg 2015;99:368-76.
12. Zhang Y, Li G, Li Y, Liu Q, Yu Y, Ma Y, Pan Y, Zhang Y, Hu H, Sun Y, Zhang Y, Xiang J, Chen H. Imaging Features Suggestive of Multiple Primary Lung Adenocarcinomas. Ann Surg Oncol 2020;27:2061-70.
13. Ma H, Li S, Zhu Y, Zhang W, Luo Y, Liu B, Gou W, Xie C, Li Q. A Novel Prognostic Score Based on Multiple Quantitative Parameters of Chest CT for Patients with Synchronous Multiple Primary Lung Cancer: Is Solid Component Size a Better Prognostic Indicator? Ann Surg Oncol 2023;30:3769-78.
14. Grant SW, Collins GS, Nashef SAM. Statistical Primer: developing and validating a risk prediction model. Eur J Cardiothorac Surg 2018;54:203-8.
15. Wang Y, Zhang L, Jiang Y, Cheng X, He W, Yu H, Li X, Yang J, Yao G, Lu Z, Zhang Y, Yan S, Zhao F. Multiparametric magnetic resonance imaging (MRI)-based radiomics model explained by the Shapley Additive exPlanations (SHAP) method for predicting complete response to neoadjuvant chemoradiotherapy in locally advanced rectal cancer: a multicenter retrospective study. Quant Imaging Med Surg 2024;14:4617-34.
16. Kim H, Park CM. Tumor-associated prognostic factors extractable from chest CT scans in patients with lung cancer. Transl Lung Cancer Res 2023;12:1133-9.
17. Yang H, Sun B, Ma W, Fan L, Xu K, Jia Y, Xu J, Wang Z, Yao F. Multi-scale characterization of tumor-draining lymph nodes in resectable lung cancer treated with neoadjuvant immune checkpoint inhibitors. EBioMedicine 2022;84:104265.
18. Jing W, Liu M, Li W, Li D, Wu Y, Lv F. Prognostic implication of consolidation-to-tumor ratio in early lung adenocarcinoma: a retrospective cross-sectional study. Quant Imaging Med Surg 2024;14:3366-80.
19. Nie Y, Wang X, Yang F, Zhou Z, Wang J, Chen K. Surgical Prognosis of Synchronous Multiple Primary Lung Cancer: Systematic Review and Meta-Analysis. Clin Lung Cancer 2021;22:341-350.e3.
20. Liu BC, Ma HY, Huang J, Luo YW, Zhang WB, Deng WW, Liao YT, Xie CM, Li Q. Does dual-layer spectral detector CT provide added value in predicting spread through air spaces in lung adenocarcinoma? A preliminary study. Eur Radiol 2024;34:4176-86.
21. Choi S, Yoon DW, Shin S, Kim HK, Choi YS, Kim J, Shim YM, Cho JH. Importance of Lymph Node Evaluation in ≤2-cm Pure-Solid Non-Small Cell Lung Cancer. Ann Thorac Surg 2024;117:586-93.

