Development and validation of intratumoral and peritumoral CT-based radiomics nomograms for predicting malignancy and invasiveness of pulmonary ground-glass nodules: a retrospective study
Introduction
With the widespread adoption of low-dose computed tomography (CT) screening, heightened individual health awareness, and the extensive use of artificial intelligence (AI), the detection rate of pulmonary nodules has significantly increased, particularly for ground-glass nodules (GGNs) (1,2). The pathological outcomes of GGNs can be malignant tumors or benign lesions, which include inflammatory lesions, lung interstitial diseases, and more (3-5). It has been reported that the incidence of malignancy in GGNs is considerably higher than that of solid nodules, especially for mixed GGNs (mGGNs) (6). Despite the high probability of malignancy associated with GGNs, many surgically resected GGNs have been confirmed as benign. These benign GGNs are often misdiagnosed as lung cancers, leading to unnecessary surgical excisions. An accurate and effective non-invasive tool is needed to predict the malignancy of lung GGNs preoperatively. Currently, it remains very challenging for radiologists to visually discriminate between benign and malignant GGNs directly from CT images as the CT features of benign and malignant GGNs are often atypical (7).
Lung adenocarcinoma can be pathologically classified into four subtypes: atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), and invasive adenocarcinoma (IAC), according to the international multidisciplinary classification proposed in 2011 by the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS), and the European Respiratory Society (ERS) (8). The clinical management strategies, surgical procedures, and prognosis for non-invasive GGNs such as AAH, AIS, and MIA differ significantly from those for invasive GGNs such as IAC (9-11). Therefore, a definitive preoperative evaluation of the invasiveness of GGNs is essential for developing personalized clinical management strategies, selecting optimal surgical plans, and improving prognosis.
Radiomics is a newly emerging form of imaging analysis that has the potential to transform digital medical images into countless quantitative features revealing pathophysiology. It is capable of extracting and quantifying subtle image features to explore their potential associations with tumor pathophysiology (12-15). However, most previous radiomics studies have mainly focused on the evaluation of the intratumoral region of GGNs while neglecting the peritumoral microenvironments, including tissue surrounding the tumor. To our knowledge, no radiomics-based studies have explored the peritumoral region beyond the tumor to distinguish benign and malignant GGNs. Given that peritumoral radiomic features have been reported to show promise in providing additional value for the clinical assessment of tumor aggressive biological behavior (16-19), we hypothesized that peritumoral microenvironmental features would provide a wealth of useful information for predicting the benign or malignant status of GGNs. Additionally, CT-based radiomics analysis for the prediction of the degree of invasiveness of GGNs is still insufficient, and a consensus has not been reached.
Herein, the first goal of our study was to explore the additional value of peritumoral CT radiomics in predicting the malignancy and invasiveness of GGNs by evaluating radiomic signatures capturing subtle changes from the gross tumor volume (GTV) and peritumoral volume (PTV) and comparing their performance. Our ultimate goal was to construct a noninvasive, comprehensive preoperative nomogram integrating CT radiomics parameters, radiological imaging features, and clinical characteristics to enhance the diagnostic accuracy for GGNs and to facilitate personalized clinical decision-making. The objectives of this study were to develop and internally validate multivariable radiomics nomograms integrating intratumoral and peritumoral features with clinical predictors for predicting (I) the malignancy status of pulmonary GGNs and (II) the invasiveness of malignant GGNs. We present this article in accordance with the TRIPOD reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-2026-1-0049/rc).
Methods
Patient population
The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study was approved by the Institutional Review Board of The First Affiliated Hospital of the University of Science and Technology of China (ethics approval number: 2025-RE-164), and the requirement for individual consent for this retrospective analysis was waived. The patient inclusion criteria were as follows: (I) patients with GGNs confirmed by surgical excision and histopathological diagnosis; (II) thoracic CT examination with thin-slice thickness (0.625–1 mm) and satisfactory image quality performed prior to any therapy; (III) GGNs less than 3 cm in diameter on lung window CT images; and (IV) complete clinical data. The exclusion criteria were as follows: (I) missing postoperative histopathological information; (II) lack of preoperative thin-section thoracic CT scans; (III) low CT image quality with severe breathing artifacts; (IV) GGN diameter greater than 3 cm; and (V) incomplete clinical data. No specific treatments were administered prior to CT imaging or surgery. We defined two binary clinical endpoints: (I) malignancy status (benign inflamed or infected lesions versus malignant GGNs including AAH, AIS, MIA, and IAC), and (II) invasiveness among pathologically confirmed malignant GGNs (non-invasive lesions: AAH, AIS, and MIA versus invasive lesion: IAC). Histopathological examination of surgically resected specimens served as the gold-standard reference for both endpoints. Separate binary classification models were developed for each endpoint. From January 2019 to June 2025, 1,280 patients with a total of 1,372 GGNs who underwent CT imaging at The First Affiliated Hospital of the University of Science and Technology of China and met the criteria were enrolled in our study. Clinical characteristics, including gender, age, smoking history, and family history, were obtained from medical records.
The following seven CT imaging features for each GGN, evaluated by consensus of two radiologists, were recorded: attenuation, maximum tumor diameter, spiculation, lobulation, vacuole, air bronchogram, and pleural indentation. Patients (rather than individual lesions) were randomly split into training and testing sets in a 7:3 ratio to avoid data leakage from patients with multiple GGNs. The overall patient enrollment flowchart of this study is depicted in Figure 1.
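The patient-level split described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function name, the identifiers, and the random seed are hypothetical, and only the 7:3 ratio and the grouping by patient come from the text.

```python
import random
from collections import defaultdict


def split_by_patient(lesions, test_fraction=0.3, seed=0):
    """Split lesions so that all nodules of one patient land in the same set.

    lesions: list of (patient_id, lesion_id) pairs (hypothetical identifiers).
    """
    by_patient = defaultdict(list)
    for patient_id, lesion_id in lesions:
        by_patient[patient_id].append(lesion_id)

    patients = sorted(by_patient)          # deterministic order before shuffling
    random.Random(seed).shuffle(patients)  # shuffle patients, not lesions

    n_test = round(len(patients) * test_fraction)
    test = [(p, l) for p in patients[:n_test] for l in by_patient[p]]
    train = [(p, l) for p in patients[n_test:] for l in by_patient[p]]
    return train, test


# Toy cohort: 10 patients, patient "P03" has two nodules.
lesions = [(f"P{i:02d}", f"P{i:02d}-a") for i in range(10)] + [("P03", "P03-b")]
train, test = split_by_patient(lesions)
```

Because the shuffle operates on patient identifiers, both nodules of "P03" always end up on the same side of the split, which is the property that prevents leakage.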
Image acquisition and segmentation
For all patients enrolled in our study, non-enhanced thoracic CT images were acquired with one of four multi-detector CT scanners; the detailed scanning parameters are listed in Table 1. All patients were placed in the supine position, and whole-lung scans were performed from the thoracic inlet to the lung bases at end-inspiration. The images were reconstructed with a slice thickness of 0.625–1 mm using soft-tissue and lung algorithms.
Table 1
| Parameters | CT 256 [scanner 1] | CT 64 [scanner 2] | CT 64 [scanner 3] | CT 64 [scanner 4] |
|---|---|---|---|---|
| Manufacturer | General Electric | General Electric | Philips | Siemens |
| Tube voltage (kV) | 120 | 120 | 120 | 120 |
| Tube current (mA) | 200 | 180 | 180 | 180 |
| Matrix | 512×512 | 512×512 | 512×512 | 512×512 |
| Slice thickness (mm) | 0.625 | 0.625 | 0.625 | 1.000 |
| Detector collimation (mm) | 128×0.625 | 64×0.625 | 64×0.625 | 64×0.625 |
CT, computed tomography.
All regions of interest (ROIs) for the GGNs were manually delineated by a radiologist with 8 years of experience in pulmonary CT diagnosis using the ITK-SNAP software (version 3.8, www.itksnap.org) and further confirmed by another senior radiologist with 15 years of experience in pulmonary CT diagnosis. All radiologists were blinded to the pathological results. Two types of ROIs—GTV and PTV—were delineated from the CT images. For GTV segmentations, GGNs were manually delineated along their edges on each consecutive axial two-dimensional (2D) CT slice using lung window settings [window level, –450 Hounsfield units (HU); window width, 1,500 HU]. All GGNs meeting the inclusion criteria were included irrespective of their location (isolated, juxta-vascular, or juxta-pleural). Bronchi and large vessels were manually excluded from the GTV contours. PTV1 and PTV2 were generated by isotropic three-dimensional (3D) expansion of the GTV by 5 and 10 mm, respectively. Any voxels extending beyond the lung parenchyma were automatically excluded using the lung mask generated from the original CT images. An example of two types of ROIs including GTV and PTV is shown in Figure 2. To evaluate the reproducibility of the ROI segmentation, the intra-observer and inter-observer segmentations were confirmed by two experienced radiologists with 6 years and 8 years of experience, respectively.
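The construction of the peritumoral ROIs can be illustrated with a small sketch. It assumes, in line with the 0–5 mm and 5–10 mm ranges reported for PTV1 and PTV2 in the Results, that each PTV is a shell around the GTV restricted to the lung mask. The function name, the voxel-center distance measure, and the brute-force search are illustrative assumptions; the expansion tool actually used is not specified in the text.

```python
import math


def peritumoral_shell(gtv, lung, spacing_mm, inner_mm, outer_mm):
    """Lung voxels whose distance to the nearest GTV voxel lies in (inner_mm, outer_mm].

    gtv, lung: sets of (z, y, x) voxel indices; spacing_mm: (dz, dy, dx).
    Brute-force nearest-voxel search, adequate only for small toy volumes.
    """
    def dist_mm(a, b):
        return math.sqrt(sum(((ai - bi) * s) ** 2 for ai, bi, s in zip(a, b, spacing_mm)))

    shell = set()
    for voxel in lung - gtv:               # never include tumor voxels
        d = min(dist_mm(voxel, g) for g in gtv)
        if inner_mm < d <= outer_mm:
            shell.add(voxel)
    return shell


# Toy example: a single row of 1 mm voxels, one GTV voxel at x = 12,
# lung mask covering x = 5..24 (voxels at x < 5 count as outside the lung).
spacing = (1.0, 1.0, 1.0)
gtv = {(0, 0, 12)}
lung = {(0, 0, x) for x in range(5, 25)}

ptv1 = peritumoral_shell(gtv, lung, spacing, 0.0, 5.0)    # 0-5 mm shell
ptv2 = peritumoral_shell(gtv, lung, spacing, 5.0, 10.0)   # 5-10 mm shell
```

In the toy row, the 5–10 mm shell loses the voxels at x = 2 to 4 because they fall outside the lung mask, mirroring the exclusion of voxels beyond the lung parenchyma. For full-size CT volumes a Euclidean distance transform would be the practical choice.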
Radiomics feature extraction
PyRadiomics (version 3.0) was used to extract a total of 1,372 radiomics features per ROI, including first-order, shape, gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size-zone matrix (GLSZM), neighboring gray-tone difference matrix (NGTDM), and gray-level dependence matrix (GLDM) features from the original image as well as those obtained after Laplacian of Gaussian (LoG), wavelet, and gradient transformations. The CT images were resampled to a uniform isotropic voxel size of 1×1×1 mm³ (original voxel size range: in-plane 0.5–0.8 mm, slice thickness 0.625–1 mm) and preprocessed with Gaussian smoothing to reduce noise.
The LoG, wavelet, and gradient transformations enabled the extraction of features across multiple frequency bands, capturing both low- and high-frequency intensity patterns. Image discretization was performed with a fixed bin width of 25 HU using absolute intensity resampling (no relative resampling was applied), which enhances the robustness of the feature extraction process.
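Fixed-bin-width discretization can be sketched in a few lines. The example below follows the fixed-bin-width formula as documented for PyRadiomics, in which bin edges are aligned to multiples of the bin width; the function name and the HU values are made up for illustration.

```python
import math


def discretize_fixed_bin_width(hu_values, bin_width=25):
    """Map HU values inside an ROI to 1-based bin indices using a fixed bin width.

    Bin edges sit at multiples of `bin_width`, so the edge grid is the same
    for every ROI; only the index of the lowest occupied bin differs.
    """
    lowest_bin = math.floor(min(hu_values) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in hu_values]


roi_hu = [-712, -690, -655, -601, -598, -430]   # made-up HU values
bins = discretize_fixed_bin_width(roi_hu)       # [1, 2, 3, 5, 6, 12]
```

Note that -601 HU and -598 HU fall into different bins because an edge lies at -600 HU, whereas a relative scheme with a fixed number of bins would place the edges differently for each ROI.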
To assess the robustness of the extracted radiomic features, five pseudo masks were generated from the original annotated ROI by applying random variations to the original annotation, simulating potential segmentation fluctuations. Intraclass and interclass correlation coefficients (ICCs) were used to evaluate the intra- and inter-observer agreement of feature extraction. Features with an ICC greater than 0.8 were considered to have good consistency; features with an ICC lower than 0.8 were deemed to have poor consistency and were removed.
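The ICC-based filter can be illustrated as follows. The text does not state which ICC form was used, so the sketch assumes the one-way random-effects ICC(1,1) purely for illustration; the function name and the feature values are hypothetical.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one row per lesion, one column per segmentation
    (e.g., the original ROI plus perturbed masks); all rows have equal length.
    """
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-lesion and within-lesion mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# One feature measured on 4 lesions under 3 segmentations (made-up values).
stable = [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 2.9, 3.1], [4.0, 4.1, 3.9]]
unstable = [[1.0, 3.0, 2.0], [2.0, 1.0, 3.0], [3.0, 2.0, 1.0], [2.0, 3.0, 1.0]]

keep_stable = icc_one_way(stable) > 0.8      # feature retained
keep_unstable = icc_one_way(unstable) > 0.8  # feature removed
```

A feature whose value barely moves across segmentations relative to its spread across lesions scores close to 1 and passes the 0.8 threshold; a feature dominated by segmentation noise does not.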
Radiomics feature selection and CT-based radiomics model construction
Prior to feature selection and modeling, all radiomics features were z-score normalized (standardized to a mean of 0 and a standard deviation of 1) using parameters derived exclusively from the training set. The identical transformation was subsequently applied to the testing set.
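The normalization step can be sketched as below. The key point is that the mean and standard deviation are estimated on the training set and reused unchanged on the testing set. The use of the sample standard deviation and all names and values are assumptions for illustration.

```python
from statistics import mean, stdev


def fit_zscore(train_values):
    """Return (mean, sd) estimated from the training set only."""
    return mean(train_values), stdev(train_values)


def apply_zscore(values, mu, sd):
    """Standardize values with previously fitted training parameters."""
    return [(v - mu) / sd for v in values]


train_feature = [10.0, 12.0, 14.0, 16.0, 18.0]   # made-up feature values
test_feature = [11.0, 20.0]

mu, sd = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mu, sd)
test_z = apply_zscore(test_feature, mu, sd)       # reuses training mu and sd
```

The standardized testing values need not have mean 0 or unit variance; that is expected, because no testing-set statistics enter the transformation.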
Following feature extraction and the calculation of ICCs across the five pseudo masks, only features with ICC values greater than 0.8, indicating good reproducibility, were retained. These features were then subjected to univariate logistic regression to assess their individual statistical significance. Features with P values less than 0.05 were selected for further analysis using least absolute shrinkage and selection operator (LASSO) regression.
For the combined radiomics models (GTV + PTV1 and GTV + PTV1 + PTV2), early fusion was implemented by concatenating the radiomics features extracted from the respective ROIs into a single feature vector prior to feature selection and model construction.
LASSO regression was used to select the most predictive features while minimizing overfitting. The optimal penalty coefficient was determined through internal cross-validation, ensuring that the selected features were robust and generalizable. Following LASSO selection, stepwise backward regression was applied using the Akaike information criterion (AIC) to optimize the model fit and retain the most predictive features.
The optimal radiomics features, chosen from the candidate features, were used to establish radiomics models. Radiomics scores (Radscores) were calculated by summing the selected features weighted by their coefficients.
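The Radscore computation is a weighted sum and can be written compactly. In the sketch below the feature names, coefficient values, and the optional intercept are hypothetical; only the linear-combination form comes from the text.

```python
def radscore(features, coefficients, intercept=0.0):
    """Linear combination of selected, standardized features.

    features, coefficients: dicts keyed by feature name; features that were
    not selected (absent from `coefficients`) are ignored.
    """
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)


# Hypothetical coefficients and standardized feature values for one nodule.
coefficients = {"glcm_ClusterProminence": 0.5, "shape_MajorAxisLength": 1.0}
features = {"glcm_ClusterProminence": 2.0, "shape_MajorAxisLength": -0.5, "unused": 9.9}

score = radscore(features, coefficients, intercept=0.25)   # 0.25 + 1.0 - 0.5 = 0.75
```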
The clinical characteristics and CT imaging features with statistical significance in the univariable logistic regression analysis were selected for a multivariable logistic regression to identify independent clinical predictors of malignancy and invasiveness of GGNs and build clinical factor models. The radiomics nomogram was constructed based on these independent clinical predictors and the Radscores of the combined radiomics model.
Statistical analysis
All statistical analyses were performed using either Python (version 3.11; Python Software Foundation, Wilmington, DE, USA) or R software (version 4.3.3; R Foundation for Statistical Computing, Vienna, Austria). The normality of continuous variables was assessed with the Shapiro-Wilk test. For normally distributed data, the independent t-test was used, whereas non-normally distributed data were analyzed using the Mann-Whitney U test. The chi-square test or Fisher’s exact test was applied to compare categorical variables, as appropriate. Model performance was assessed using receiver operating characteristic (ROC) curves, with the area under the curve (AUC) calculated, and the DeLong test was used to compare AUCs across different models. There were no missing data for the predictors or outcomes in the final cohort, as only patients with complete clinical, imaging, and histopathological information were included (complete-case analysis).
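As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is equivalent to the normalized Mann-Whitney U statistic. The function name and the data are made up; this is not the software routine used in the study.

```python
def auc_rank(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Equivalent to the Mann-Whitney U statistic divided by n_pos * n_neg;
    ties between a positive and a negative count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 1, 0, 0]                 # made-up outcomes (1 = malignant)
scores = [0.9, 0.7, 0.4, 0.5, 0.2]       # made-up predicted probabilities
auc = auc_rank(scores, labels)           # 5 of 6 positive-negative pairs ordered correctly
```

Under this definition, an AUC below 0.5 means the scores rank negatives above positives more often than not.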
Predictors were z-score normalized in the training set. Feature selection was performed using univariate logistic regression (P<0.05), followed by LASSO regression with 10-fold cross-validation and stepwise backward selection. Logistic regression models were developed. Internal validation was performed on a single held-out testing set (patient-level split). No model updating was performed. Model calibration was checked using calibration curves and further validated by the Hosmer-Lemeshow test (20). Decision curve analysis (DCA) was performed to examine the clinical net benefit of the model across various threshold probabilities (21). Statistical significance was set at P<0.05, with the Bonferroni correction applied for multiple comparisons. Variables with P values less than 0.05 in multivariate regression were considered independent predictors and were included in the final model.
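The net benefit plotted in a decision curve follows the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. A minimal sketch with made-up data and hypothetical function names:

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating every case whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


labels = [1, 1, 1, 0, 0]                       # made-up outcomes
probabilities = [0.9, 0.7, 0.4, 0.5, 0.2]      # made-up predicted risks
nb_model = net_benefit(probabilities, labels, threshold=0.6)
nb_all = net_benefit_treat_all(labels, threshold=0.6)
```

A decision curve repeats this calculation over a range of thresholds and compares the model against the treat-all and treat-none (net benefit 0) strategies.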
Results
Patient characteristics and construction of a clinical factor model
A total of 1,280 patients were enrolled in our study, including 1,372 GGNs. Among these GGNs, 236 were defined as inflamed or infected benign lesions, 55 as AAH, 301 as AIS, 695 as MIA, and 339 as IAC. The clinical characteristics and CT imaging features of these patients are shown in Table 2.
Table 2
| Features | Training set: benign group | Training set: malignant group | P value | Training set: preinvasive group | Training set: invasive group | P value | Validation set: benign group | Validation set: malignant group | P value | Validation set: preinvasive group | Validation set: invasive group | P value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gender | | | 0.008 | | | 0.023 | | | 0.223 | | | 0.032 |
| Male | 65 | 263 | | 187 | 76 | | 27 | 123 | | 81 | 42 | |
| Female | 84 | 548 | | 403 | 145 | | 35 | 224 | | 174 | 50 | |
| Age, years | 50.70 | 53.07 | 0.020 | 51.15 | 58.21 | <0.001 | 51.50 | 54.44 | 0.063 | 52.20 | 60.63 | <0.001 |
| Smoking history | | | 0.235 | | | 0.277 | | | 0.205 | | | 0.014 |
| Yes | 6 | 19 | | 16 | 3 | | 4 | 11 | | 4 | 7 | |
| No | 143 | 792 | | 574 | 218 | | 58 | 336 | | 251 | 85 | |
| Family history | | | 0.130 | | | 0.212 | | | >0.99 | | | >0.99 |
| Yes | 2 | 3 | | 2 | 3 | | 0 | 0 | | 0 | 0 | |
| No | 147 | 808 | | 587 | 221 | | 62 | 347 | | 255 | 92 | |
| Attenuation | | | 0.237 | | | <0.001 | | | 0.401 | | | <0.001 |
| Pure | 76 | 371 | | 314 | 57 | | 28 | 137 | | 116 | 21 | |
| Part solid | 73 | 440 | | 276 | 164 | | 34 | 210 | | 139 | 71 | |
| Maximum tumor diameter, mm | 10.41 | 11.39 | 0.014 | 9.79 | 15.65 | <0.001 | 11.29 | 11.54 | 0.693 | 9.99 | 15.85 | <0.001 |
| Spiculation | | | <0.001 | | | <0.001 | | | <0.001 | | | <0.001 |
| Yes | 63 | 617 | | 411 | 206 | | 27 | 258 | | 178 | 80 | |
| No | 86 | 194 | | 179 | 15 | | 35 | 89 | | 77 | 12 | |
| Lobulation | | | <0.001 | | | <0.001 | | | <0.001 | | | <0.001 |
| Yes | 72 | 643 | | 443 | 200 | | 37 | 275 | | 193 | 82 | |
| No | 77 | 168 | | 147 | 21 | | 25 | 72 | | 62 | 10 | |
| Vacuole | | | <0.001 | | | <0.001 | | | <0.001 | | | <0.001 |
| Yes | 38 | 505 | | 381 | 125 | | 16 | 204 | | 161 | 43 | |
| No | 111 | 305 | | 209 | 96 | | 46 | 143 | | 94 | 49 | |
| Air bronchogram | | | <0.001 | | | <0.001 | | | 0.122 | | | <0.001 |
| Yes | 15 | 187 | | 79 | 108 | | 9 | 81 | | 39 | 42 | |
| No | 134 | 624 | | 511 | 113 | | 53 | 266 | | 216 | 50 | |
| Pleural indentation | | | 0.208 | | | <0.001 | | | 0.024 | | | <0.001 |
| Yes | 24 | 167 | | 71 | 96 | | 7 | 84 | | 35 | 49 | |
| No | 125 | 644 | | 519 | 125 | | 55 | 263 | | 220 | 43 | |
For malignancy predictions, Table 3 presents the risk factors as determined by univariate and multivariate logistic regression analyses. The final clinical factor model yielded an AUC of 0.774 in the training cohort and 0.724 in the validation cohort. For predictions of invasiveness, Table 4 shows the risk factors identified through univariate and multivariate logistic regression analyses. The final clinical factor model yielded an AUC of 0.815 in the training cohort and 0.798 in the validation cohort.
Table 3
| Clinical factors | Univariate analysis: P value | Univariate analysis: OR | Univariate analysis: 95% CI | Multivariate analysis: P value | Multivariate analysis: OR | Multivariate analysis: 95% CI |
|---|---|---|---|---|---|---|
| Gender | 0.003 | −0.548 | −0.908, −0.189 | 0.001 | −0.663 | −1.055, −0.270 |
| Age | 0.019 | 0.018 | 0.003, 0.033 | 0.107 | 0.014 | −0.003, 0.032 |
| Smoking history | 0.208 | −0.601 | −1.536, 0.334 | | | |
| Family history | 0.144 | −1.340 | −3.138, 0.458 | | | |
| Attenuation | 0.281 | 0.195 | −0.160, 0.549 | | | |
| Maximal tumor diameter | 0.049 | 0.041 | 0.000, 0.081 | 0.527 | −0.016 | −0.064, 0.033 |
| Spiculation | <0.001 | 1.383 | 1.017, 1.749 | 0.035 | 0.537 | 0.038, 1.035 |
| Lobulation | <0.001 | 1.348 | 0.981, 1.715 | 0.003 | 0.745 | 0.251, 1.239 |
| Vacuole | <0.001 | 1.557 | 1.157, 1.957 | <0.001 | 1.374 | 0.952, 1.796 |
| Air bronchogram | 0.001 | 0.939 | 0.380, 1.498 | 0.071 | 0.590 | −0.049, 1.229 |
| Pleural indentation | 0.202 | 0.310 | −0.167, 0.787 | | | |
CI, confidence interval; GGNs, ground-glass nodules; OR, odds ratio.
Table 4
| Clinical factors | Univariate analysis: P value | Univariate analysis: OR | Univariate analysis: 95% CI | Multivariate analysis: P value | Multivariate analysis: OR | Multivariate analysis: 95% CI |
|---|---|---|---|---|---|---|
| Gender | 0.832 | 0.037 | −0.306, 0.380 | | | |
| Age | <0.001 | 0.030 | 0.016, 0.043 | 0.077 | 0.014 | −0.002, 0.030 |
| Smoking history | 0.827 | 0.126 | −0.997, 1.248 | | | |
| Family history | 0.424 | −1.134 | −3.910, 1.643 | | | |
| Attenuation | <0.001 | 1.180 | 0.845, 1.516 | 0.002 | 0.605 | 0.219, 0.991 |
| Maximal tumor diameter | <0.001 | 0.190 | 0.141, 0.239 | <0.001 | 0.107 | 0.050, 0.164 |
| Spiculation | <0.001 | 1.886 | 1.531, 2.241 | <0.001 | 1.345 | 0.891, 1.799 |
| Lobulation | <0.001 | 1.322 | 0.962, 1.682 | 0.962 | 0.012 | −0.462, 0.485 |
| Vacuole | 0.002 | 0.507 | 0.183, 0.831 | 0.031 | 0.422 | 0.039, 0.806 |
| Air bronchogram | <0.001 | 1.356 | 0.852, 1.860 | 0.159 | 0.412 | −0.161, 0.984 |
| Pleural indentation | <0.001 | 1.289 | 0.760, 1.818 | 0.835 | 0.066 | −0.551, 0.683 |
CI, confidence interval; GGNs, ground-glass nodules; OR, odds ratio.
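Several entries in the OR columns of Tables 3 and 4 are negative, a value an odds ratio cannot take. One possible reading, stated here as an assumption and not confirmed by the text, is that these columns report logistic regression coefficients on the log-odds scale. Under that assumption, an odds ratio and its confidence limits are obtained by exponentiation, as sketched below with a hypothetical helper function.

```python
import math


def to_odds_ratio(beta, ci_low, ci_high):
    """Exponentiate a log-odds coefficient and its confidence limits."""
    return math.exp(beta), math.exp(ci_low), math.exp(ci_high)


# Gender, univariate analysis for malignancy (values copied from Table 3),
# treated as log-odds coefficients under the assumption stated above.
odds_ratio, or_low, or_high = to_odds_ratio(-0.548, -0.908, -0.189)
```

With these inputs the odds ratio is approximately 0.58, and the whole interval lies below 1, which agrees with the significant P value reported for gender.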
Radiomics feature selection
A total of 4,116 standardized radiomics features, comprising 1,372 features from the GTV region, 1,372 from the PTV1 region (0–5 mm), and 1,372 from the PTV2 region (5–10 mm), were analyzed using logistic regression (P<0.05) to identify variables significantly associated with the classification of benign versus malignant GGNs. A total of 926 out of 1,372 (67.49%) GTV features, 1,028 out of 1,372 (74.93%) PTV1 features, and 997 out of 1,372 (72.67%) PTV2 features demonstrated good reproducibility, with intra-class and inter-class ICCs >0.8. Subsequently, LASSO regression was employed to eliminate redundant and irrelevant features, resulting in 30 features being retained for each ROI. We applied 10-fold cross-validation to determine the tuning parameter λ value. Ultimately, for malignancy prediction, 18 significant features were selected from the GTV region, whereas only 10 and 4 significant features were identified from the PTV1 region and PTV2 region, respectively (Figures S1-S3). For invasiveness predictions, 7, 12, and 3 significant features were selected from the GTV region, PTV1 region, and PTV2 region, respectively (Figures S4-S6).
Radiomics model performance evaluation
For malignancy predictions, the performance of the five models to distinguish between benign and malignant GGNs in the training cohort and validation cohort is summarized in Table 5. DeLong test indicated that the AUC values of the GTV + PTV1 radiomics model and the GTV + PTV1 + PTV2 radiomics model were significantly different from those of the GTV radiomics model in both the training and validation cohorts (P<0.05).
Table 5
| Cohort and model | Malignancy: AUC | Malignancy: accuracy | Malignancy: sensitivity | Malignancy: specificity | Invasiveness: AUC | Invasiveness: accuracy | Invasiveness: sensitivity | Invasiveness: specificity |
|---|---|---|---|---|---|---|---|---|
| Training cohort | | | | | | | | |
| Clinical | 0.774 (0.733–0.812) | 0.706 (0.678–0.733) | 0.702 (0.672–0.734) | 0.730 (0.652–0.797) | 0.815 (0.755–0.865) | 0.778 (0.731–0.820) | 0.782 (0.729–0.830) | 0.768 (0.671–0.848) |
| GTV | 0.797 (0.762–0.831) | 0.606 (0.575–0.636) | 0.554 (0.520–0.588) | 0.902 (0.848–0.946) | 0.946 (0.930–0.960) | 0.885 (0.863–0.906) | 0.900 (0.874–0.923) | 0.839 (0.786–0.888) |
| PTV1 | 0.815 (0.782–0.847) | 0.687 (0.656–0.719) | 0.657 (0.622–0.691) | 0.862 (0.801–0.915) | 0.935 (0.907–0.960) | 0.875 (0.840–0.909) | 0.860 (0.818–0.899) | 0.920 (0.857–0.976) |
| PTV2 | 0.568 (0.515–0.617) | 0.761 (0.735–0.788) | 0.843 (0.817–0.867) | 0.298 (0.228–0.370) | 0.540 (0.493–0.585) | 0.383 (0.351–0.415) | 0.235 (0.202–0.266) | 0.843 (0.791–0.888) |
| GTV + PTV1 | 0.837 (0.807–0.867) | 0.691 (0.661–0.721) | 0.661 (0.627–0.695) | 0.861 (0.801–0.915) | 0.955 (0.942–0.966) | 0.896 (0.874–0.917) | 0.893 (0.864–0.915) | 0.914 (0.875–0.951) |
| GTV + PTV1 + PTV2 | 0.840 (0.811–0.869) | 0.673 (0.642–0.698) | 0.632 (0.598–0.663) | 0.889 (0.837–0.936) | 0.956 (0.943–0.967) | 0.899 (0.879–0.919) | 0.896 (0.872–0.919) | 0.912 (0.869–0.947) |
| Nomogram | 0.870 (0.840–0.896) | 0.681 (0.653–0.711) | 0.635 (0.603–0.669) | 0.945 (0.902–0.979) | 0.962 (0.948–0.974) | 0.902 (0.881–0.920) | 0.888 (0.863–0.911) | 0.945 (0.908–0.974) |
| Validation cohort | | | | | | | | |
| Clinical | 0.724 (0.650–0.790) | 0.672 (0.629–0.716) | 0.665 (0.619–0.714) | 0.708 (0.585–0.820) | 0.798 (0.756–0.834) | 0.759 (0.729–0.788) | 0.782 (0.749–0.815) | 0.687 (0.621–0.747) |
| GTV | 0.724 (0.660–0.783) | 0.613 (0.568–0.663) | 0.579 (0.529–0.635) | 0.806 (0.700–0.898) | 0.939 (0.915–0.961) | 0.834 (0.794–0.869) | 0.810 (0.762–0.857) | 0.905 (0.841–0.963) |
| PTV1 | 0.795 (0.737–0.846) | 0.710 (0.667–0.752) | 0.701 (0.653–0.742) | 0.758 (0.646–0.857) | 0.927 (0.908–0.944) | 0.844 (0.817–0.870) | 0.821 (0.789–0.851) | 0.914 (0.874–0.952) |
| PTV2 | 0.482 (0.403–0.559) | 0.788 (0.750–0.828) | 0.903 (0.874–0.932) | 0.143 (0.056–0.237) | 0.526 (0.455–0.591) | 0.334 (0.283–0.383) | 0.136 (0.095–0.181) | 0.942 (0.888–0.988) |
| GTV + PTV1 | 0.791 (0.734–0.845) | 0.824 (0.784–0.859) | 0.863 (0.826–0.900) | 0.578 (0.453–0.705) | 0.953 (0.935–0.972) | 0.883 (0.849–0.914) | 0.872 (0.832–0.909) | 0.919 (0.859–0.969) |
| GTV + PTV1 + PTV2 | 0.790 (0.733–0.841) | 0.751 (0.711–0.791) | 0.764 (0.720–0.807) | 0.678 (0.563–0.790) | 0.954 (0.936–0.973) | 0.888 (0.851–0.922) | 0.878 (0.835–0.915) | 0.919 (0.857–0.968) |
| Nomogram | 0.817 (0.756–0.870) | 0.788 (0.748–0.828) | 0.790 (0.749–0.831) | 0.774 (0.655–0.878) | 0.957 (0.934–0.976) | 0.886 (0.851–0.920) | 0.872 (0.827–0.914) | 0.930 (0.874–0.977) |
Data in parentheses are 95% CI. AUC, area under the curve; CI, confidence interval; GGNs, ground-glass nodules; GTV, gross tumor volume; PTV, peritumoral volume.
For invasiveness predictions, the ability of the five models to distinguish between non-invasive and invasive GGNs in the training and validation cohorts is summarized in Table 5. The DeLong test showed that the AUC values of the GTV + PTV1 radiomics model and the GTV + PTV1 + PTV2 radiomics model were not significantly different from that of the GTV radiomics model in either the training or validation cohort. There was no significant difference in AUC between the GTV + PTV1 + PTV2 radiomics model and the GTV + PTV1 radiomics model in either the training or validation cohort.
Radiomics nomogram construction and performance evaluation
For malignancy predictions, gender, spiculation, lobulation, vacuole, GTV-Radscore, PTV1-Radscore, and PTV2-Radscore were incorporated into the nomogram displayed in Figure 3A. The nomogram was constructed based on the training cohort. Calibration curves of the nomogram in both the training and validation cohorts are shown in Figure 3B,3C. The Hosmer-Lemeshow test showed that the nomogram fit well in both the training cohort (P=0.921) and the validation cohort (P=0.874). Figure 4A,4B presents the ROC curves of the clinical factor model, radiomics model, and comprehensive nomogram. The comprehensive nomogram yielded the best predictive performance, with AUC values of 0.870 and 0.817 in the training and validation cohorts, respectively. In addition, the DeLong test revealed that the AUC values of the comprehensive nomogram were significantly different from those of the clinical factor model, GTV radiomics model, PTV2 radiomics model, and GTV + PTV1 + PTV2 radiomics model in both the training and validation cohorts (P<0.05). The AUC value of the comprehensive nomogram was significantly different from that of the PTV1 radiomics model in the training cohort. The AUC value of the comprehensive nomogram was significantly different from that of the GTV + PTV1 radiomics model in the validation cohort. The DCA results for these discrimination models are shown in Figure 4C, indicating that the clinical factor model, GTV radiomics model, PTV1 radiomics model, GTV + PTV1 radiomics model, GTV + PTV1 + PTV2 radiomics model, and comprehensive nomogram had high clinical benefits, with the comprehensive nomogram achieving the highest overall net benefit in the prediction of malignancy of GGNs.
For invasiveness predictions, attenuation, maximal tumor diameter, spiculation, vacuole, GTV-Radscore, PTV1-Radscore, and PTV2-Radscore were integrated into the construction of the nomogram displayed in Figure 3D. The nomogram was constructed based on the training cohort. Calibration curves of the nomogram in both the training and validation cohorts are displayed in Figure 3E,3F. The Hosmer-Lemeshow test showed that the nomogram fit well in both the training cohort (P=0.254) and the validation cohort (P=0.125). Figure 4D,4E presents the ROC curves of the clinical factor model, radiomics model, and comprehensive nomogram. The comprehensive nomogram demonstrated optimal predictive performance with AUC values of 0.962 and 0.957 in the training and validation cohort, respectively. Furthermore, DeLong test showed that the AUC values of comprehensive nomogram were significantly different from those of the clinical factor model and PTV2 radiomics model in both cohorts (P<0.05). The AUC value of comprehensive nomogram was significantly different from that of the PTV1 radiomics model in the training cohort (P<0.05). The AUC value of the comprehensive nomogram was significantly different from the GTV radiomics model in the validation cohort (P<0.05). The DCA results for these discrimination models are shown in Figure 4F, demonstrating that the clinical factor model, GTV radiomics model, PTV1 radiomics model, GTV + PTV1 radiomics model, GTV + PTV1 + PTV2 radiomics model, and comprehensive nomogram had high clinical benefits, and the comprehensive nomogram obtained the highest overall net benefits in the prediction of invasiveness of GGNs.
Discussion
In the present study, we investigated the feasibility of using CT-based radiomics features extracted from intra- and peritumoral regions to predict the benign or malignant status of GGNs and the invasiveness of malignant GGNs, and we developed a quantitative diagnostic radiomics-clinical nomogram based on radiomics features and clinical risk factors. Our study demonstrated that a combination of intra- and peritumoral radiomics yielded better predictive performance than intratumoral radiomics alone for the clinical prediction of malignancy and invasiveness of GGNs in both the training and validation cohorts. Additionally, our findings suggest that the nomogram combining the intra- and peritumoral radiomics features with clinical indicators demonstrated outstanding performance in predicting the malignancy and invasiveness of GGNs and improved predictive accuracy compared with the radiomics or clinical factor models alone in both the training and validation cohorts.
The peritumoral microenvironment refers to the parenchymal tissue immediately surrounding the tumor, a dynamic region where tumor cells constantly interact with the surrounding environment. It plays a complex and crucial role in the initiation, progression, recurrence, and metastasis of the tumor (22,23). Therefore, gaining insight into the tumor-associated microenvironment and its association with radiological and pathological features is critical, yet it remains a difficult clinical challenge. The underlying heterogeneity of the microenvironment surrounding the lesion is difficult to appreciate on conventional CT images but can be quantified through peritumoral radiomic characteristics. CT-based radiomics, which quantitatively analyzes phenotypic characteristics of lesions by extracting high-throughput data from CT images, can noninvasively evaluate peritumoral microenvironment heterogeneity that cannot be detected by human observers (12,14,24). In this study, we extracted radiomic features from peritumoral volumes of 5 and 10 mm and subsequently constructed PTV1 and PTV2 radiomics models for predicting the malignancy and invasiveness of GGNs. To the best of our knowledge, no radiomics-based study has focused on peritumoral microenvironments for the prediction of malignancy of GGNs. For malignancy predictions, the PTV1 radiomics model exhibited satisfactory classification performance, with AUC values of 0.815 and 0.795 in the training and validation cohorts, respectively. In the training cohort, the DeLong test showed no significant difference in AUC between the PTV1 radiomics model and the GTV radiomics model. In the validation cohort, the DeLong test showed that the AUC of the PTV1 radiomics model differed significantly from that of the GTV radiomics model (P<0.05).
Nevertheless, the PTV2 radiomics model exhibited low predictive value for the malignancy of GGNs, with AUC values of 0.568 and 0.483 in the training and validation cohorts, respectively. For invasiveness prediction, the AUC values of the PTV1 radiomics model in the training and validation cohorts were 0.935 and 0.927, respectively. The DeLong test indicated no significant difference in AUC between the PTV1 radiomics model and the GTV radiomics model in either the training or the validation cohort. The AUC values of the PTV2 radiomics model in the training and validation cohorts were 0.540 and 0.252, respectively, demonstrating that the predictive performance of the PTV2 radiomics model alone was inferior to that of the PTV1 radiomics model. The low AUC of the PTV2 model (0.252 in the validation cohort for invasiveness prediction) is consistent with the reduced information content in the more distant peritumoral zone; this result does not imply any error in model development or validation. Additionally, the number of significant features extracted from the PTV1 peritumoral region (0–5 mm) was considerably greater than the number extracted from the PTV2 peritumoral region (5–10 mm). Furthermore, the ROC curves indicated that the GTV + PTV1 radiomics model and the GTV + PTV1 + PTV2 radiomics model had greater predictive value for the malignancy and invasiveness of GGNs than the GTV model alone in both the training and validation cohorts. These results indicate that the peritumoral region surrounding GGNs on CT images also contains substantial parenchymal information that reflects the difference in tissue texture heterogeneity between benign and malignant GGNs, as well as between non-invasive and invasive GGNs. Moreover, the closer the peritumoral area is to the intratumoral region, the more significant features it contains and thus the more useful information it provides.
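The two peritumoral zones compared above can be pictured as concentric shells around the segmented nodule. The following is a minimal sketch under stated assumptions, not the segmentation pipeline used in this study: it assumes a binary GTV mask on a voxel grid with known spacing and uses SciPy's Euclidean distance transform to label the 0–5 mm and 5–10 mm shells. The function name `peritumoral_shells`, the toy volume, and the 1 mm spacing are illustrative choices.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def peritumoral_shells(gtv_mask, spacing_mm, inner_mm=5.0, outer_mm=10.0):
    """Return boolean masks for the (0, inner] and (inner, outer] mm shells.

    gtv_mask   : 3D boolean array, True inside the segmented nodule.
    spacing_mm : voxel size along each axis, in millimetres.
    """
    gtv_mask = np.asarray(gtv_mask, dtype=bool)
    # Distance (in mm) from every voxel outside the nodule to the nearest
    # nodule voxel; voxels inside the nodule get a distance of 0.
    dist = distance_transform_edt(~gtv_mask, sampling=spacing_mm)
    ptv1 = (dist > 0) & (dist <= inner_mm)
    ptv2 = (dist > inner_mm) & (dist <= outer_mm)
    return ptv1, ptv2


# Toy example (hypothetical data): a 3x3x3-voxel "nodule" in a 1 mm grid.
gtv = np.zeros((31, 31, 31), dtype=bool)
gtv[14:17, 14:17, 14:17] = True
ptv1, ptv2 = peritumoral_shells(gtv, spacing_mm=(1.0, 1.0, 1.0))
print(ptv1.sum(), ptv2.sum())  # number of voxels in each shell
```

In a real workflow the shells would typically also be clipped to the lung parenchyma (for example, to exclude the chest wall); that step is omitted here.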
In this study, we found that the intratumoral radiomic feature Cluster Prominence had the potential to predict the malignancy of GGNs. Cluster Prominence measures the skewness and asymmetry of the gray-level co-occurrence matrix (GLCM). A higher value implies greater asymmetry about the mean, whereas a lower value indicates a peak near the mean and less variation about it. The peritumoral feature Dependence Non-Uniformity was the most significant predictor of the malignancy of GGNs. Dependence Non-Uniformity measures the similarity of dependence throughout the image, with a lower value indicating greater homogeneity among dependencies and a higher value indicating greater heterogeneity. For predicting the invasiveness of GGNs, we identified a highly informative intratumoral feature (Major Axis Length) and a highly informative peritumoral feature (Dependence Non-Uniformity). Major Axis Length represents the largest axis length of the tumor, akin to the well-established independent marker “the longest diameter of tumor”. In summary, quantitative analysis of intratumoral and peritumoral radiomics features can serve as a non-invasive method to reflect the biological behavior of GGNs, offering a foundation for predicting their malignancy and invasiveness and facilitating precision treatment.
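To make the Cluster Prominence feature concrete, the sketch below computes it from a normalized GLCM using the standard definition, the sum over gray-level pairs (i, j) of (i + j − μx − μy)^4 · p(i, j). It is an illustration only and not the feature-extraction software used in this study: it assumes a 2D image already discretized into integer gray levels and a single pixel offset, whereas radiomics toolkits typically aggregate over several directions in 3D. The function names are hypothetical.

```python
import numpy as np


def glcm_from_image(image, n_levels, offset=(0, 1)):
    """Symmetric, normalized GLCM for one pixel offset in a 2D integer image."""
    image = np.asarray(image)
    dr, dc = offset
    rows, cols = image.shape
    glcm = np.zeros((n_levels, n_levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r, c], image[r2, c2]
                glcm[i, j] += 1
                glcm[j, i] += 1  # count the pair in both directions
    return glcm / glcm.sum()


def cluster_prominence(p):
    """Sum over (i, j) of (i + j - mu_x - mu_y)^4 * p(i, j)."""
    levels = np.arange(1, p.shape[0] + 1)  # gray levels 1..Ng
    i, j = np.meshgrid(levels, levels, indexing="ij")
    mu_x = np.sum(i * p)
    mu_y = np.sum(j * p)
    return float(np.sum((i + j - mu_x - mu_y) ** 4 * p))


# Hypothetical 4-level patches: one uniform, one with two distinct regions.
flat = [[0, 0, 0, 0], [0, 0, 0, 0]]
patchy = [[0, 0, 3, 3], [0, 0, 3, 3]]
cp_flat = cluster_prominence(glcm_from_image(flat, n_levels=4))
cp_patchy = cluster_prominence(glcm_from_image(patchy, n_levels=4))
print(cp_flat, cp_patchy)
```

The uniform patch gives a value of 0 because all co-occurrences fall on a single gray-level pair, while the two-region patch gives a large positive value (54 for this toy input).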
Independent clinical predictors and Radscores of GTV, PTV1, and PTV2 were then integrated to develop individualized prediction nomograms. The comprehensive nomograms attained the highest AUC values in predicting the benign or malignant status of pulmonary GGNs and the invasiveness of malignant GGNs in both training and validation cohorts. Furthermore, the DCA results demonstrated that the comprehensive nomograms provided better net benefits than either the clinical or radiomics model alone in predicting the malignancy and invasiveness of GGNs. These findings suggest that the noninvasive and comprehensive nomogram, which fuses CT radiomics parameters, radiological imaging features, and clinical characteristics, can act as a robust tool for diagnosing GGNs and for individualized clinical decision-making.
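The net benefit reported by decision curve analysis can be illustrated with a short calculation. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability; it is not the analysis code used in this study, and the labels and predicted probabilities are hypothetical values chosen only to show the arithmetic.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # relative harm of a false positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Hypothetical labels (1 = malignant) and model-predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.1, 0.4, 0.3, 0.55]
for pt in (0.2, 0.5, 0.7):
    print(pt,
          round(net_benefit(labels, probs, pt), 3),
          round(net_benefit_treat_all(labels, pt), 3))
```

A decision curve plots these values across a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is always 0.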
From a clinical perspective, the proposed radiomics nomograms could assist radiologists and thoracic surgeons in non-invasively risk-stratifying GGNs. This may help to reduce unnecessary surgeries for benign or non-invasive lesions while identifying high-risk invasive cases that require more extensive resection or closer surveillance. By providing individualized probability outputs, the nomograms may facilitate shared decision-making, optimize surgical planning, and ultimately improve patient outcomes and healthcare resource utilization.
Nonetheless, our study has several limitations. First, it was retrospective and included only surgically resected GGNs, which could introduce selection bias. Second, the malignancy prediction task exhibited moderate class imbalance (benign vs. malignant lesions). Although precision-recall curves and PR-AUC were not generated in the present study, the reported AUC values, combined with sensitivity, specificity, and DCA, still provide a comprehensive assessment of model discrimination and clinical utility. Future studies may further evaluate precision-recall metrics to complement the current findings. Third, as a single-center study, it lacked an external validation cohort. Fourth, whole-volume ROI segmentation was conducted manually, potentially leading to sampling bias; future studies should explore a reliable and robust automatic segmentation method. Fifth, the sample size was relatively small, particularly for benign GGNs and the AAH subtype, and further studies with larger sample sizes are necessary. Sixth, model performance was evaluated using a single train-test split (7:3 ratio at the patient level). Although this approach is commonly adopted in radiomics studies, it may be susceptible to split-dependent bias and does not fully assess the stability and repeatability of the performance metrics. Future studies will incorporate repeated cross-validation or bootstrapping to further validate model robustness. Lastly, other clinical indicators, such as genetic markers and molecular biomarkers, were not considered in our study and should be analyzed in subsequent research.
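The bootstrapping mentioned above as future work can be sketched as follows. This is an illustrative example only, not an analysis performed in this study: it computes the AUC as the proportion of positive-negative pairs ranked correctly and derives a percentile confidence interval by resampling cases with replacement. The function names, the number of resamples, and the toy data are assumptions.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        estimates.append(auc(ys, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return auc(y_true, scores), lower, upper


# Hypothetical validation-set labels (1 = malignant) and model scores.
y = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.1, 0.4, 0.3, 0.55]
point, lower, upper = bootstrap_auc_ci(y, s, n_boot=500)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

With small cohorts the interval is wide, which is one way to quantify the split-dependent uncertainty described above.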
Conclusions
Despite certain limitations, the current study presents compelling evidence that the PTV1 radiomics model performs comparably to, or even better than, the GTV radiomics model in predicting the malignancy and invasiveness of GGNs. Additionally, the comprehensive nomogram we developed demonstrates enhanced predictive capability for the malignancy and invasiveness of GGNs, showing significant potential to offer clinical guidance for the individualized and precise treatment of patients with pulmonary GGNs.
Acknowledgments
We would like to thank Hanzhang Wang (GE Healthcare China) for his support in data analysis. We would like to thank Bing Zhang (Orient Securities) for her help in picture processing. A version of the abstract for this article has been previously presented at the 110th Radiological Society of North America Abstracts.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2026-1-0049/rc
Data Sharing Statement: Available at https://qims.amegroups.com/article/view/10.21037/qims-2026-1-0049/dss
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2026-1-0049/coif). M.Z. reports funding from the Health Research Program of Anhui (No. AHWJ2024Aa30418). Q.C. reports funding from the Kunshan Key Research and Development Program Project (No. KS2442). The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study was approved by the institutional review board of The First Affiliated Hospital of the University of Science and Technology of China (ethics approval number: 2025-RE-164) and individual consent for this retrospective analysis was waived.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, Gareen IF, Gatsonis C, Marcus PM, Sicks JD. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011;365:395-409.
2. Bach PB, Mirkin JN, Oliver TK, Azzoli CG, Berry DA, Brawley OW, Byers T, Colditz GA, Gould MK, Jett JR, Sabichi AL, Smith-Bindman R, Wood DE, Qaseem A, Detterbeck FC. Benefits and harms of CT screening for lung cancer: a systematic review. JAMA 2012;307:2418-29.
3. Lee HY, Lee KS. Ground-glass opacity nodules: histopathology, imaging evaluation, and clinical implications. J Thorac Imaging 2011;26:106-18.
4. Kim HY, Shim YM, Lee KS, Han J, Yi CA, Kim YK. Persistent pulmonary nodular ground-glass opacity at thin-section CT: histopathologic comparisons. Radiology 2007;245:267-75.
5. Park CM, Goo JM, Lee HJ, Lee CH, Chung DH, Chun EJ, Im JG. Focal interstitial fibrosis manifesting as nodular ground-glass opacity: thin-section CT findings. Eur Radiol 2007;17:2325-31.
6. Henschke CI, Yankelevitz DF, Mirtcheva R, McGuinness G, McCauley D, Miettinen OS. CT screening for lung cancer: frequency and significance of part-solid and nonsolid nodules. AJR Am J Roentgenol 2002;178:1053-7.
7. Li WJ, Lv FJ, Tan YW, Fu BJ, Chu ZG. Pulmonary Benign Ground-Glass Nodules: CT Features and Pathological Findings. Int J Gen Med 2021;14:581-90.
8. Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y, et al. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society international multidisciplinary classification of lung adenocarcinoma. J Thorac Oncol 2011;6:244-85.
9. Yamaguchi M, Furuya A, Edagawa M, Taguchi K, Shimamatsu S, Toyokawa G, Toyozawa R, Nosaki K, Hirai F, Seto T, Takenoyama M, Ichinose Y. How should we manage small focal pure ground-glass opacity nodules on high-resolution computed tomography? A single institute experience. Surg Oncol 2015;24:258-63.
10. Dembitzer FR, Flores RM, Parides MK, Beasley MB. Impact of histologic subtyping on outcome in lobar vs sublobar resections for lung cancer: a pilot study. Chest 2014;146:175-81.
11. Lee HW, Jin KN, Lee JK, Kim DK, Chung HS, Heo EY, Choi SH. Long-Term Follow-Up of Ground-Glass Nodules After 5 Years of Stability. J Thorac Oncol 2019;14:1370-7.
12. Avanzo M, Stancanello J, El Naqa I. Beyond imaging: The promise of radiomics. Phys Med 2017;38:122-39.
13. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, Sanduleanu S, Larue RTHM, Even AJG, Jochems A, van Wijk Y, Woodruff H, van Soest J, Lustberg T, Roelofs E, van Elmpt W, Dekker A, Mottaghy FM, Wildberger JE, Walsh S. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 2017;14:749-62.
14. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563-77.
15. Yang SX, Li M, Zhou LN, Hou DH, Zhang L, Wu N. Reproducibility of the CT radiomic features of pulmonary nodules: the effects of the CT reconstruction algorithm, radiation dose, and contrast agent. Quant Imaging Med Surg 2025;15:2309-18.
16. Prasanna P, Patel J, Partovi S, Madabhushi A, Tiwari P. Radiomic features from the peritumoral brain parenchyma on treatment-naïve multi-parametric MR imaging predict long versus short-term survival in glioblastoma multiforme: Preliminary findings. Eur Radiol 2017;27:4188-97.
17. Sun Q, Lin X, Zhao Y, Li L, Yan K, Liang D, Sun D, Li ZC. Deep Learning vs. Radiomics for Predicting Axillary Lymph Node Metastasis of Breast Cancer Using Ultrasound Images: Don’t Forget the Peritumoral Region. Front Oncol 2020;10:53.
18. Mittal V, El Rayes T, Narula N, McGraw TE, Altorki NK, Barcellos-Hoff MH. The Microenvironment of Lung Cancer and Therapeutic Implications. Adv Exp Med Biol 2016;890:75-110.
19. Zhong L, Shi L, Zhou L, Liu X, Gu L, Bai W. Development of a nomogram-based model combining intra- and peritumoral ultrasound radiomics with clinical features for differentiating benign from malignant in Breast Imaging Reporting and Data System category 3-5 nodules. Quant Imaging Med Surg 2023;13:6899-910.
20. Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited. Crit Care Med 2007;35:2052-6.
21. Van Calster B, Wynants L, Verbeek JFM, Verbakel JY, Christodoulou E, Vickers AJ, Roobol MJ, Steyerberg EW. Reporting and Interpreting Decision Curve Analysis: A Guide for Investigators. Eur Urol 2018;74:796-804.
22. Quail DF, Joyce JA. Microenvironmental regulation of tumor progression and metastasis. Nat Med 2013;19:1423-37.
23. Polyak K, Haviv I, Campbell IG. Co-evolution of tumor cells and their microenvironment. Trends Genet 2009;25:30-8.
24. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, Zegers CM, Gillies R, Boellard R, Dekker A, Aerts HJ. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012;48:441-6.

