Enhancing diagnostic accuracy of American College of Radiology TI-RADS 4 nodules: nomogram models based on MRI morphological features
Introduction
Accurate diagnosis of benign and malignant thyroid nodules remains challenging, even though the American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TI-RADS) offers a standardized method for risk assessment (1). In particular, ACR TI-RADS category 4 (ACR-TR4) nodules present a diagnostic challenge because of their indeterminate nature, leading to frequent unnecessary fine-needle aspiration (FNA) and missed cancers (2,3). Although FNA is the gold standard for preoperative thyroid cancer diagnosis, about 20–30% of these invasive procedures yield non-diagnostic or indeterminate results (4-6). Therefore, enhancing the diagnostic accuracy of benign and malignant ACR-TR4 thyroid nodules remains a critical challenge to minimize unnecessary FNAs and surgical interventions (2).
Prior research has primarily focused on enhancing the diagnostic accuracy of ACR-TR4 nodules using multimodal ultrasound techniques, such as conventional ultrasound (7), shear wave elastography (SWE) (3,8,9), superb microvascular imaging (SMI) (10), contrast-enhanced ultrasound (CEUS) (11-13), and artificial intelligence (AI) (14). These methods have been applied both individually and in combination. However, no technique reliably differentiates between benign and malignant ACR-TR4 nodules. The inherent subjectivity of ultrasound assessment and the limitations of these techniques invariably introduce discrepancies. The integration of conventional ultrasound, real-time elastography, and SMI has improved the sensitivity, specificity, and accuracy in diagnosing ACR-TR4 thyroid nodules compared to the use of each modality independently; however, false positives and false negatives have persisted (15).
Magnetic resonance imaging (MRI) has been instrumental in advancing radiological assessment standards, including Prostate Imaging Reporting and Data System (PI-RADS), Breast Imaging Reporting and Data System (BI-RADS), Vesical Imaging Reporting and Data System (VI-RADS), and Ovarian-Adnexal Reporting and Data System (O-RADS) (16-19). These developments have incorporated T2-weighted imaging (T2WI), dynamic contrast-enhanced MRI, and diffusion-weighted imaging (DWI) as the key components of multiparametric MRI (20). MRI is recommended for the evaluation of TI-RADS category 4 or higher thyroid nodules, suspected aggressive thyroid carcinoma, and large nodules to ascertain malignancy, cancer aggressiveness, and anatomical relationships. Although recent studies have applied multiparametric MRI to differentiate benign from malignant thyroid nodules and assess papillary thyroid carcinoma (21-26), research specifically exploring the diagnostic performance of multiparametric MRI for ACR-TR4 thyroid nodules has remained limited. Recognizing this gap, our study aimed to develop and validate nomogram models based on MRI morphological features specifically for ACR-TR4 thyroid nodules. This approach seeks to enhance diagnostic accuracy, reduce unnecessary FNA, and minimize missed cancers. We present this article in accordance with the TRIPOD reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-1427/rc).
Methods
Patients and study design
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This retrospective study was approved by the Institutional Review Board of Minhang Hospital, Fudan University (approval number: 2023-037-01K). The requirement for informed consent was waived due to the retrospective nature of the study design. We reviewed consecutive patients who underwent preoperative thyroid MRI and surgical thyroidectomy from January 2017 to December 2022 in Minhang Hospital, Fudan University, China. The study included thyroid nodules categorized as ACR-TR4, accompanied by conclusive postoperative pathological findings. The exclusion criteria were as follows: (I) patients who had undergone FNA or partial thyroidectomy prior to MRI; (II) patients with poor image quality, such as severe artefacts, rendering the images unsuitable for diagnostic analysis; and (III) nodules less than 5 mm in diameter. A total of 229 ACR-TR4 thyroid nodules were finally included and randomly divided into a training cohort (166 thyroid nodules) and a validation cohort (63 thyroid nodules) in a 7:3 ratio. The study flow diagram is shown in Figure 1.
ACR TI-RADS
The ultrasound images conforming to the ACR-TR4 criteria were assessed by two seasoned ultrasound specialists, each with more than 10 years of experience, who were blinded to the clinical and pathological data. Discrepancies in interpretation were resolved through consensus. The evaluation criteria encompassed composition, echogenicity, margin, shape, calcification, aspect ratio, extrathyroidal extension, and suspicious cervical lymph nodes, in accordance with the comprehensive ACR TI-RADS guidelines (27).
MRI acquisition and analysis
MRI examinations were performed with a 1.5T MRI scanner (EXCITE HD; GE Healthcare, Waukesha, WI, USA) equipped with a customized 8-channel neck coil from Chenguang Medical Technology Ltd. (Shanghai, China). The detailed MRI acquisition parameters are listed in Table S1. Two radiologists with five and nine years of experience in thyroid MRI independently reviewed the MRI images using the Advantage Workstation 4.5 (GE Healthcare, USA) and a picture archiving and communication system (PACS). The radiologists were blinded to the histopathological results. Discrepancies between their interpretations were resolved through consensus.
The assessment of lesions involved the following parameters: (I) size of the lesions, measured by the largest dimension, categorized into three groups: ≤1, 1–4, or ≥4 cm; (II) number of lesions, classified as either unifocal or multifocal; (III) location of the lesions, categorized into right lobe, left lobe, and isthmus; (IV) clinical parameters included age, sex, and Hashimoto’s thyroiditis. The qualitative MRI morphological features potentially associated with the benign and malignant nature of thyroid nodules were assessed: (I) hyperintense on T2WI, hypointense on T2WI, and hyperintense on T1-weighted imaging (T1WI); (II) restricted diffusion; (III) cystic degeneration; (IV) flow-void signal; (V) reversed halo sign in the delayed phase; (VI) pseudocapsule; (VII) fissure-filling enhancement; (VIII) wash-out pattern; (IX) hyperenhancement in the early phase; and (X) change of lesion size in multiphasic enhancement. The detailed definitions and diagrams of MRI morphological features are provided in Supplementary file (Appendix 1).
Model establishment
Univariate logistic regression analysis was performed to identify potentially significant predictive factors, followed by multivariable stepwise logistic regression analysis. For feature selection, the least absolute shrinkage and selection operator (LASSO) method was utilized. Data preprocessing involved converting continuous variables among the feature parameters into binary variables using the optimal cut-off values determined by the maximum Youden index. This transformation helped simplify the model while maintaining the critical discriminatory information. Additionally, features with a coefficient value of zero were excluded to avoid unnecessary complexity and improve model robustness. We employed 10-fold cross-validation to determine the optimal λ value. When features are highly correlated, LASSO tends to select one representative feature from the correlated group and shrink the coefficients of the others to zero. Based on these analyses, a nomogram model was developed to predict the risk of malignancy in ACR-TR4 thyroid nodules in the training cohort.
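As an illustration of the dichotomization step, the following minimal Python sketch selects the cut-off that maximizes the Youden index, J = sensitivity + specificity − 1. It is not the study's code: the function name `youden_cutoff`, the rule that a value at or above the cut-off counts as positive, and the eight data points are all hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a case is called positive when value >= cutoff.
    labels: 1 = malignant, 0 = benign.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        # True positives: malignant cases at or above the candidate cut-off.
        tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        # True negatives: benign cases below the candidate cut-off.
        tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # strict '>' keeps the lowest cut-off on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical continuous feature, not data from the study.
values = [0.8, 1.1, 1.3, 1.6, 1.9, 2.2, 2.4, 2.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(values, labels)
print(cutoff, j)
```

For a variable such as age, where lower values were associated with malignancy in this cohort, the comparison direction would need to be reversed.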
Subsequently, improved models were developed to facilitate the practical application of nomograms in clinical settings. The models considered various combinations of independent predictors, either individually or collectively. For ACR-TR4 nodules, FNA was recommended when the criteria of the improved model were satisfied; otherwise, it was not. The area under the curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were calculated for each improved model. Furthermore, the rates of unnecessary FNA and missed cancer were compared between the improved models and ACR TI-RADS.
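The decision logic of the improved models can be sketched in a few lines of Python. The function name `recommend_fna` and the model keys are hypothetical labels introduced here for illustration; the four rules themselves (A, B, A and B, A or B) follow the definitions reported in the Results.

```python
def recommend_fna(restricted_diffusion: bool, reversed_halo: bool, model: str) -> bool:
    """Return True when the chosen improved model recommends FNA.

    A = restricted diffusion; B = reversed halo sign in the delayed phase.
    The model keys are illustrative names, not identifiers from the study.
    """
    rules = {
        "A": restricted_diffusion,
        "B": reversed_halo,
        "A_and_B": restricted_diffusion and reversed_halo,  # combined model 1
        "A_or_B": restricted_diffusion or reversed_halo,    # combined model 2
    }
    return rules[model]


# A nodule with restricted diffusion but no reversed halo sign:
print(recommend_fna(True, False, "A_and_B"))  # combined model 1
print(recommend_fna(True, False, "A_or_B"))   # combined model 2
```

The two combined rules trade off in opposite directions: requiring both features favors specificity, whereas accepting either feature favors sensitivity.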
Statistical analysis
All statistical analyses were performed with the software SPSS 26.0 (IBM Corp., Armonk, NY, USA) and R software 4.2.0 (http://www.r-project.org). The variable age was compared using an independent sample t-test and presented as mean ± standard deviation (SD), whereas MRI morphological features were assessed using the chi-square test or Fisher’s exact test and reported as frequencies and percentages. The Cohen’s kappa test was used to compare the concordance between the two radiologists. The nomogram was constructed using the “rms” package in R. The Hosmer-Lemeshow test was used to assess the model’s goodness-of-fit, with P≥0.05 indicating a good fit. Receiver operating characteristic (ROC) analysis, calibration curve analysis, and decision curve analysis (DCA) were conducted to evaluate the performance of the nomogram. Statistical tests were performed with two-tailed P values, and P<0.05 was deemed statistically significant.
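For readers unfamiliar with DCA, the quantity plotted is the net benefit, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below is illustrative only: the threshold of 0.2 is an arbitrary assumption, and the counts (82 true positives and 25 false positives among 229 nodules, 89 of them malignant) are those implied by Table 5 for combined model 2 rather than values taken from the published decision curve.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of a decision rule at a given threshold probability."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(n_pos: int, n: int, threshold: float) -> float:
    """Net benefit of the 'biopsy every nodule' reference strategy."""
    return net_benefit(n_pos, n - n_pos, n, threshold)


PT = 0.2  # hypothetical threshold probability, chosen only for illustration
nb_rule = net_benefit(tp=82, fp=25, n=229, threshold=PT)
nb_all = net_benefit_treat_all(n_pos=89, n=229, threshold=PT)
print(round(nb_rule, 3), round(nb_all, 3))
```

A rule is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.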
Results
Clinicopathological characteristics
The study finally included 229 ACR-TR4 thyroid nodules (140 benign and 89 malignant) from 184 patients; 46 nodules were from male patients and 183 from female patients, and the mean age was 51.2±13.5 years. These were divided into two cohorts: 166 nodules (100 benign and 66 malignant) in the training cohort and 63 nodules (40 benign and 23 malignant) in the validation cohort. The pathologic results of patients in the training and validation cohorts are shown in Table 1.
Table 1
Pathological patterns | Total (n=229) | Training cohort (n=166) | Validation cohort (n=63) |
---|---|---|---|
Benign | 140 (61.1) | 100 (60.2) | 40 (63.5) |
Nodular goiter | 82 (35.8) | 58 (34.9) | 24 (38.1) |
Adenomatous goiter | 23 (10.0) | 16 (9.6) | 7 (11.1) |
Follicular thyroid adenoma | 19 (8.3) | 14 (8.4) | 5 (7.9) |
Nodular Hashimoto’s thyroiditis | 8 (3.5) | 7 (4.2) | 1 (1.6) |
Subacute thyroiditis | 8 (3.5) | 5 (3.0) | 3 (4.8) |
Malignant | 89 (38.9) | 66 (39.8) | 23 (36.5) |
Papillary thyroid carcinoma | 74 (32.3) | 56 (33.7) | 18 (28.6) |
Follicular thyroid carcinoma | 11 (4.8) | 6 (3.6) | 5 (7.9) |
Medullary thyroid carcinoma | 2 (0.9) | 2 (1.2) | 0 (0.0) |
Undifferentiated carcinoma | 2 (0.9) | 2 (1.2) | 0 (0.0) |
Data are presented as n (%).
Basic clinical information and MRI qualitative features
Table 2 presents the basic clinical information and MRI qualitative features of thyroid nodules in the training and validation cohorts. In the training cohort, variables such as age, the number of nodules, and most MRI morphological features—excluding the flow-void signal—showed significant differences between the benign and malignant nodules (P<0.05). In the validation cohort, sex and several MRI qualitative features, including hyperintense on T2WI, hypointense on T2WI, restricted diffusion, flow-void signal, reversed halo sign in the delayed phase, pseudocapsule, and wash-out pattern, showed significant differences between the benign and malignant nodules (P<0.05). There were no significant differences in the basic clinical information and MRI qualitative features between the training and validation cohorts (P>0.05).
Table 2
Variables | Training cohort (n=166), benign | Training cohort (n=166), malignant | Training cohort, P value | Validation cohort (n=63), benign | Validation cohort (n=63), malignant | Validation cohort, P value | Total (n=229) | P value# |
---|---|---|---|---|---|---|---|---|
Age (years) | 54.3±13.4 | 47.0±13.9 | 0.001* | 52.8±11.2 | 46.7±13.6 | 0.056 | 51.2±13.5 | 0.667 |
Sex | | | 0.274 | | | 0.002* | | 0.619 |
Male | 22 (22.0) | 10 (15.2) | | 4 (10.0) | 10 (43.5) | | 46 (20.1) | |
Female | 78 (78.0) | 56 (84.8) | | 36 (90.0) | 13 (56.5) | | 183 (79.9) | |
Number of lesions | | | 0.019* | | | 0.050 | | 0.137 |
Unifocal | 20 (20.0) | 24 (36.4) | | 11 (27.5) | 12 (52.2) | | 67 (29.3) | |
Multifocal | 80 (80.0) | 42 (63.6) | | 29 (72.5) | 11 (47.8) | | 162 (70.7) | |
Location of the lesions | | | 0.442 | | | 0.108 | | 0.119 |
Left lobe | 54 (54.0) | 30 (45.5) | | 18 (45.0) | 5 (21.7) | | 107 (46.7) | |
Right lobe | 40 (40.0) | 33 (50.0) | | 21 (52.5) | 16 (69.6) | | 110 (48.0) | |
Isthmus | 6 (6.0) | 3 (4.5) | | 1 (2.5) | 2 (8.7) | | 12 (5.2) | |
Size of the lesions | | | 0.127 | | | 0.938 | | 0.257 |
≤1 cm | 30 (30.0) | 30 (45.5) | | 12 (30.0) | 6 (26.1) | | 78 (34.1) | |
1–4 cm | 59 (59.0) | 30 (45.5) | | 21 (52.5) | 13 (56.5) | | 123 (53.7) | |
≥4 cm | 11 (11.0) | 6 (9.0) | | 7 (17.5) | 4 (17.4) | | 28 (12.2) | |
Hashimoto’s thyroiditis | | | 0.429 | | | 0.185 | | 0.857 |
Absent | 85 (85.0) | 53 (80.3) | | 36 (90.0) | 17 (73.9) | | 191 (83.4) | |
Present | 15 (15.0) | 13 (19.7) | | 4 (10.0) | 6 (26.1) | | 38 (16.6) | |
Hyperintense on T2WI | | | <0.001* | | | 0.021* | | 0.057 |
Absent | 27 (27.0) | 46 (69.7) | | 8 (20.0) | 11 (47.8) | | 92 (40.2) | |
Present | 73 (73.0) | 20 (30.3) | | 32 (80.0) | 12 (52.2) | | 137 (59.8) | |
Hyperintense on T1WI | | | 0.001* | | | 1.000 | | 0.741 |
Absent | 77 (77.0) | 63 (95.5) | | 33 (82.5) | 19 (82.6) | | 192 (83.8) | |
Present | 23 (23.0) | 3 (4.5) | | 7 (17.5) | 4 (17.4) | | 37 (16.2) | |
Hypointense on T2WI | | | <0.001* | | | 0.030* | | 0.125 |
Absent | 83 (83.0) | 26 (39.4) | | 34 (85.0) | 14 (60.9) | | 157 (68.6) | |
Present | 17 (17.0) | 40 (60.6) | | 6 (15.0) | 9 (39.1) | | 72 (31.4) | |
Restricted diffusion | | | <0.001* | | | <0.001* | | 0.818 |
Absent | 87 (87.0) | 13 (19.7) | | 34 (85.0) | 5 (21.7) | | 139 (60.7) | |
Present | 13 (13.0) | 53 (80.3) | | 6 (15.0) | 18 (78.3) | | 90 (39.3) | |
Cystic degeneration | | | 0.042* | | | 0.966 | | 0.804 |
Absent | 88 (88.0) | 64 (97.0) | | 38 (95.0) | 21 (91.3) | | 211 (92.1) | |
Present | 12 (12.0) | 2 (3.0) | | 2 (5.0) | 2 (8.7) | | 18 (7.9) | |
Flow-void signal | | | 0.921 | | | 0.040* | | 0.678 |
Absent | 92 (92.0) | 61 (92.4) | | 39 (97.5) | 18 (78.3) | | 210 (91.7) | |
Present | 8 (8.0) | 5 (7.6) | | 1 (2.5) | 5 (21.7) | | 19 (8.3) | |
Reversed halo sign in the delayed phase | | | <0.001* | | | <0.001* | | 0.494 |
Absent | 94 (94.0) | 14 (21.2) | | 37 (92.5) | 7 (30.4) | | 152 (66.4) | |
Present | 6 (6.0) | 52 (78.8) | | 3 (7.5) | 16 (69.6) | | 77 (33.6) | |
Pseudocapsule | | | <0.001* | | | 0.017* | | 0.288 |
Absent | 48 (48.0) | 62 (93.9) | | 19 (47.5) | 18 (78.3) | | 147 (64.2) | |
Present | 52 (52.0) | 4 (6.1) | | 21 (52.5) | 5 (21.7) | | 82 (35.8) | |
Fissure-filling enhancement | | | 0.014* | | | 1.000 | | 0.254 |
Absent | 88 (88.0) | 65 (98.5) | | 35 (87.5) | 20 (87.0) | | 208 (90.8) | |
Present | 12 (12.0) | 1 (1.5) | | 5 (12.5) | 3 (13.0) | | 21 (9.2) | |
Wash-out pattern | | | 0.004* | | | 0.008* | | 0.748 |
Absent | 59 (59.0) | 24 (36.4) | | 26 (65.0) | 7 (30.4) | | 116 (50.7) | |
Present | 41 (41.0) | 42 (63.6) | | 14 (35.0) | 16 (69.6) | | 113 (49.3) | |
Hyperenhancement in the early phase | | | 0.006* | | | 1.000 | | 0.580 |
Absent | 76 (76.0) | 61 (92.4) | | 32 (80.0) | 18 (78.3) | | 187 (81.7) | |
Present | 24 (24.0) | 5 (7.6) | | 8 (20.0) | 5 (21.7) | | 42 (18.3) | |
Change of lesion size in multiphasic enhancement | | | <0.001* | | | 0.063 | | 0.330 |
Absent | 50 (50.0) | 7 (10.6) | | 20 (50.0) | 6 (26.1) | | 83 (36.2) | |
Present | 50 (50.0) | 59 (89.4) | | 20 (50.0) | 17 (73.9) | | 146 (63.8) | |
Data are presented as mean ± SD or n (%). *, P<0.05; #, the P values representing the differences between the training and validation cohorts. MRI, magnetic resonance imaging; T2WI, T2-weighted imaging; T1WI, T1-weighted imaging; SD, standard deviation.
Interobserver agreement of MRI qualitative features and ACR-TR4
The kappa values for interobserver agreement for all MRI qualitative features were from 0.707 to 0.984 (Table 3). Out of 229 thyroid nodules classified as ACR-TR4, two experienced ultrasound specialists agreed on 180 cases, achieving an interobserver agreement of 0.786. There were 20 disagreements between categories 3 and 4, and 29 between categories 4 and 5.
Table 3
MRI morphological features | Radiologist 1 | Radiologist 2 | Kappa |
---|---|---|---|
Hyperintense on T2WI | | | 0.945 |
Absent | 92 (40.2) | 90 (39.3) | |
Present | 137 (59.8) | 139 (60.7) | |
Hyperintense on T1WI | | | 0.984 |
Absent | 192 (83.8) | 191 (83.4) | |
Present | 37 (16.2) | 38 (16.6) | |
Hypointense on T2WI | | | 0.707 |
Absent | 157 (68.6) | 129 (56.3) | |
Present | 72 (31.4) | 100 (43.7) | |
Restricted diffusion | | | 0.926 |
Absent | 139 (60.7) | 145 (63.3) | |
Present | 90 (39.3) | 84 (36.7) | |
Cystic degeneration | | | 0.940 |
Absent | 211 (92.1) | 211 (92.1) | |
Present | 18 (7.9) | 18 (7.9) | |
Flow-void signal | | | 0.895 |
Absent | 210 (91.7) | 206 (90.0) | |
Present | 19 (8.3) | 23 (10.0) | |
Reversed halo sign in the delayed phase | | | 0.941 |
Absent | 152 (66.4) | 152 (66.4) | |
Present | 77 (33.6) | 77 (33.6) | |
Pseudocapsule | | | 0.934 |
Absent | 147 (64.2) | 144 (62.9) | |
Present | 82 (35.8) | 85 (37.1) | |
Fissure-filling enhancement | | | 0.974 |
Absent | 207 (90.4) | 208 (90.8) | |
Present | 22 (9.6) | 21 (9.2) | |
Wash-out pattern | | | 0.904 |
Absent | 116 (50.7) | 119 (52.0) | |
Present | 113 (49.3) | 110 (48.0) | |
Hyperenhancement in the early phase | | | 0.870 |
Absent | 186 (81.2) | 187 (81.7) | |
Present | 43 (18.8) | 42 (18.3) | |
Change of lesion size in multiphasic enhancement | | | 0.845 |
Absent | 83 (36.2) | 98 (42.8) | |
Present | 146 (63.8) | 131 (57.2) | |
Data are presented as n (%). MRI, magnetic resonance imaging; T2WI, T2-weighted imaging; T1WI, T1-weighted imaging.
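To make the kappa statistic concrete, the sketch below computes Cohen's kappa for a 2×2 agreement table. Table 3 reports only each reader's marginal counts, so the individual cell counts used here for restricted diffusion (83 nodules rated present by both readers, 7 by radiologist 1 only, 1 by radiologist 2 only, 138 rated absent by both) are an assumption, chosen because they are consistent with the reported marginals (90 and 84 present) and reproduce the reported kappa. The function name `cohen_kappa_2x2` is likewise hypothetical.

```python
def cohen_kappa_2x2(both_present: int, r1_only: int, r2_only: int, both_absent: int) -> float:
    """Cohen's kappa for two readers rating a binary feature."""
    n = both_present + r1_only + r2_only + both_absent
    p_observed = (both_present + both_absent) / n
    r1_present = both_present + r1_only
    r2_present = both_present + r2_only
    # Agreement expected by chance, from each reader's marginal rates.
    p_expected = (r1_present * r2_present
                  + (n - r1_present) * (n - r2_present)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Assumed cell counts for restricted diffusion (see the note above).
kappa = cohen_kappa_2x2(both_present=83, r1_only=7, r2_only=1, both_absent=138)
print(round(kappa, 3))
```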
Univariate and multivariable logistic regression analysis
The results of univariate and multivariable analyses of basic clinical information and MRI qualitative features related to malignant ACR-TR4 nodules in the training cohort are provided in Table 4. In the multivariable logistic regression analysis, diffusion restriction [odds ratio (OR) =12.722, P<0.001] and reversed halo sign in the delayed phase (OR =30.274, P<0.001) were identified as independent predictors of malignant ACR-TR4 nodules.
Table 4
Variables | Univariate OR (95% CI) | Univariate P value | Multivariable OR (95% CI) | Multivariable P value |
---|---|---|---|---|
Male | 0.633 (0.278–1.441) | 0.276 | | |
Age | 0.962 (0.939–0.985) | 0.001* | | |
Unifocal | 0.438 (0.217–0.882) | 0.021* | | |
Tumor size | 0.639 (0.385–1.063) | 0.084 | | |
Hyperintense on T2WI | 0.161 (0.081–0.319) | <0.001* | | |
Hyperintense on T1WI | 0.159 (0.046–0.556) | 0.004* | | |
Hypointense on T2WI | 7.511 (3.662–15.406) | <0.001* | | |
Restricted diffusion | 27.284 (11.765–63.276) | <0.001* | 12.722 (4.475–36.173) | <0.001* |
Reversed halo sign in the delayed phase | 58.190 (21.097–160.501) | <0.001* | 30.274 (9.844–93.106) | <0.001* |
Pseudocapsule | 0.060 (0.020–0.176) | <0.001* | | |
Fissure-filling enhancement | 0.113 (0.014–0.890) | 0.038* | | |
Cystic degeneration | 0.229 (0.050–1.060) | 0.059 | | |
Flow-void signal | 0.943 (0.295–3.017) | 0.921 | | |
Wash-out pattern | 2.518 (1.327–4.779) | 0.005* | | |
Hyperenhancement in the early phase | 0.260 (0.094–0.720) | 0.010* | | |
Change of lesion size in multiphasic enhancement | 8.429 (3.510–20.241) | <0.001* | | |
*, P<0.05. OR, odds ratio; CI, confidence interval; T2WI, T2-weighted imaging; T1WI, T1-weighted imaging.
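For a single binary predictor, the univariate odds ratio and its Wald confidence interval can be recovered directly from the 2×2 counts. The sketch below applies this to the training-cohort counts in Table 2 for the two features retained in the multivariable model. The function name `odds_ratio_wald` and the use of z = 1.96 are assumptions of this illustration; the study itself fitted logistic regression models in SPSS and R.

```python
import math


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Wald 95% CI from a 2x2 table.

    a: malignant, feature present    b: benign, feature present
    c: malignant, feature absent     d: benign, feature absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Training-cohort counts taken from Table 2.
rd_or, rd_lo, rd_hi = odds_ratio_wald(a=53, b=13, c=13, d=87)   # restricted diffusion
rh_or, rh_lo, rh_hi = odds_ratio_wald(a=52, b=6, c=14, d=94)    # reversed halo sign
print(round(rd_or, 3), round(rd_lo, 3), round(rd_hi, 3))
print(round(rh_or, 3), round(rh_lo, 3), round(rh_hi, 3))
```

The resulting estimates agree with the univariate column of Table 4 to within rounding.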
Development and validation of the nomogram
In the training cohort, diffusion restriction and the reversed halo sign in the delayed phase were selected as key predictive variables through LASSO logistic regression (Figure 2) and were subsequently incorporated into the nomogram for predicting malignant thyroid nodules, as depicted in Figure 3. Figure 4 illustrates the nomogram’s ability to differentiate between benign and malignant ACR TI-RADS category 4 nodules. The AUC of the nomogram in the training and validation cohorts was 0.928 (95% CI: 0.887–0.970) and 0.904 (95% CI: 0.825–0.984), respectively. The calibration curve and Hosmer-Lemeshow test statistic (P=0.983 and 0.936) demonstrated excellent calibration. Furthermore, DCA indicated a larger overall net benefit for the nomogram.
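As a rough guide to how a two-predictor nomogram weights its inputs, the sketch below converts the multivariable odds ratios from Table 4 into log-odds coefficients and rescales them so that the strongest predictor contributes 100 points. That scaling is a common nomogram convention and is assumed here; the published Figure 3 may use different axis values. The model intercept is not reported, so absolute malignancy probabilities cannot be reproduced from these numbers.

```python
import math

# Multivariable odds ratios reported in Table 4.
odds_ratios = {
    "restricted_diffusion": 12.722,
    "reversed_halo_delayed_phase": 30.274,
}

# Logistic regression coefficients are the natural logs of the odds ratios.
betas = {name: math.log(value) for name, value in odds_ratios.items()}

# Assumed convention: the largest coefficient maps to 100 points.
max_beta = max(betas.values())
points = {name: 100 * beta / max_beta for name, beta in betas.items()}

for name in odds_ratios:
    print(name, round(betas[name], 3), round(points[name], 1))
```

Under this convention, the reversed halo sign carries roughly one third more weight than restricted diffusion.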
Diagnostic performance of improved models
To enhance clinical utility, we developed four improved models based on the nomogram, including diffusion restriction (A), reversed halo sign in the delayed phase (B), combined model 1 (A and B, ACR-TR4 nodules were deemed malignant only if both A and B were met; otherwise, they were considered benign), and combined model 2 (A or B, ACR-TR4 nodules were deemed benign only if both A and B were not met; otherwise, they were considered malignant). The AUCs for these models were 0.831 (95% CI: 0.772–0.890), 0.850 (95% CI: 0.792–0.908), 0.810 (95% CI: 0.745–0.874), and 0.871 (95% CI: 0.822–0.921), respectively. The diagnostic performances of these models are provided in Table 5.
Table 5
Models | AUC | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | NPV (%) | UFNA rate (%) | Missed cancer rate (%) |
---|---|---|---|---|---|---|---|---|
Restricted diffusion (A) | 0.831 | 79.8 | 86.4 | 83.8 | 78.9 | 87.1 | 21.1 (19/90) | 12.9 (18/139) |
Reversed halo sign in delayed phase (B) | 0.850 | 76.4 | 93.6 | 86.9 | 88.3 | 86.2 | 11.7 (9/77) | 13.8 (21/152) |
Combined model 1 (A and B) | 0.810 | 64.0 | 97.9 | 84.7 | 95.0 | 81.1 | 5 (3/60) | 18.9 (32/169) |
Combined model 2 (A or B) | 0.871 | 92.1 | 82.1 | 86.0 | 76.6 | 94.3 | 23.4 (25/107) | 5.7 (7/122) |
ACR TI-RADS Category 4 | NA | NA | NA | NA | NA | NA | 64 (87/136) | 43 (40/93) |
ACR TI-RADS, American College of Radiology Thyroid Imaging Reporting and Data System; AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value; UFNA, unnecessary fine-needle aspiration; NA, not applicable.
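The entries in Table 5 follow from a single confusion matrix per model. The sketch below recomputes them for combined model 2 (A or B), using counts implied by the table's own fractions: 82 true positives and 25 false positives among 107 positive calls, and 7 false negatives and 115 true negatives among 122 negative calls. Two points are inferences from those fractions rather than stated definitions: the unnecessary FNA rate is taken as false positives divided by all positive calls, and the missed cancer rate as false negatives divided by all negative calls. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic performance of a binary rule from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # For a rule with one operating point, the ROC AUC reduces to
        # the mean of sensitivity and specificity.
        "auc": (sensitivity + specificity) / 2,
        "unnecessary_fna_rate": fp / (tp + fp),
        "missed_cancer_rate": fn / (tn + fn),
    }


# Combined model 2 (A or B), counts implied by Table 5.
m = diagnostic_metrics(tp=82, fp=25, fn=7, tn=115)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```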
For predicting malignant ACR-TR4 thyroid nodules, the combined model 2 (A or B) achieved the highest sensitivity at 92.1%, the combined model 1 (A and B) achieved the highest specificity at 97.9%, and the reversed halo sign in the delayed phase (B) showed the highest accuracy at 86.9%. Representative ultrasound and MRI images illustrating these findings are shown in Figure 5.
Unnecessary FNA and missed cancer rates
The rates of unnecessary FNA and missed cancer for the ACR-TR4 and four improved models are presented in Table 5. The combined model 1 (A and B) exhibited the lowest unnecessary FNA rate at 5%. With this model, the three cases of unnecessary FNA included one follicular thyroid adenoma, one case of subacute thyroiditis, and one adenomatous goiter. The combined model 2 (A or B) demonstrated the lowest missed cancer rate at 5.7% and maintained a relatively low unnecessary FNA rate of 23.4%, significantly better than those observed with the ACR-TR4 model (43% and 64%, respectively). In the combined model 2 (A or B), of the seven missed cancer cases, five were follicular thyroid carcinoma and two were papillary thyroid carcinoma. The diagnostic performance of various models within the training and validation cohorts is detailed in Table S2.
Discussion
We identified diffusion restriction and reversed halo sign in the delayed phase as independent predictors of malignancy of ACR-TR4 thyroid nodules. The nomogram we developed, which incorporated these two MRI features, demonstrated superior diagnostic performance, achieving an AUC of 0.928 and 0.904 in the training and validation cohorts, respectively. By employing the combined model 2, which required either restricted diffusion or the reversed halo sign in the delayed phase, we achieved the lowest missed cancer rate (5.7%) in ACR-TR4 thyroid nodules, along with a significantly reduced unnecessary FNA rate (23.4%).
TI-RADSs have become increasingly integral to the diagnosis of thyroid nodules. Among the different risk stratification systems, ACR TI-RADS stood out in ultrasound-based diagnosis with a pooled sensitivity of 0.89 and a specificity of 0.70 (28,29). In recent years, an increasing use of multimodal ultrasound imaging has enhanced the diagnostic efficacy of ACR-TR4 nodules. For instance, Gong et al. (11) enhanced diagnostic accuracy to 83.78% by integrating AI with CEUS. Lai et al. (30) achieved an AUC of 0.880 using an AI algorithm, whereas Li et al. (9) reached an AUC of 0.890 by combining CEUS and SWE. Furthermore, Zhang et al. (10) reported an AUC of 0.910 with CEUS alone after exploring the efficacy of color Doppler imaging, CEUS, or SMI. Building on this, our study further advanced the research by employing a nomogram with two MRI features to distinguish between malignant and benign ACR-TR4 nodules, demonstrating a strong predictive capacity with an AUC of 0.904 in the validation cohort.
The suboptimal accuracy of thyroid cancer diagnosis has increased unnecessary FNA procedures, which are invasive. Notably, Bethesda categories III and IV, which account for approximately 20–30% of all FNAs, yield indeterminate results and typically necessitate further evaluation (31). Given the varying malignancy risk (5–20%) associated with ACR-TR4 nodules, unnecessary FNAs appear almost inevitable. Yoon et al. (32) reported an unnecessary FNA rate of 28% under the existing ACR TI-RADS framework. Risk stratification systems for thyroid nodules based on ultrasound often suffer from low specificity and poor interobserver agreement. Enhancing these systems is crucial to reduce unnecessary FNAs. Li et al. (33) improved ACR TI-RADS by increasing the FNA threshold for ACR-TR4 to 2.5 cm, increasing the specificity to 73% and reducing the unnecessary FNA rate to 25%. Moreover, recent modifications that included category 5 nodules smaller than 1.0 cm in the FNA criteria further reduced the unnecessary FNA rate to 17.9% (34). To integrate our nomogram into clinical decision-making effectively, we utilized four improved models that substantially decreased the incidence of unnecessary FNAs and missed cancer diagnoses compared to the ACR-TR4 system alone. Our results showed that the combined model 1 (A and B) achieved the lowest unnecessary FNA rate at 5%, whereas the combined model 2 (A or B) yielded the lowest missed cancer rate of 5.7%.
The study revealed a 64% unnecessary FNA rate for ACR-TR4, along with a 43% missed cancer rate, potentially attributed to inter-operator variability and sample inconsistencies. Among the study cohort, 23.1% were diagnosed with follicular thyroid neoplasm (FTN). Lin et al. (35) emphasized the limitations of various TI-RADS in managing patients with FTN, resulting in an unnecessary FNA rate ranging from 65.3% to 93.1%. In our combined model 2 (A or B), 5 out of 7 missed cases were identified as follicular thyroid carcinoma. Future studies should investigate improvements for FTN and non-FTN in ACR-TR4 nodules.
Our study found that restricted diffusion, marked by high interobserver agreement (kappa value =0.926), effectively differentiated benign ACR-TR4 nodules from malignant ones with a specificity of 86.4%. DWI is a valuable tool for distinguishing between benign and malignant thyroid nodules (36). Restricted diffusion is defined by the presence of a solid component within the lesion, manifested as hyperintensity on DWI and hypointensity on the apparent diffusion coefficient (ADC) map. This pattern is typical of malignant tumors, whose dense cellular structure impedes the movement of water molecules (37).
Furthermore, the reversed halo sign in the delayed phase was identified as a robust independent predictor of malignancy, with an OR of 30.274 and a specificity of 93.6%. The sign is characterized by a wash-out pattern in the central portion of the lesion, continuous enhancement in the periphery during the delayed phase, and a blurred border, indicating active proliferation of neoplastic cells centrally and abundant tumor stroma peripherally, leading to sustained enhancement. We also noted a good interobserver agreement between the radiologists regarding this sign (kappa value =0.941).
Our study has several limitations. Firstly, as a single-center retrospective investigation, our results might be subject to selection bias. Secondly, the study did not include nodules smaller than 5 mm because of the spatial resolution limitations of MRI. Thirdly, the qualitative parameters we used are inherently subjective. However, unlike quantitative parameters, they are less affected by factors such as equipment type, imaging parameters, and measurement methods, and they are more practical in clinical settings. Fourthly, the routine integration of MRI into the sonographic evaluation of thyroid nodules may significantly increase the overall cost of assessment. However, this strategy could potentially decrease the frequency of unnecessary surgical procedures, which might offset the increased expenses. For now, the extent of this potential cost offset remains uncertain. Finally, the absence of an independent external test set in this study underscores the need for incorporating multi-center data to enhance the validity of the MRI-based diagnostic models.
Conclusions
Our study demonstrated that integrating MRI-based imaging features into our nomogram substantially enhanced the diagnostic accuracy for distinguishing between benign and malignant ACR-TR4 thyroid nodules. In predicting malignant ACR-TR4 thyroid nodules, combined model 1, which requires both restricted diffusion and the reversed halo sign in the delayed phase, demonstrated the highest specificity. Conversely, combined model 2, which requires either restricted diffusion or the reversed halo sign in the delayed phase, exhibited the highest sensitivity. These improved models show potential for reducing unnecessary FNA procedures while simultaneously minimizing the risk of missed cancers.
Acknowledgments
Funding: This work was supported by
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-1427/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-1427/coif). B.S. has received grant support from Nature Science Foundation of Shanghai (No. 24ZR1461900). H.W. has received grant support from Shanghai Municipal Health Commission (No. 202140325). The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This retrospective study was approved by the Institutional Review Board of Minhang Hospital, Fudan University (approval number: 2023-037-01K). The requirement for informed consent was waived due to the retrospective nature of the study design.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Yucel S, Balci IG, Tomak L. Diagnostic Performance of Thyroid Nodule Risk Stratification Systems: Comparison of ACR-TIRADS, EU-TIRADS, K-TIRADS, and ATA Guidelines. Ultrasound Q 2023;39:206-11.
- Han Z, Huang Y, Wang H, et al. Multimodal ultrasound imaging: A method to improve the accuracy of diagnosing thyroid TI-RADS 4 nodules. J Clin Ultrasound 2022;50:1345-52.
- Liu X, Xie L, Ye X, Cui Y, He N, Hu L. Evaluation of Ultrasound Elastography Combined With Chi-Square Automatic Interactive Detector in Reducing Unnecessary Fine-Needle Aspiration on TIRADS 4 Thyroid Nodules. Front Oncol 2022;12:823411.
- Hall EA, Hartzband P, VanderLaan PA, Nishino M. Risk stratification of cytologically indeterminate thyroid nodules with nondiagnostic or benign cytology on repeat FNA: Implications for molecular testing and surveillance. Cancer Cytopathol 2023;131:313-24.
- Moon HJ, Kim EK, Yoon JH, Kwak JY. Malignancy risk stratification in thyroid nodules with nondiagnostic results at cytologic examination: combination of thyroid imaging reporting and data system and the Bethesda System. Radiology 2015;274:287-95.
- Wu J, Li Y, Zhang M. Clinical value of FNA puncture feeling in the diagnosis of non-diagnostic and indeterminate thyroid nodules. Front Endocrinol (Lausanne) 2022;13:1022438.
- Delfim RLC, Assumpção LR, Lopes FPPL, de Fátima Dos Santos Teixeira P. Does a three-degree hypoechogenicity grading improve ultrasound thyroid nodule risk stratification and affect the TI-RADS 4 category? A retrospective observational study. Arch Endocrinol Metab 2023;67:e000608.
- Hao L, Liu P, Ding C, Li J, Zhang Y. Diagnostic value of ACR TI-RADS combined with three-dimensional shear wave elastography in ACR TI-RADS 4 and 5 thyroid nodules. Chin Med J (Engl) 2023;136:1225-30.
- Li HJ, Sui GQ, Teng DK, Lin YQ, Wang H. Incorporation of CEUS and SWE parameters into a multivariate logistic regression model for the differential diagnosis of benign and malignant TI-RADS 4 thyroid nodules. Endocrine 2024;83:691-9.
- Zhang L, Gu J, Zhao Y, Zhu M, Wei J, Zhang B. The role of multimodal ultrasonic flow imaging in Thyroid Imaging Reporting and Data System (TI-RADS) 4 nodules. Gland Surg 2020;9:1469-77.
- Gong ZJ, Xin J, Yin J, Wang B, Li X, Yang HX, Zhu YW, Shen J, Gu J. Diagnostic Value of Artificial Intelligence-Assistant Diagnostic System Combined With Contrast-Enhanced Ultrasound in Thyroid TI-RADS 4 Nodules. J Ultrasound Med 2023;42:1527-35.
- Wang B, Ou X, Yang J, Zhang H, Cui XW, Dietrich CF, Yi AJ. Contrast-enhanced ultrasound and shear wave elastography in the diagnosis of ACR TI-RADS 4 and 5 category thyroid nodules coexisting with Hashimoto’s thyroiditis. Front Oncol 2022;12:1022305.
- Yu P, Niu S, Gao S, Tian H, Zhu J. Benefits of Contrast-Enhanced Ultrasonography to the Differential Diagnosis of TI-RADS 4-5 Thyroid Nodules. Appl Bionics Biomech 2022;2022:7386516.
- Wang SR, Zhu PS, Li J, Chen M, Cao CL, Shi LN, Li WX. Study on diagnosing thyroid nodules of ACR TI-RADS 4-5 with multimodal ultrasound radiomics technology. J Clin Ultrasound 2024;52:274-83.
- Pei S, Cong S, Zhang B, Liang C, Zhang L, Liu J, Guo Y, Zhang S. Diagnostic value of multimodal ultrasound imaging in differentiating benign and malignant TI-RADS category 4 nodules. Int J Clin Oncol 2019;24:632-9.
- Istomin A, Masarwah A, Okuma H, Sutela A, Vanninen R, Sudah M. A multiparametric classification system for lesions detected by breast magnetic resonance imaging. Eur J Radiol 2020;132:109322.
- Séguier D, Puech P, Kool R, Dernis L, Gabert H, Kassouf W, Villers A, Marcq G. Multiparametric magnetic resonance imaging for bladder cancer: a comprehensive systematic review of the Vesical Imaging-Reporting and Data System (VI-RADS) performance and potential clinical applications. Ther Adv Urol 2021;13:17562872211039583.
- Manganaro L, Ciulla S, Celli V, Ercolani G, Ninkova R, Miceli V, Cozzi A, Rizzo SM, Thomassin-Naggara I, Catalano C. Impact of DWI and ADC values in Ovarian-Adnexal Reporting and Data System (O-RADS) MRI score. Radiol Med 2023;128:565-77.
- Turkbey B, Rosenkrantz AB, Haider MA, Padhani AR, Villeirs G, Macura KJ, Tempany CM, Choyke PL, Cornud F, Margolis DJ, Thoeny HC, Verma S, Barentsz J, Weinreb JC. Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol 2019;76:340-51.
- Bi Q, Chen Y, Wu K, Wang J, Zhao Y, Wang B, Du J. The Diagnostic Value of MRI for Preoperative Staging in Patients with Endometrial Cancer: A Meta-Analysis. Acad Radiol 2020;27:960-8.
- Chen L, Xu J, Bao J, Huang X, Hu X, Xia Y, Wang J. Diffusion-weighted MRI in differentiating malignant from benign thyroid nodules: a meta-analysis. BMJ Open 2016;6:e008413.
- Jiang L, Chen J, Huang H, Wu J, Zhang J, Lan X, Liu D, Zhang J. Comparison of the Differential Diagnostic Performance of Intravoxel Incoherent Motion Imaging and Diffusion Kurtosis Imaging in Malignant and Benign Thyroid Nodules. Front Oncol 2022;12:895972.
- Sakat MS, Sade R, Kilic K, Gözeler MS, Pala O, Polat G, Kantarcı M. The Use of Dynamic Contrast-Enhanced Perfusion MRI in Differentiating Benign and Malignant Thyroid Nodules. Indian J Otolaryngol Head Neck Surg 2019;71:706-11.
- Song M, Yue Y, Jin Y, Guo J, Zuo L, Peng H, Chan Q. Intravoxel incoherent motion and ADC measurements for differentiating benign from malignant thyroid nodules: utilizing the most repeatable region of interest delineation at 3.0 T. Cancer Imaging 2020;20:9.
- Tan H, Chen J, Zhao YL, Liu JH, Zhang L, Liu CS, Huang D. Feasibility of Intravoxel Incoherent Motion for Differentiating Benign and Malignant Thyroid Nodules. Acad Radiol 2019;26:147-53.
- Zheng T, Wang L, Wang H, Tang L, Xie X, Fu Q, Wu PY, Song B. Prediction model based on MRI morphological features for distinguishing benign and malignant thyroid nodules. BMC Cancer 2024;24:256.
- Tessler FN, Middleton WD, Grant EG, Hoang JK, Berland LL, Teefey SA, Cronan JJ, Beland MD, Desser TS, Frates MC, Hammers LW, Hamper UM, Langer JE, Reading CC, Scoutt LM, Stavros AT. ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee. J Am Coll Radiol 2017;14:587-95.
- Kim DH, Kim SW, Basurrah MA, Lee J, Hwang SH. Diagnostic Performance of Six Ultrasound Risk Stratification Systems for Thyroid Nodules: A Systematic Review and Network Meta-Analysis. AJR Am J Roentgenol 2023;220:791-803.
- Li W, Wang Y, Wen J, Zhang L, Sun Y. Diagnostic Performance of American College of Radiology TI-RADS: A Systematic Review and Meta-Analysis. AJR Am J Roentgenol 2021;216:38-47.
- Lai M, Feng B, Yao J, Wang Y, Pan Q, Chen Y, Chen C, Feng N, Shi F, Tian Y, Gao L, Xu D. Value of Artificial Intelligence in Improving the Accuracy of Diagnosing TI-RADS Category 4 Nodules. Ultrasound Med Biol 2023;49:2413-21.
- Durante C, Grani G, Lamartina L, Filetti S, Mandel SJ, Cooper DS. The Diagnosis and Management of Thyroid Nodules: A Review. JAMA 2018;319:914-24.
- Yoon SJ, Na DG, Gwon HY, Paik W, Kim WJ, Song JS, Shim MS. Similarities and Differences Between Thyroid Imaging Reporting and Data Systems. AJR Am J Roentgenol 2019;213:W76-84.
- Li X, Peng C, Liu Y, Hu Y, Yang L, Yu Y, Zeng H, Huang W, Li Q, Tao N, Cao L, Zhou J. Modified American College of Radiology Thyroid Imaging Reporting and Data System and Modified Artificial Intelligence Thyroid Imaging Reporting and Data System for Thyroid Nodules: A Multicenter Retrospective Study. Thyroid 2024;34:88-100.
- Li G, Zhang B, Liu J, Xiong Y. The diagnostic efficacy and inappropriate biopsy rate of ACR TI-RADS and ATA guidelines for thyroid nodules in children and adolescents. Front Endocrinol (Lausanne) 2023;14:1052945.
- Lin Y, Lai S, Wang P, Li J, Chen Z, Wang L, Guan H, Kuang J. Performance of current ultrasound-based malignancy risk stratification systems for thyroid nodules in patients with follicular neoplasms. Eur Radiol 2022;32:3617-30.
- Zhou J, Zhang Y, Chang KT, Lee KE, Wang O, Li J, Lin Y, Pan Z, Chang P, Chow D, Wang M, Su MY. Diagnosis of Benign and Malignant Breast Lesions on DCE-MRI by Using Radiomics and Deep Learning With Consideration of Peritumor Tissue. J Magn Reson Imaging 2020;51:798-809.
- Zhao S, Zhang Y, Wang L, Yang L, Zou L, Gao F. Adeno-associated virus 2 mediated gene transfer of vascular endothelial growth factor Trap: a new treatment option for glioma. Cancer Biol Ther 2019;20:65-72.