Diagnostic value of a magnetic resonance imaging (MRI)-based vertebral bone quality score for bone mineral density assessment: an updated systematic review and meta-analysis
Introduction
Osteoporosis is the most common skeletal disorder worldwide, and is becoming more common among postmenopausal women and the general population aged over 50 years. The prevalence of osteoporosis and its related fractures is increasing as life expectancy increases (1). Osteoporosis is characterized by microarchitectural deterioration of bone tissue and decreased bone mass (1), and it increases bone fragility and susceptibility to fracture. A systematic review reported that the global prevalence of osteoporosis among individuals aged 15–105 years was 18.3% (2), and nearly 9 million osteoporotic fractures occur annually (3). Osteoporosis and the resultant fragility fractures contribute to increased morbidity and mortality, the need for long-term care facilities, and economic costs (4). Thus, the early diagnosis and treatment of osteoporosis are critical.
Dual-energy X-ray absorptiometry (DXA) is currently the reference standard for assessing bone mineral density (BMD) (5). It has become one of the most widely used techniques for the assessment of BMD, as it is inexpensive, easy to use, and has a low radiation dose. However, DXA also has some drawbacks, as it tends to overestimate BMD in patients with aortic calcifications, degenerative spines, or a high body mass index (BMI) (5-7), all of which are common in the elderly and those seeking spine surgery. DXA assessment can also be inaccurate in patients with scoliosis (8). In addition, DXA cannot distinguish between cortical and trabecular bone. Thus, it cannot provide detailed information about skeletal strength and bone microarchitecture. Quantitative computed tomography (QCT) provides accurate volumetric BMD (vBMD) measurements, and can overcome the deficiencies of DXA, but its radiation dose is higher than that of DXA. These limitations have prompted research into other techniques to diagnose osteoporosis.
Several previous studies have sought to establish a magnetic resonance imaging (MRI)-based method for quantifying BMD to decrease patient radiation exposure and overall care expenses (9,10). These studies have found that as bone becomes osteoporotic, the trabecular portion becomes more hyperintense on T1-weighted imaging (T1WI), and that bone marrow signal intensity (SI) is negatively correlated with BMD. This may be due to the high signal shown when fat infiltrates bone, as previous studies have shown that osteoporotic bone is characterized by trabecular atrophy and local adipocyte replacement (11). Further, research has shown that vertebral SI on lumbar spine MRI-T1WI can more sensitively evaluate BMD than DXA (12).
Ehresman et al. first proposed the MRI-based vertebral bone quality (VBQ) score as a novel method for evaluating BMD (13). The VBQ score is calculated based on non-contrast T1-weighted MRI in the midsagittal plane. First, a region of interest (ROI) is manually placed in the medullary bone of each of the L1–L4 vertebral bodies and in the cerebrospinal fluid (CSF) space at the level of L3, and the mean SI in each ROI is recorded. The VBQ score is then calculated by dividing the median SI of the L1–L4 vertebral bodies by the mean SI of the CSF. Ehresman et al. found that the VBQ score could be used to differentiate between healthy and osteopenic/osteoporotic bone, and was moderately correlated with the femoral neck and overall lowest T-scores (13). MRI is a routine preoperative examination tool that is readily available and free of ionizing radiation. Thus, the VBQ score could serve as an opportunistic osteoporosis screening tool for patients undergoing spine surgery. It enables surgeons to evaluate nerve and spinal cord conditions while providing patients with widely available, opportunistic BMD screening in a cost-efficient and radiation-free manner, thereby reducing unnecessary radiation exposure.
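The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in any of the cited studies: the function name `vbq_score` and the signal-intensity values are hypothetical, and the sketch assumes that the mean SI of each ROI has already been measured on the midsagittal T1-weighted image.

```python
from statistics import median


def vbq_score(vertebral_si, csf_si):
    """Return the VBQ score: median vertebral SI divided by the CSF SI."""
    if len(vertebral_si) == 0:
        raise ValueError("at least one vertebral SI value is required")
    if csf_si <= 0:
        raise ValueError("CSF signal intensity must be positive")
    return median(vertebral_si) / csf_si


# Hypothetical mean ROI signal intensities for L1-L4 and for the CSF at L3.
l1_l4_si = [320.0, 340.0, 360.0, 380.0]
csf_l3_si = 120.0

score = vbq_score(l1_l4_si, csf_l3_si)
print(f"VBQ score: {score:.2f}")
```

Because the score is a ratio of two intensities taken from the same image, it is dimensionless, which is the property that motivates normalizing by the CSF signal.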
The use of the VBQ score to evaluate osteoporosis has been the subject of many recently published studies. However, there is a lack of consensus on the diagnostic value of the VBQ score in assessing BMD, and varying thresholds have been used throughout the literature. Chen et al. conducted the first meta-analysis to examine the diagnostic value of VBQ (14). However, due to the high heterogeneity and limited sample size of the studies included in the meta-analysis, Chen et al. did not conduct any further analyses to investigate the source of heterogeneity, which might have biased the results. Therefore, we performed this systematic review and meta-analysis to explore the value of the MRI-based VBQ score in evaluating abnormal BMD and the source of heterogeneity. Our findings provide a comprehensive overview of the effectiveness of the MRI-based VBQ score in identifying abnormal BMD. We present this article in accordance with the PRISMA-DTA reporting checklist (15) (available at https://qims.amegroups.com/article/view/10.21037/qims-24-532/rc).
Methods
Data sources and search strategy
A systematic search of the following electronic databases was performed to retrieve articles published from the inception of the databases to December 31, 2023: PubMed, EBSCO, Ovid, Web of Science, and Cochrane Library. We also searched Chinese electronic databases, including the Wanfang, China National Knowledge Infrastructure (CNKI), and Chinese Science and Technology Periodical (VIP) databases. The reference lists of the articles were reviewed to identify any additional relevant studies that were not found in the primary searches. The following keywords were used in the search: (osteoporosis OR bone loss OR osteopenia OR BMD OR bone mineral density) AND (magnetic resonance imaging OR MRI OR MR) AND (VBQ OR vertebral bone quality). The search strategy is shown in Table S1. Our search was registered in the PROSPERO database with all the necessary details (No. CRD42024501549).
Study selection
To be eligible for inclusion in this meta-analysis, the studies had to meet the following inclusion criteria: (I) population: include patients aged >18 years; (II) index test: include patients who had undergone MRI and DXA/QCT for whom MRI-VBQ scores had been used as the diagnostic tool for BMD; (III) outcomes: focus on diagnosing BMD and include sufficient data to reconstruct 2×2 tables to determine sensitivity and specificity; (IV) have been published as original articles. Studies were excluded from the meta-analysis if they met any of the following exclusion criteria: (I) had a sample size <10 patients; (II) had a manuscript type that comprised a case report, animal trial, review article, systematic review, meta-analysis, commentaries, editorial, or meeting abstract; and/or (III) had an overlapping patient population. If there was a similarity between the study populations, the study with the largest and most recent sample was chosen. When 2×2 tables could not be established, the authors of the eligible studies were contacted for more information.
Data extraction
Two reviewers (D.Y. and C.L., with 5 years and 8 years of experience in radiology, respectively) independently reviewed the included studies to extract and enter the key data elements into pre-designed data abstraction forms. Any discrepancies were resolved by consensus review. The extracted data included the first author, publication year, publication region, study design, duration of patient recruitment, participant characteristics, sample size, identification of bone status, reference standard, magnet field strength, VBQ method, sensitivity, specificity, numbers of true/false positives and true/false negatives, area under the curve (AUC), and VBQ score threshold.
Quality assessment
The articles included in this study were assessed for bias and clinical applicability using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool (16).
Data synthesis and statistical analysis
The statistical analysis was performed using MetaDiSc 1.4 software (Universidad Autónoma de Madrid, Spain), RevMan software (version 5.3.2; Cochrane Collaboration), and STATA (version 14.0, STATA Corp., Texas, USA) with the MIDAS module. The presence of heterogeneity due to threshold effects was tested using the Spearman correlation coefficient between the log of sensitivity and the log of 1-specificity. A Spearman correlation coefficient >0.6 indicated a threshold effect (17). Diagnostic accuracy data (true/false positive and true/false negative) extracted from the included studies were used to calculate the sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) for all individual studies and their corresponding pooled measurements at 95% confidence intervals (CIs). Pooled estimates along with 95% CIs and the AUCs of the summary receiver operating characteristic (SROC) curves were calculated. The closer the AUC was to 1, the higher the diagnostic efficacy of a test or model. Publication bias was investigated using Deeks’ funnel plot asymmetry test. A P value <0.05 indicated significant publication bias. Heterogeneity due to non-threshold effects was assessed using the Cochran-Q test and Higgins inconsistency index (I2) test. Heterogeneity was considered moderate when I2 exceeded 50% and high when I2 exceeded 75% (18). If significant heterogeneity was detected, meta-regression and subgroup analyses were conducted to explore the source of the heterogeneity. This study pre-specified five covariates (i.e., the reference standard, sex, mean age, region of publication and VBQ method), and pre-specified three subgroups based on the magnet field strength, VBQ measurement region, and identification of bone status, respectively. A sensitivity analysis was performed to assess whether the results of the meta-analysis were stable.
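The per-study measures and the inconsistency index named above follow directly from their standard definitions, which the sketch below writes out. It is an illustration only and does not reproduce the pooling performed in MetaDiSc or the MIDAS module; the function names are hypothetical, no zero-cell correction is applied, and the Q statistic in the second example is an invented value. The 2×2 counts are those reported for study 20 in Table 2.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Per-study accuracy measures from a 2x2 table (no zero-cell correction)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
    }


def i_squared(q, n_studies):
    """Higgins I2 (%) from Cochran's Q with n_studies - 1 degrees of freedom."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2):
    """Label I2 with the cut-offs stated in the text (>50% moderate, >75% high)."""
    if i2 > 75:
        return "high"
    if i2 > 50:
        return "moderate"
    return "below the 50% cut-off"


# Counts reported for study 20 (Liu H) in Table 2.
metrics = diagnostic_metrics(tp=99, fp=12, fn=16, tn=36)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# Hypothetical Q statistic and study count, for illustration only.
example_i2 = i_squared(q=50.0, n_studies=11)
print(f"I2 = {example_i2:.1f}% ({heterogeneity_label(example_i2)})")
```

For the study 20 counts, the computed sensitivity (99/115) and specificity (36/48) round to the 0.86 and 0.75 listed in Table 2.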
Results
Literature search
The detailed literature selection process is shown in Figure 1. Initially, 831 articles were retrieved from the systematic literature search of the relevant databases, and one additional article was identified by checking the reference lists of the retrieved articles. Thus, in total, 832 articles were initially identified. After removing the duplicate articles, the titles and abstracts of the remaining 657 articles were screened, yielding 27 potentially eligible articles. The full texts of these 27 articles were then reviewed, and eight articles were excluded because they either did not report results of interest (i.e., diagnostic data on the VBQ evaluation of BMD), or the data needed to construct 2×2 tables could not be extracted. Ultimately, 19 articles (19-37), comprising 23 studies (one article contained three studies, and two articles contained two studies each), were included in this meta-analysis.
Assessment of study quality
Figure 2 provides graphical representations of the QUADAS-2 risk assessment results. In terms of the risk of bias evaluation, the “patient selection” domain was rated low risk for all studies, as all the patients were enrolled consecutively, case-control designs were avoided, and there were no inappropriate exclusions of cases. The “index test” domain was rated high risk for all studies, as 13 studies were unclear as to whether the MRI analysis had been performed by a clinician blinded to the reference standard (19-21,24,27,29,34-37), and none of the included studies had a predefined threshold of the MRI-based VBQ score. The “reference standard” domain was rated low risk for all studies, as the reference standard (DXA or QCT) was able to correctly classify the patient’s BMD condition (normal or osteopenia/osteoporosis), and the individual interpreting the results of the reference standard (DXA or QCT) was fully blinded to the MRI. In relation to the “flow and timing” domain, one study (20) was rated high risk, as there was not an appropriate interval between MRI and DXA (the patients received the DXA scan 3 years before or after the MRI), and the remaining 22 studies were rated low risk, as there was an appropriate interval between the index test and the reference standard (no more than 1 year between the DXA/QCT and MRI scans), the same and only reference standard (DXA or QCT) was used for all the patients, and all the patients were included in the analysis. In relation to the evaluation of the applicability concerns, the risks for “patient selection”, the “index test”, and the “reference standard” were rated low for all studies, as the included patients matched the target population for the MRI-based VBQ evaluation of BMD, the VBQ scores were applicable to the evaluation of BMD, and DXA and QCT were applicable to the evaluation of BMD.
Study and patient characteristics
Tables 1 and 2 provide details of the study and patient characteristics. Of the 23 studies, 15 were performed in China, six in the United States of America, one in Germany, and one in Turkey. The total sample comprised 2,981 patients (range, 55–426 per study), and the patients were all enrolled consecutively. All the included studies were retrospective (cross-sectional) studies. The average age of the patients ranged from 46 to 71 years, and the average BMI ranged from 23.40 to 31.50 kg/m2. Females accounted for 76.21% of the patient cohort. Of the 23 studies, 17 used the DXA T-score as the reference standard for assessing BMD; in these studies, 1,291 patients were diagnosed with osteopenia/osteoporosis, and 773 patients were identified as having normal BMD. The remaining six studies used QCT as the reference standard; in these studies, 643 patients were diagnosed with osteopenia/osteoporosis, and 274 patients had normal BMD.
Table 1
ID | First author | Year | Region | Design | Duration | Age (year), mean (SD) | Female, n (%) | BMI (kg/m2), mean (SD) | Sample size: osteopenia/osteoporosis (n) | Sample size: normal BMD (n) | Identification of bone quality status | Reference standard (DXA T-score/QCT) | Field strength (T) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Salzmann SN (19) | 2022 | USA | R | 2014–2019 | 62.00 (NR) | 104 (52.50) | 28.20 (SD) | 128 | 70 | Normal vs. osteopenia/osteoporosis | QCT | NR |
2 | Kadri A (20) | 2022 | USA | R | 2017.9–2020.9 | 70.10 (8.36) | 66 (79.50) | 28.93 (5.96) | 35/31 | 17 | Normal vs. osteopenia/osteoporosis | DXA | 1.5 or 3.0 |
3 | Huang W (22) | 2022 | China | R | 2020.9–2022.3 | 66.10 (7.20); 66.70 (7.10) | 46 (55.50) | 24.30 (2.40); 23.60 (3.00) | 63 | 20 | Normal vs. osteopenia/osteoporosis | DXA | 1.5 or 3.0 |
4 | Haffer H (21) | 2022 | USA | R | 2014–2021 | 63.30 (12.20) | 149 (55.80) | 29.70 (6.20) | 174 | 93 | Normal vs. osteopenia/osteoporosis | QCT | NR |
5 | Roch PJ (24) | 2023 | Germany | R | 2017–2021 | 69.70 (15.00) | 77 (56.60) | 27.00 (5.00) | 108 | 28 | Normal vs. osteopenia/osteoporosis | QCT | NR |
6 | Chen Z (26) | 2023 | China | R | 2019.7–2020.6 | 59.40 (7.80) | 97 (72.40) | 23.90 (3.10) | 107 | 27 | Normal vs. osteopenia/osteoporosis | DXA | NR |
7 | Li W (30) | 2023 | China | R | 2019.1–2021.7 | 59.40 (9.60) | 80 (61.50) | 25.69 (3.24) | 51/59 | 20 | Non-osteoporosis vs. osteoporosis | DXA | NR |
8 | Lin W (29) | 2023 | China | R | 2020.9–2022.11 | 68.90 (9.90) | 354 (78.30) | 23.80 (3.70) | 58/109 | 12 | Non-osteoporosis vs. osteoporosis | QCT | 3.0 |
9 | Wang Z (33) | 2023 | China | R | 2015.1–2022.12 | 51.95 (10.94) | 58 (54.70) | 24.67 (4.20); 24.33 (3.95) | 72 | 34 | Normal vs. osteopenia/osteoporosis | DXA | NR |
10 | Courtois EC (31) | 2023 | USA | R | 2018–2022 | 46.10 (NR) | 169 (39.70) | 28.40 (11.30); 26.80 (4.00); 27.60 (1.88) | 4/67 | 355 | Normal vs. osteopenia/osteoporosis | DXA | NR |
11 | Kim AYE (23) | 2023 | USA | R | 2016.1–2021.5 | 64.00 (12.00) | 37 (60.70) | 28.10 (5.90) | 21/19 | 21 | Non-osteoporosis vs. osteoporosis | QCT | NR |
12 | Yin H (28) | 2023 | China | R | 2020.9–2022.10 | 68.70 (10.10) | 208 (80.00) | 23.80 (3.75) | NR/165 | NR | Non-osteoporosisᵃ vs. osteoporosis | QCT | 1.5 |
13 | Pu M (25) | 2023 | China | R | 2018.9–2021.9 | 66.10 (9.40) | 100 (100.00) | 25.80 (4.40); 25.00 (3.70); 23.40 (3.60) | 45/32 | 23 | Normal vs. osteoporosis | DXA | 1.5 |
14 | Özmen E (32) | 2023 | Turkey | R | NR | 63.39 (11.11) | 111 (85.40) | 30.99 (5.25) | 63/24 | 43 | Non-osteoporosis vs. osteoporosis | DXA | NR |
15 | Huang W (27) | 2023 | China | R | 2019.1–2022.6 | 66.90 (8.00); 68.10 (7.10); 71.00 (5.60) | 149 (72.00) | 26.00 (3.40); 24.80 (3.00); 23.40 (3.40) | 103/64 | 40 | Normal vs. osteopenia/osteoporosis | DXA | NR |
16 | Liu H (37) | 2023 | China | R | 2017.1–2021.5 | 66.88 (6.33); 68.57 (6.29) | 163 (100.00) | 25.48 (3.55); 24.84 (3.89) | 115 | 48 | Normal vs. osteopenia/osteoporosis | DXA | 1.5 |
17 | Wei Q (34) | 2023 | China | R | 2018.1–2022.8 | 63.91 (7.41) | 210 (100.00) | NR | 166 | 44 | Normal vs. osteopenia/osteoporosis | DXA | 1.5 |
18 | Huang W (35) | 2023 | China | R | 2020.9–2022.3 | 67.00 (7.10) | 37 (52.10) | 24.22 (2.30); 23.51 (3.12) | 54 | 17 | Normal vs. osteopenia/osteoporosis | DXA | NR |
19 | Wang P (36) | 2023 | China | R | 2019.1–2020.8 | 63.10 (9.80); 70.10 (7.70) | 57 (69.50) | 31.50 (5.40); 28.40 (4.90) | 42 | 40 | Normal vs. osteopenia/osteoporosis | DXA | 1.5 |
a, the number of non-osteoporosis = 95 (osteopenia + normal). SD, standard deviation; BMI, body mass index; BMD, bone mineral density; DXA, dual-energy X-ray absorptiometry; QCT, quantitative computed tomography; R, retrospective; NR, not reported.
Table 2
ID | First author | VBQ method | TP | FP | FN | TN | AUC | Threshold | Sen | Spe |
---|---|---|---|---|---|---|---|---|---|---|
1 | Salzmann SN (19) | Median L1–L4/L3 CSF | 95 | 30 | 33 | 40 | 0.70 | 2.38 | 0.74 | 0.57 |
2 | Kadri A/a (20) | Median L1–L4/L3 CSF | 51 | 4 | 15 | 13 | 0.82 | 3.12 | 0.78 | 0.75 |
3 | Kadri A/b (20) | L1 /L1 CSF | 58 | 5 | 8 | 12 | 0.82 | 3.01 | 0.88 | 0.69 |
4 | Huang W (22) | Mean C2–C7/T1 CSF | 58 | 8 | 5 | 12 | 0.78 | 2.90 | 0.92 | 0.60 |
5 | Haffer H (21) | Median L1–L4/L3 CSF | 147 | 55 | 27 | 38 | 0.67 | 2.18 | 0.84 | 0.40 |
6 | Roch PJ (24) | Mean L1–L4/L3 CSF | 69 | 6 | 39 | 22 | 0.71 | 2.10 | 0.64 | 0.78 |
7 | Chen Z (26) | Median L1–L4/L3 CSF | 94 | 14 | 13 | 13 | NR | NR | 0.87 | 0.48 |
8 | Li W/a (30) | L1 /L1 CSF | 45 | 25 | 14 | 46 | 0.70 | 3.26 | 0.76 | 0.64 |
9 | Li W/b (30) | Median L1–L4/L3 CSF | 41 | 26 | 18 | 45 | 0.67 | 3.20 | 0.69 | 0.63 |
10 | Lin W (29) | Mean L1–L4/mean CSF (L1–L3) | 73 | 21 | 36 | 49 | 0.70 | 2.59 | 0.67 | 0.69 |
11 | Wang Z/a (33) | Mean C2–C7/T2 CSF | 50 | 10 | 22 | 24 | 0.72 | 2.99 | 0.69 | 0.70 |
12 | Wang Z/b (33) | Median C3–C6/C2 CSF | 51 | 11 | 21 | 23 | 0.71 | 3.17 | 0.70 | 0.67 |
13 | Wang Z/c (33) | Median C3–C6/C5 CSF | 45 | 8 | 27 | 26 | 0.71 | 3.00 | 0.62 | 0.76 |
14 | Courtois EC (31) | Median L1–L4/L3 CSF | 39 | 173 | 32 | 182 | 0.55 | 2.50 | 0.54 | 0.51 |
15 | Kim AYE (23) | Median L1–L4/L3 CSF | 11 | 4 | 8 | 38 | 0.75 | 2.60 | 0.58 | 0.90 |
16 | Yin H (28) | Median L1–L4/mean CSF (L1–L3) | 134 | 42 | 31 | 53 | 0.73 | 3.70 | 0.81 | 0.55 |
17 | Pu M (25) | Median L1–L4/L3 CSF | 28 | 9 | 4 | 14 | 0.81 | 3.05 | 0.87 | 0.61 |
18 | Özmen E (32) | Mean L1–L4/L3 CSF | 20 | 59 | 4 | 47 | 0.66 | 2.70 | 0.83 | 0.44 |
19 | Huang W (27) | S1 VBQ = S1/CSF L3 | 129 | 12 | 38 | 28 | 0.82 | 2.93 | 0.77 | 0.70 |
20 | Liu H (37) | Mean L1–L4/L3 CSF | 99 | 12 | 16 | 36 | 0.81 | 3.08 | 0.86 | 0.75 |
21 | Wei Q (34) | Mean L1–L4/L3 CSF | 135 | 16 | 31 | 28 | 0.77 | 3.24 | 0.81 | 0.64 |
22 | Huang W (35) | Mean C2–C7/T1 CSF | 44 | 5 | 10 | 12 | 0.81 | 3.19 | 0.81 | 0.70 |
23 | Wang P (36) | Mean L1–L4/L3 CSF | 34 | 5 | 8 | 35 | 0.93 | 2.98 | 0.81 | 0.88 |
VBQ, vertebral bone quality; TP, true positive; FP, false positive; FN, false negative; TN, true negative; AUC, area under the curve; Sen, sensitivity; Spe, specificity; CSF, cerebrospinal fluid; NR, not reported.
Diagnostic accuracy
The Spearman correlation coefficient between the log of sensitivity and the log of 1−specificity was 0.36 (P=0.09), which was not significant; thus, no threshold effect was found in this study. The symmetric SROC curve was plotted, and no “shoulder-arm shape” was found, which provided further evidence that there was no threshold effect. The Cochran-Q test for the DOR showed that heterogeneity due to non-threshold effects was present (Q=63.66, P<0.01). Further, as the I2 of the sensitivity, specificity, PLR, NLR, and DOR in this study were all >50%, the random-effects model was used to combine the evaluation indicators.
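The threshold-effect check can be illustrated with a small, dependency-free Python sketch that computes a Spearman coefficient from the Sen and Spe columns of Table 2. The helper names are hypothetical, and the sketch makes no claim to reproduce the reported coefficient: the tabulated values are rounded to two decimals, and the statistical software may handle ties or transformations differently. Because the Spearman coefficient depends only on ranks, the logarithm does not change the result and is applied only to mirror the description in the text.

```python
from math import log, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Sen and Spe columns of Table 2, studies 1-23 (rounded as tabulated).
sen = [0.74, 0.78, 0.88, 0.92, 0.84, 0.64, 0.87, 0.76, 0.69, 0.67, 0.69, 0.70,
       0.62, 0.54, 0.58, 0.81, 0.87, 0.83, 0.77, 0.86, 0.81, 0.81, 0.81]
spe = [0.57, 0.75, 0.69, 0.60, 0.40, 0.78, 0.48, 0.64, 0.63, 0.69, 0.70, 0.67,
       0.76, 0.51, 0.90, 0.55, 0.61, 0.44, 0.70, 0.75, 0.64, 0.70, 0.88]

log_sen = [log(s) for s in sen]
log_fpr = [log(1 - s) for s in spe]  # 1 - specificity is the false-positive rate

rho = spearman(log_sen, log_fpr)
print(f"Spearman rho = {rho:.2f}")
print("threshold effect suggested" if rho > 0.6 else "no threshold effect suggested")
```

The 0.6 cut-off in the last line is the criterion stated in the Methods section.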
The pooled sensitivity was 0.77 (95% CI, 0.73–0.81; I2=76.80%, P<0.001), the pooled specificity was 0.65 (95% CI, 0.59–0.71; I2=79.76%, P<0.001), the pooled PLR was 2.24 (95% CI, 1.90–2.64; I2=63.40%, P<0.001), the pooled NLR was 0.35 (95% CI, 0.29–0.41; I2=73.46%, P<0.001), the pooled AUC was 0.78 (95% CI, 0.74–0.82; P<0.001), and the DOR was 6.49 (95% CI, 4.82–8.73; I2=100%, P<0.001) (Figures 3-6).
Meta-regression
The I2 test revealed obvious heterogeneity among the studies. To analyze the source of the heterogeneity, five covariates (i.e., the reference standard, sex, mean age, region of publication, and VBQ method), were included in the meta-regression analysis to assess their effect on heterogeneity. The results showed that sensitivity was influenced by the reference standard, sex, mean age, region of publication, and VBQ method, while specificity was affected by the VBQ method (Figure 7).
Subgroup analysis
A subgroup analysis based on the magnet field strength was conducted to evaluate the diagnostic performance of the VBQ score for detecting abnormal BMD (Figure S1). The pooled sensitivity of the subgroup analysis of five studies that used 1.5-T MRI (25,28,34,36,37) was 0.83 (95% CI, 0.79–0.86; I2=0.00), and the pooled specificity was 0.69 (95% CI, 0.57–0.79; I2=73.82%). An additional subgroup analysis was conducted of the 17 studies (19-24,26,27,30-33,35) in which the magnet field strength was not clearly specified (i.e., it was not reported or was reported as 1.5 or 3.0 T). The pooled sensitivity of this subgroup was 0.76 (95% CI, 0.71–0.81; I2=78.00%), and the pooled specificity was 0.64 (95% CI, 0.56–0.71; I2=80.44%). The sensitivity of the only study (29) that used 3.0-T MRI was 0.67 (95% CI, 0.57–0.76), and the specificity was 0.70 (95% CI, 0.58–0.80).
In the subgroup analysis of cervical VBQ (22,33,35) and lumbar VBQ (19-21,23-26,28-32,34,36,37), the pooled sensitivity was 0.76 (95% CI, 0.65–0.85; I2=78.74%) and 0.78 (95% CI, 0.73–0.82; I2=78.61%), and the pooled specificity was 0.69 (95% CI, 0.60–0.77; I2=0.00) and 0.65 (95% CI, 0.57–0.72; I2=83.54%), respectively (Figure S2).
In the subgroup analysis of studies distinguishing normal BMD from osteopenia/osteoporosis (19-22,24-27,31,33-37), the pooled sensitivity was 0.79 (95% CI, 0.74–0.83; I2=80.63%), and the pooled specificity was 0.66 (95% CI, 0.59–0.72; I2=79.25%). In the subgroup analysis of studies distinguishing osteoporosis from non-osteoporosis (23,28-30,32), the pooled sensitivity was 0.73 (95% CI, 0.64–0.81; I2=57.44%), and the pooled specificity was 0.66 (95% CI, 0.53–0.76; I2=84.58%) (Figure S3).
Sensitivity analysis
After removing the studies one by one, no significant effect was found, suggesting that our findings were stable and plausible (Figure 8).
Publication bias analysis
The Deeks’ funnel plot asymmetry test yielded P<0.01, which suggested that there was publication bias (Figure 9).
Discussion
MRI is frequently used in the preoperative assessment of spine surgery patients, and recently, its use as a possible alternative means of BMD evaluation has been investigated. Studies have shown that the occurrence of osteoporosis is often characterized by trabecular atrophy and an increase in bone marrow adipocytes (11). Many quantitative methods have been used to measure the bone trabecular microstructure or bone marrow fat content based on differences in SIs within bone tissues (9,10,13,38-40). Changes in these parameters have been found to be negatively correlated with osteoporosis and bone quality. Bandirali et al. (9) first introduced the “M-score” in 2015 as a new MRI-based score simulating the DXA T-score calculation, and reported that it had a diagnostic precision of 84.4% in differentiating between osteoporosis and non-osteoporosis. It was further evaluated by other studies, and these studies reported that it had a better correlation with BMD than other MRI measures (pooled r2=−0.58). However, the clinical utility of the M-score is limited because it relies on signal-to-noise ratios that are specific to the MR system in use, and M-score values vary between devices. In 2019, Ehresman et al. proposed a novel, scanner-independent, T1-weighted MRI-based score for evaluating patient BMD; that is, the VBQ score (13). The VBQ score was calculated using MR volumes acquired by four distinct MR systems, and no significant difference in the VBQ scores was found between the machines tested. Thus, the VBQ score may have greater generalizability and clinical utility than the M-score.
Considerable research has been conducted since Ehresman et al. first described the use of the VBQ score as a diagnostic tool for osteoporosis. The specific VBQ method employed has varied across studies, with some using median or mean measurements at different spinal levels (e.g., L1–L4, L3, C2–C7) and dividing them by the CSF SI for standardization. The studies have reported a range of AUCs (0.55–0.93), sensitivity values (0.54–0.92), and specificity values (0.40–0.90), which shows that the accuracy of the VBQ score in assessing BMD varies. We found that the VBQ score had a pooled AUC of 0.78 for the diagnosis of bone loss, with a pooled sensitivity of 0.77 and a pooled specificity of 0.65. Thus, the VBQ score can be used as a simple, effective tool for differentiating between normal BMD and bone loss. Additionally, the pooled DOR indicated that the odds of a positive VBQ result were 6.49 times higher in individuals with bone loss than in individuals with normal BMD.
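For reference, the DOR can be written in terms of the 2×2 counts or, equivalently, the two likelihood ratios. The worked value below simply divides the pooled PLR by the pooled NLR reported in the Results section; it is an approximate consistency check under that assumption, not a re-analysis, and the two figures need not agree exactly because each summary measure is pooled separately.

```latex
\mathrm{DOR}
  = \frac{TP / FN}{FP / TN}
  = \frac{\mathrm{sensitivity} / (1 - \mathrm{specificity})}
         {(1 - \mathrm{sensitivity}) / \mathrm{specificity}}
  = \frac{\mathrm{PLR}}{\mathrm{NLR}},
\qquad
\frac{2.24}{0.35} = 6.4 \approx 6.49 .
```

A DOR of 1 would indicate a test with no discriminative ability, and larger values indicate better discrimination.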
Heterogeneity was found in this meta-analysis. Specifically, we found that the reference standard, sex, mean age, region of publication, VBQ method, and magnetic field strength were potential sources of heterogeneity. The reference standard for the diagnosis of osteoporosis used in the included studies was either DXA or QCT. A study by Lin et al. (41) of 296 postmenopausal women found that the lumbar BMD measurements were not fully concordant between QCT and DXA. In addition, another study (29) found that the VBQ score was more strongly correlated with QCT-vBMD than with DXA T-scores. VBQ score results may also be influenced by knowledge of the reference standard results. Bone loss is observed more often in women and the elderly. Research has reported that the prevalence of osteoporosis in women is 23.10% worldwide, but only 11.70% in men (2). This might explain why age and sex had a significant influence on heterogeneity. In certain study cohorts, the lack of patients with osteoporosis may adversely distort the diagnostic performance of the VBQ score. Its diagnostic performance may be better in patient populations in which a large proportion of the population suffers from bone loss (e.g., study groups comprising much older and more female participants). For instance, Courtois et al. (31) examined patients with symptomatic degenerative disc disease, with a mean age of 46.10 years (no patient was older than 66 years), and the proportions of female patients and patients with bone loss were 39.70% and 16.70%, respectively. Courtois et al. found that the AUC of the VBQ score for differentiating between osteoporosis/osteopenia and normal BMD was 0.55, with a sensitivity of 0.54 and a specificity of 0.51. Pu et al. (25) conducted a study of female patients older than 50 years who underwent spinal surgery, and found that the AUC of the VBQ score was 0.81, with a sensitivity of 0.87 and a specificity of 0.61.
In our subgroup analysis of the VBQ measurement region, the heterogeneity of specificity within the cervical subgroup decreased significantly, but no obvious change in the lumbar subgroup was observed. The heterogeneity of sensitivity was not significantly decreased in either subgroup. Razzouk et al. evaluated the associations among cervical, thoracic, and lumbar VBQ scores, and found that the thoracic VBQ score provided surrogate values for the lumbar VBQ score while the cervical VBQ score was distinct from the lumbar VBQ score (42). Therefore, VBQ may vary across different regions of the spine. The diagnostic accuracy of the VBQ score measured in different regions of the spine for the assessment of BMD and the associations among them are still unclear. Further research should be conducted to optimize the calculation of VBQ measures and thereby improve the clinical utility of the VBQ score for diagnosing abnormal BMD.
In addition, the VBQ measurement is based on MRI T1WI, and our subgroup analysis indicated that the magnetic field strength influenced the diagnostic value of the VBQ score, which is consistent with the findings of Lin et al. (29), who found that the 1.5-T-VBQ score was better able to differentiate between osteoporotic and non-osteoporotic patients than the 3.0-T-VBQ score (AUCs =0.74 and 0.70, respectively). Considering the non-negligible difference in diagnostic performance for osteoporosis between the 1.5-T-VBQ and 3.0-T-VBQ scores, it is crucial to pay attention to the magnetic field strength when assessing the VBQ score.
Further, in this meta-analysis, more than half of the subjects were from China, which was also one of the sources of heterogeneity. Some studies have indicated that when diagnosing osteoporosis in elderly East Asian populations, it is necessary to consider their ethnic-specific bone properties, and have proposed that the BMD threshold should be optimized to accommodate these features (43-45). The 12 articles (15 studies) from China, which used existing BMD diagnostic criteria, might have overdiagnosed osteoporosis, which might have affected the diagnostic performance of the VBQ score. Due to the differences in BMD between ethnicities, further research needs to be conducted to establish BMD normative benchmarks for different ethnicities, and to determine the degree to which ethnicity should be incorporated in future VBQ assessments for BMD.
The sensitivity analysis showed that the results of our study were stable and reliable. The Deeks’ funnel plot results indicated that there was a high likelihood of publication bias in this study, for which there are several possible explanations. First, this study included only English- and Chinese-language articles. Second, the small sample sizes of the included studies might have contributed to the publication bias to some extent. Third, some studies might not have reported negative results.
The present study had some limitations. First, while we established rigorous inclusion criteria, publication bias was detected. Second, the majority of the population included in the study was from China, which limits the broad applicability of the findings. Third, all the articles included were retrospective and cross-sectional studies, which might have led to bias. The inclusion of more prospective studies could have helped to determine causality without the need to work backwards to understand outcomes, or identify influencing factors. Finally, bone specimens for histologic analysis could not be obtained. Thus, future studies should be conducted that include histologic analyses to contribute to our understanding of the biological nature of the VBQ score.
Conclusions
This meta-analysis showed that the MRI-based VBQ score has some diagnostic value in the detection of osteoporosis. The VBQ score could serve as a clinically useful tool for opportunistic osteoporosis screening before spine surgery.
Acknowledgments
Funding: None.
Footnote
Reporting Checklist: The authors have completed the PRISMA-DTA reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-532/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-532/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Agrawal AC, Garg AK. Epidemiology of Osteoporosis. Indian J Orthop 2023;57:45-8.
- Salari N, Ghasemi H, Mohammadi L, Behzadi MH, Rabieenia E, Shohaimi S, Mohammadi M. The global prevalence of osteoporosis in the world: a comprehensive systematic review and meta-analysis. J Orthop Surg Res 2021;16:609.
- Yaacobi E, Sanchez D, Maniar H, Horwitz DS. Surgical treatment of osteoporotic fractures: An update on the principles of management. Injury 2017;48:S34-40.
- Papaioannou A, Kennedy CC, Ioannidis G, Sawka A, Hopman WM, Pickard L, Brown JP, Josse RG, Kaiser S, Anastassiades T, Goltzman D, Papadimitropoulos M, Tenenhouse A, Prior JC, Olszynski WP, Adachi JD. CaMos Study Group. The impact of incident fractures on health-related quality of life: 5 years of data from the Canadian Multicentre Osteoporosis Study. Osteoporos Int 2009;20:703-14.
- LeBoff MS, Greenspan SL, Insogna KL, Lewiecki EM, Saag KG, Singer AJ, Siris ES. The clinician's guide to prevention and treatment of osteoporosis. Osteoporos Int 2022;33:2049-102.
- Rand T, Seidl G, Kainberger F, Resch A, Hittmair K, Schneider B, Glüer CC, Imhof H. Impact of spinal degenerative changes on the evaluation of bone mineral density with dual energy X-ray absorptiometry (DXA). Calcif Tissue Int 1997;60:430-3.
- Crivelli M, Chain A, da Silva ITF, Waked AM, Bezerra FF. Association of Visceral and Subcutaneous Fat Mass With Bone Density and Vertebral Fractures in Women With Severe Obesity. J Clin Densitom 2021;24:397-405.
- Izadyar S, Golbarg S, Takavar A, Zakariaee SS. The Effect of the Lumbar Vertebral Malpositioning on Bone Mineral Density Measurements of the Lumbar Spine by Dual-Energy X-Ray Absorptiometry. J Clin Densitom 2016;19:277-81.
- Bandirali M, Di Leo G, Papini GD, Messina C, Sconfienza LM, Ulivieri FM, Sardanelli F. A new diagnostic score to detect osteoporosis in patients undergoing lumbar spine MRI. Eur Radiol 2015;25:2951-9.
- Shayganfar A, Khodayi M, Ebrahimian S, Tabrizi Z. Quantitative diagnosis of osteoporosis using lumbar spine signal intensity in magnetic resonance imaging. Br J Radiol 2019;92:20180774.
- Meunier P, Aaron J, Edouard C, Vignon G. Osteoporosis and the replacement of cell populations of the marrow by adipose tissue. A quantitative study of 84 iliac bone biopsies. Clin Orthop Relat Res 1971;147-54.
- Xie Y, Zhang L, Xiong Q, Gao Y, Ge W, Tang P. Bench-to-bedside strategies for osteoporotic fracture: From osteoimmunology to mechanosensation. Bone Res 2019;7:25.
- Ehresman J, Pennington Z, Schilling A, Lubelski D, Ahmed AK, Cottrill E, Khan M, Sciubba DM. Novel MRI-based score for assessment of bone density in operative spine patients. Spine J 2020;20:556-62.
- Chen A, Feng S, Lai L, Yan C. A meta-analysis of the value of MRI-based VBQ scores for evaluating osteoporosis. Bone Rep 2023;19:101711.
- McInnes MDF, Moher D, Thombs BD, McGrath TA, Bossuyt PM, et al. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement. JAMA 2018;319:388-96.
- Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM. QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529-36.
- Devillé WL, Buntinx F, Bouter LM, Montori VM, de Vet HC, van der Windt DA, Bezemer PD. Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol 2002;2:9.
- Higgins J, Green S. Cochrane handbook for systematic reviews of interventions, version 5.1.0. The Cochrane Collaboration. Available online: https://handbook-5-1.cochrane.org/chapter_9/9_5_2_identifying_and_measuring_heterogeneity.htm. Updated March 2011. Accessed 2 October 2017.
- Salzmann SN, Okano I, Jones C, Zhu J, Lu S, Onyekwere I, Balaji V, Reisener MJ, Chiapparelli E, Shue J, Carrino JA, Girardi FP, Cammisa FP, Sama AA, Hughes AP. Preoperative MRI-based vertebral bone quality (VBQ) score assessment in patients undergoing lumbar spinal fusion. Spine J 2022;22:1301-8.
- Kadri A, Binkley N, Hernando D, Anderson PA. Opportunistic Use of Lumbar Magnetic Resonance Imaging for Osteoporosis Screening. Osteoporos Int 2022;33:861-9.
- Haffer H, Muellner M, Chiapparelli E, Moser M, Dodo Y, Zhu J, Shue J, Sama AA, Cammisa FP, Girardi FP, Hughes AP. Bone quality in patients with osteoporosis undergoing lumbar fusion surgery: analysis of the MRI-based vertebral bone quality score and the bone microstructure derived from microcomputed tomography. Spine J 2022;22:1642-50.
- Huang W, Gong Z, Zheng C, Chen Y, Ma X, Wang H, Jiang J. Preoperative Assessment of Bone Density Using MRI-Based Vertebral Bone Quality Score Modified for Patients Undergoing Cervical Spine Surgery. Global Spine J 2024;14:1238-47.
- Kim AYE, Lyons K, Sarmiento M, Lafage V, Iyer S. MRI-Based Score for Assessment of Bone Mineral Density in Operative Spine Patients. Spine (Phila Pa 1976) 2023;48:107-12.
- Roch PJ, Çelik B, Jäckle K, Reinhold M, Meier MP, Hawellek T, Kowallick JT, Klockner FS, Lehmann W, Weiser L. Combination of vertebral bone quality scores from different magnetic resonance imaging sequences improves prognostic value for the estimation of osteoporosis. Spine J 2023;23:305-11.
- Pu M, Zhong W, Heng H, Yu J, Wu H, Jin Y, Zhang P, Shen Y. Vertebral bone quality score provides preoperative bone density assessment for patients undergoing lumbar spine surgery: a retrospective study. J Neurosurg Spine 2023; Epub ahead of print.
- Chen Z, Lei F, Ye F, Yuan H, Li S, Feng D. MRI-based vertebral bone quality score for the assessment of osteoporosis in patients undergoing surgery for lumbar degenerative diseases. J Orthop Surg Res 2023;18:257.
- Huang W, Gong Z, Wang H, Zheng C, Chen Y, Xia X, Ma X, Jiang J. Use of MRI-based vertebral bone quality score (VBQ) of S1 body in bone mineral density assessment for patients with lumbar degenerative diseases. Eur Spine J 2023;32:1553-60.
- Yin H, Lin W, Xie F, He C, Chen T, Zheng G, Wang Z. MRI-based Vertebral Bone Quality Score for Osteoporosis Screening Based on Different Osteoporotic Diagnostic Criteria Using DXA and QCT. Calcif Tissue Int 2023;113:383-92.
- Lin W, He C, Xie F, Chen T, Zheng G, Yin H, Chen H, Wang Z. Assessment of bone density using the 1.5 T or 3.0 T MRI-based vertebral bone quality score in older patients undergoing spine surgery: does field strength matter? Spine J 2023;23:1172-81.
- Li W, Zhu H, Tian H, Tong T, Hua Z, Zhao X, Shen Y, Wang L. Combinations of two imaging parameters to improve bone mineral density (BMD) assessment in patients with lumbar degenerative diseases. BMC Musculoskelet Disord 2023;24:747.
- Courtois EC, Davidson IU, Ohnmeiss DD, Guyer RD. Evaluating alternatives to dual-energy x-ray absorptiometry for assessing bone quality in patients undergoing spine surgery. J Neurosurg Spine 2024;40:84-91.
- Özmen E, Biçer O, Meriç E, Circi E, Barış A, Yüksel S. Vertebral bone quality score for opportunistic osteoporosis screening: a correlation and optimal threshold analysis. Eur Spine J 2023;32:3906-11.
- Wang Z, Zhang J, Chen Q, Huang Y, Song Y, Liu L, Feng G. Different cervical vertebral bone quality scores for bone mineral density assessment for the patients with cervical degenerative disease undergoing ACCF/ACDF: computed tomography and magnetic resonance imaging-based study. J Orthop Surg Res 2023;18:927.
- Wei Q, Liao P, Fan Y, Guo W, Zhan J, Cai D. Diagnostic value of vertebral CT value combined with lumbar MRI vertebral bone mass score for screening bone mass abnormalities in postmenopausal women. The Journal of Cervicodynia and Lumbodynia 2023;44:994-8.
- Huang W, Gong Z, Li Z, Xia X, Ma X, Lv F, Wang H, Jiang J. Research on the MRI/CT-based preoperative bone quality assessment method for patients with cervical degenerative diseases and validation of its diagnostic efficacy. Chin J Orthop 2023;43:697-704.
- Wang P, Wang J, Shi P, Zhu L, Zhang L, Feng X. A new score system based lumbar MRI image to assess bone mineral density. Journal of Chinese Physician 2023;24:667-71.
- Liu H, Yang H, Zeng Z, Wang L, Yang K, Hu Y, Qu B. Lumbar MRI vertebral bone quality score to evaluate the severity of osteoporosis in postmenopausal women. Chin J Tissue Eng Res 2023;27:606-11.
- Ergen FB, Gulal G, Yildiz AE, Celik A, Karakaya J, Aydingoz U. Fat fraction estimation of the vertebrae in females using the T2*-IDEAL technique in detection of reduced bone mineralization level: comparison with bone mineral densitometry. J Comput Assist Tomogr 2014;38:320-4.
- Agrawal K, Agarwal Y, Chopra RK, Batra A, Chandra R, Thukral BB. Evaluation of MR Spectroscopy and Diffusion-Weighted MRI in Postmenopausal Bone Strength. Cureus 2015;7:e327.
- Chang HK, Hsu TW, Ku J, Ku J, Wu JC, Lirng JF, Hsu SM. Simple parameters of synthetic MRI for assessment of bone density in patients with spinal degenerative disease. J Neurosurg Spine 2022;36:414-21.
- Lin W, He C, Xie F, Chen T, Zheng G, Yin H, Chen H, Wang Z. Discordance in lumbar bone mineral density measurements by quantitative computed tomography and dual-energy X-ray absorptiometry in postmenopausal women: a prospective comparative study. Spine J 2023;23:295-304.
- Razzouk J, Ramos O, Ouro-Rodrigues E, Samayoa C, Wycliffe N, Cheng W, Danisa O. Comparison of cervical, thoracic, and lumbar vertebral bone quality scores for increased utility of bone mineral density screening. Eur Spine J 2023;32:20-6.
- Wáng YXJ, Yu W, Leung JCS, Griffith JF, Xiao BH, Diacinti D, Guermazi A, Chan WP, Blake GM. More evidence to support a lower quantitative computed tomography (QCT) lumbar spine bone mineral density (BMD) cutpoint value for classifying osteoporosis among older East Asian women than for Caucasians. Quant Imaging Med Surg 2024;14:3239-47.
- Wáng YXJ, Griffith JF, Blake GM, Diacinti D, Xiao BH, Yu W, Su Y, Jiang Y, Guglielmi G, Guermazi A, Kwok TCY. Revision of the 1994 World Health Organization T-score definition of osteoporosis for use in older East Asian women and men to reconcile it with their lifetime risk of fragility fracture. Skeletal Radiol 2024;53:609-25.
- Wáng YXJ. The definition of spine bone mineral density (BMD)-classified osteoporosis and the much inflated prevalence of spine osteoporosis in older Chinese women when using the conventional cutpoint T-score of -2.5. Ann Transl Med 2022;10:1421.