Detection of intermediate- and high-risk prostate cancer with biparametric magnetic resonance imaging: a systematic review and meta-analysis
Introduction
Prostate cancer (PCa) is the second most common cancer in men worldwide. In 2020, there were 1.4 million new diagnoses, and the incidence of PCa has been increasing over the past decade due to the advent of prostate-specific antigen (PSA) screening (1). Advanced PCa is life threatening, especially in the metastatic stage. Compared with early-stage PCa, advanced PCa is associated with a significantly reduced survival rate and quality of life (2). Consequently, the timely detection and diagnosis of intermediate- and high-risk prostate cancer (IHPC) is crucial.
Previously, PCa screening and diagnosis were based on rectal examinations, PSA testing, and systematic transrectal ultrasound-guided prostate biopsy (TUPB). However, these approaches are controversial given the high false-negative rate for clinically significant cancer (3) and the overtreatment of clinically insignificant lesions (4), which increases the risks of hemorrhage and infection. With improvements in technology and the progress of modern medicine, patients now have higher expectations for their prognosis. Multiparametric magnetic resonance imaging (mpMRI) has been increasingly used for the detection, staging, and treatment of PCa. Established parameters of mpMRI include T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), dynamic contrast enhancement (DCE), and magnetic resonance spectroscopy, among others (5). mpMRI is currently the most accurate imaging method for the detection, localization, and staging of PCa, as confirmed in many studies (6). However, compared with mpMRI, biparametric MRI (bpMRI) requires fewer human and physical resources, has a shorter scanning time and lower costs, causes less discomfort, and carries none of the risks associated with gadolinium-based contrast agents, such as allergic reactions and long-term gadolinium exposure (7). Thus, the role of DCE has been debated, with some arguing that it is unnecessary in treatment-naïve patients (8).
A previous systematic review found that the performance of bpMRI in the detection of PCa was similar to that of mpMRI (9). In addition, recent studies have shown that the performance of DWI is similar to that of DCE (10). Considering the disadvantages of DCE, increasing numbers of studies suggest that DCE could be omitted (11) and that patients should undergo bpMRI examinations.
Low-risk PCa has less physical and mental impact on patients and a better prognosis, whereas IHPC is life threatening. Therefore, the timely detection of IHPC is critical to appropriate treatment in its early stages. However, to date, no systematic review has evaluated the accuracy of bpMRI for the diagnosis of IHPC. By comprehensively collecting and extracting the relevant data and performing a meta-regression analysis, this study aimed to evaluate bpMRI in the diagnosis of IHPC and to determine the impact of covariates on sensitivity and specificity (12). We present the following article in accordance with the PRISMA-DTA reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-22-1024/rc) (13).
Methods
The protocol of this systematic review was registered with PROSPERO (International Prospective Register of Systematic Reviews; protocol No. CRD42022326981).
Literature search
A systematic search of the PubMed and Web of Science databases was conducted up to March 15, 2022. The search query was as follows: “((prostate cancer OR prostatic cancer OR prostate neoplasm OR prostatic neoplasm OR prostate tumor OR prostatic tumor OR prostate carcinoma OR prostatic carcinoma OR PCa) AND (biparametric OR bp OR T2-weighted image and DWI OR T2-weighted imaging and DWI)) AND (magnetic resonance imaging OR MRI OR MR)”.
Selection criteria
Studies were eligible for inclusion in this review if they met the following criteria: a particular focus on the T2WI-based and DWI-based detection of IHPC, defined as a Gleason score (GS) ≥7; pathological results of prostate biopsy or prostatectomy as the reference standard; inclusion of at least 30 patients with adequate data to calculate diagnostic accuracy; and publication in English. Conversely, studies were excluded if they were review articles or other literature types with nonoriginal data, such as letters and case reports.
Eligible studies were selected as follows. First, 2 radiologists, Y Wang and W Wang, independently evaluated the titles and abstracts of all retrieved articles. Articles meeting the inclusion criteria were retained, whereas the other papers were excluded. The full text of the selected papers was then read by the 2 radiologists, and any papers not meeting the inclusion criteria were excluded. Finally, the same method was used to investigate all the references listed in the selected articles to avoid missing any relevant literature.
Data extraction and quality assessment
A consistent method was used to extract relevant information from articles, including journal name, publication date, name of first author, research design, total number of cases included, MRI machine model, scanning technology, and scoring method. The extracted content also included the unit of analysis (e.g., number of patients or lesions) and the diagnostic gold standard (e.g., biopsy or radical prostatectomy). The statistical results were either obtained directly from the articles or calculated and included true-positive, false-positive, true-negative, and false-negative data. All data were collected by 1 reviewer, Y Wang, and checked by another, N Yi; in the case of any disagreements, the data were discussed by the reviewers to reach a consensus.
Each included study was assessed independently by 2 reviewers, W Wang and L Jiang, using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool (14) and RevMan version 5.4a software (Cochrane). Evaluated items were categorized as “yes”, “no”, or “uncertain”, with “uncertain” indicating that the data in the paper were insufficient for a clear evaluation. If there were any discrepancies in this process, a third experienced observer, L Wang, was consulted to reach a consensus.
Statistical analysis
The data (true-positive, false-positive, true-negative, and false-negative data) were extracted from each included study to build a 2×2 contingency table, from which we calculated the corresponding sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic odds ratio (DOR). Corresponding forest plots and summary receiver operating characteristic (SROC) curves were generated using the bivariate random-effects model and the antilogit transformation of the predictive model parameters; the overall sensitivity, specificity, and area under the curve (AUC) were calculated to evaluate diagnostic efficacy.
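The per-study measures described above follow directly from the four cells of the 2×2 table. The following is a minimal Python sketch of that arithmetic, not the authors' Stata workflow. The class name `TwoByTwo` and the 0.5 continuity correction applied when a cell is zero are assumptions made for illustration; the example counts are those reported for Kuhl (22) in Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one 2x2 table (index test vs. reference standard)."""
    tp: int
    fp: int
    fn: int
    tn: int

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def dor(self, correction: float = 0.5) -> float:
        """Diagnostic odds ratio (TP*TN)/(FP*FN).

        Assumption: if any cell is zero, `correction` is added to all four
        cells so that the ratio stays finite.
        """
        cells = (self.tp, self.fp, self.fn, self.tn)
        if 0 in cells:
            tp, fp, fn, tn = (c + correction for c in cells)
        else:
            tp, fp, fn, tn = cells
        return (tp * tn) / (fp * fn)


# Counts reported for Kuhl (22) in Table 3.
kuhl = TwoByTwo(tp=138, fp=49, fn=9, tn=346)
for name in ("sensitivity", "specificity", "ppv", "npv", "accuracy"):
    print(f"{name}: {100 * getattr(kuhl, name)():.0f}%")
print(f"DOR: {kuhl.dor():.1f}")
```

Rounded to whole percentages, these values agree with the sensitivity, specificity, PPV, NPV, and accuracy listed for that study in Table 3.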
Publication bias was evaluated with Deeks' funnel plot of effective sample size and the related regression test of asymmetry (12). The Higgins I2 statistic was used to evaluate the heterogeneity among studies. When I2 was <50%, a fixed-effect model was used for meta-analysis, whereas when I2 was >50%, a random-effects model was used (15). Subgroup analysis was performed separately for studies that only used the Prostate Imaging Reporting and Data System (PI-RADS) for diagnosis. To evaluate the impact of covariate differences in all studies on the results of the analysis, we conducted a univariate meta-regression analysis. The covariates included reader experience (≥5 vs. <5 years), scoring system (PI-RADS vs. others), scoring cutoff (≥4 vs. ≥3), T2-weighted planes (multiplanar vs. axial), b value (≥1,400 vs. <1,400 s/mm2), field strength (3 vs. 1.5 T), and study design (prospective vs. retrospective). All data were analyzed using Stata version 16 (StataCorp). P<0.05 (two-tailed) was considered statistically significant.
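The I2 rule described above can be illustrated with a short Python sketch. It assumes inverse-variance weights and per-study effect estimates on a common scale (for example, logit-transformed sensitivities); the function names and the handling of the exact 50% boundary, which the text does not specify, are illustrative assumptions and do not reproduce the Stata routine used in the analysis.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(effects, variances):
    """Higgins I2 in percent: share of total variation beyond chance."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(i2, threshold=50.0):
    """Apply the <50% / >50% rule from the text.

    Assumption: a value exactly at the threshold is treated as heterogeneous.
    """
    return "fixed-effect" if i2 < threshold else "random-effects"


# Hypothetical effect estimates and variances, for illustration only.
effects = [0.0, 4.0]
variances = [1.0, 1.0]
i2 = i_squared(effects, variances)
print(f"I2 = {i2:.1f}% -> {choose_model(i2)} model")
```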
Results
Literature selection
A flowchart of the study selection process is shown in Figure 1. A literature search was performed for studies published up to March 15, 2022. In all, 532 articles were retrieved from PubMed and 571 were retrieved from Web of Science. After removal of duplicates, the titles and abstracts were reviewed to identify and exclude any irrelevant studies. This left 87 studies that were potentially relevant. The full text of each of these studies was read, which identified 16 studies (6,174 patients) for inclusion in this meta-analysis (16-31). To ensure that the included populations did not overlap, we did not include multiple articles published by the same authors, and we verified that the patients of the included studies were from different hospitals.
Study characteristics
The characteristics of the included studies are presented in Tables 1-3. Of the 16 studies, 13 were retrospective (17-24,26-28,30,31) and 3 were prospective (16,25,29). Nine studies included consecutive patients. The mean (±SD) age of patients was 66.4±4.6 years, and the mean PSA concentration was 9.3±7.6 ng/mL (Table 1). The sample size ranged from 51 to 1,020 patients, with IHPC prevalence ranging from 10% to 58% (mean 31%±12%).
Table 1
First author (reference) | Publication year | Country | Study design | Data presented by patients* | Data presented by lesions* | Age (years), mean [range] or [IQR] | PSA (ng/mL), mean [range] or [IQR] |
---|---|---|---|---|---|---|---|
Boesen (16) | 2018 | Denmark | Prospective | Yes [655] | No | 68# [62–72] | 9.2# [6.1–19.9] | |
Cuocolo (17) | 2018 | Italy | Retrospective | Yes [61] | No | 65.75 [44–85] | 7.76 [1.5–63] | |
Doo (18) | 2012 | Korea | Retrospective | No | Yes [93] | 63# [50–72] | 11.5 [4.23–43.83] | |
Fascelli (19) | 2015 | USA | Retrospective | Yes [44] | No | 65 [45–85] | 6.6 [0.9–43.3] | |
Han (20) | 2020 | China | Retrospective | Yes [37] | No | 71.2 [55–83] | 7.39 [5.26–9.70] | |
Kim (21) | 2021 | Korea | Retrospective | Yes [72] | No | 68.4 [61–76] | NA | |
Kuhl (22) | 2017 | Germany | Retrospective | Yes [180] | No | 64.8 [42–80] | 8.5 [3.2–67.5] | |
Lee (23) | 2017 | Korea | Retrospective | Yes [21] | No | 64.6 [57–72] | 6.6 [4.9–8.3] | |
Noh (24) | 2020 | Korea | Retrospective | Yes [158] | No | 66.6 [58–76] | 11.3 [4.0–32.8] | |
Obmann (25) | 2018 | USA | Prospective | No | Yes [62] | 61.8 [44–77] | 8.04 [0.45–64.08] | |
Pan (26) | 2021 | China | Retrospective | Yes [223] | No | 69.64 [62–78] | 13.59 [0.73–41.12] | |
Pesapane (27) | 2021 | Italy | Retrospective | Yes [195] | No | 61.5 [49–84] | 12.0 [4.4–90] | |
Thestrup (28) | 2016 | Denmark | Retrospective | Yes [68] | No | 65 [45–75] | 14 [2.2–120] | |
Van der Leest (29) | 2019 | Netherlands | Prospective | Yes [334] | No | 65# [59–68] | 6.4# [5.0–8.6] | |
Wei (30) | 2020 | China | Retrospective | Yes [97] | No | 69 [52–78] | 6.8 [4.97–8.74] | |
Zawaideh (31) | 2020 | UK | Retrospective | Yes [129] | No | 68 [47–84] | 6.15# [3.35–12.72] |
*, the number in square brackets is the number of positive cases; #, median [IQR]. NA, not applicable; PSA, prostate-specific antigen; IQR, interquartile range.
Table 2
First author (reference) | Reference standard | Cutoff scale | Scanner | Field strength (T) | DWI high b value (s/mm2) | No. of readers | Endorectal coil |
---|---|---|---|---|---|---|---|
Boesen (16) | Systematic TRUS bx and MRI/TRUS fusion targeted | 5-point PI-RADS scale | Philips | 3 | 2,000 | 2 | No |
Cuocolo (17) | Systematic and cognitive targeted TRUS bx | Likert | Siemens | 3 | 1,500 | 2 | No |
Doo (18) | RP | Likert | Siemens | 3 | 1,000 | 2 | No |
Fascelli (19) | Systematic bx and MRI/TRUS fusion targeted bx | SPL | Philips | 3 | 750 | NA | Yes |
Han (20) | TRUS bx | 5-point PI-RADS scale | GE | 3 | 1,400 | 2 | No |
Kim (21) | TRUS bx | 5-point PI-RADS scale | Philips | 3 | 1,500 | 1 | NA |
Kuhl (22) | MRI-guided (in-bore or fusion) bx | 5-point PI-RADS scale | Philips | 3 | 1,400 | 1 | No |
Lee (23) | TRUS bx or MRI/TRUS fusion targeted bx or RP | Modified 3-grade scoring system | Philips | 3 | 1,000 | 2 | No |
Noh (24) | MRI/US FTB and FSB | 5-point PI-RADS scale | Siemens | 3 | NA | 3 | NA |
Obmann (25) | Systematic and targeted TRUS bx | 5-point PI-RADS scale | Siemens | 3 | 1,400 | 1 | NA |
Pan (26) | TRUS bx | 5-point PI-RADS scale | Siemens | 1.5 | NA | 2 | NA |
Pesapane (27) | MRI-targeted biopsies or RP | 5-point PI-RADS scale | Siemens | 1.5 | 1,400 | 2 | Yes |
Thestrup (28) | TRUS bx or MRI/TRUS fusion targeted bx or RP | Questionnaire | Philips | 3 | 2,000 | 2 | NA |
van der Leest (29) | MRI-guided biopsy | 5-point PI-RADS scale | Siemens | 3 | 1,400 | 2 | No |
Wei (30) | Systematic TRUS bx | 5-point PI-RADS scale | Philips | 3 | 2,000 | 2 | NA |
Zawaideh (31) | MRI/US fusion targeted bx | Likert | GE | 3 | 2,000 | 4 | NA |
bx, biopsy; DWI, diffusion-weighted imaging; FSB, fusion template systematic biopsy; FTB, fusion transperineal targeted biopsy; MRI, magnetic resonance imaging; NA, not applicable; PI-RADS, Prostate Imaging Reporting and Data System; RP, radical prostatectomy; SPL, screen-positive lesions; TRUS, transrectal ultrasound; US, ultrasound.
Table 3
First author (reference) reading | Reader experience (years) | TP | FP | FN | TN | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Accuracy (%) |
---|---|---|---|---|---|---|---|---|---|---|
Boesen (16) | ||||||||||
Reading 1* | >5 | 396 | 319 | 8 | 297 | 98 | 49 | 55 | 97 | 68 |
Reading 2# | >5 | 379 | 206 | 25 | 410 | 94 | 67 | 65 | 94 | 77 |
Cuocolo (17) | ||||||||||
Reading 1* | 16 | 28 | 21 | 0 | 55 | 100 | 72 | 57 | 100 | 80 |
Reading 2# | 16 | 35 | 15 | 3 | 61 | 92 | 80 | 70 | 95 | 84 |
Doo (18) | ||||||||||
Reading 1 | 11 | 75 | 9 | 18 | 90 | 81 | 92 | 89 | 83 | 86 |
Reading 2 | 4 | 50 | 12 | 43 | 87 | 54 | 88 | 81 | 67 | 71 |
Fascelli (19) | NR | 33 | 19 | 1 | 6 | 97 | 24 | 63 | 86 | 66 |
Han (20) | ||||||||||
Reading 1* | 5 | 34 | 24 | 3 | 62 | 92 | 72 | 59 | 95 | 78 |
Reading 2# | 5 | 30 | 10 | 7 | 76 | 81 | 88 | 75 | 92 | 86 |
Kim (21) | ||||||||||
Reading 1* | 7 | 67 | 69 | 5 | 85 | 93 | 55 | 49 | 94 | 67 |
Reading 2# | 7 | 57 | 42 | 15 | 112 | 79 | 73 | 58 | 88 | 75 |
Kuhl (22) | 9 | 138 | 49 | 9 | 346 | 94 | 88 | 74 | 97 | 89 |
Lee (23) | NR | 7 | 23 | 0 | 38 | 100 | 62 | 23 | 100 | 66 |
Noh (24) | ||||||||||
Reading 1* | NR | 99 | 157 | 3 | 41 | 97 | 21 | 39 | 93 | 47 |
Reading 2# | NR | 87 | 67 | 15 | 131 | 85 | 66 | 56 | 90 | 73 |
Obmann (25) | ||||||||||
Reading 1* | 14 | 59 | 52 | 3 | 70 | 95 | 57 | 53 | 96 | 70 |
Reading 2# | 14 | 48 | 14 | 14 | 108 | 77 | 89 | 77 | 89 | 85 |
Pan (26) | ||||||||||
Reading 1* | NR | 172 | 202 | 3 | 153 | 98 | 43 | 46 | 98 | 61 |
Reading 2# | NR | 148 | 75 | 27 | 280 | 85 | 79 | 66 | 91 | 81 |
Pesapane (27) | ||||||||||
Reading 1 | 5 | 164 | 54 | 31 | 182 | 84 | 77 | 39 | 96 | 78 |
Reading 2 | 3 | 156 | 61 | 39 | 175 | 80 | 74 | 35 | 95 | 75 |
Thestrup (28) | ||||||||||
Reading 1 | NR | 64 | 116 | 4 | 20 | 94 | 15 | 36 | 83 | 41 |
Reading 2 | NR | 65 | 116 | 3 | 20 | 96 | 15 | 36 | 87 | 42 |
van der Leest (29) | 5 | 180 | 137 | 10 | 299 | 95 | 69 | 57 | 97 | 77 |
Wei (30) | ||||||||||
Reading 1* | NR | 44 | 121 | 7 | 192 | 86 | 61 | 27 | 96 | 65 |
Reading 2# | NR | 41 | 55 | 10 | 258 | 80 | 82 | 43 | 96 | 82 |
Zawaideh (31) | ||||||||||
Reading 1* | 10 | 87 | 56 | 6 | 115 | 94 | 67 | 61 | 95 | 77 |
Reading 2# | 10 | 74 | 24 | 19 | 147 | 80 | 86 | 76 | 89 | 84 |
*, MRI cutoff score ≥3; #, MRI cutoff score ≥4. FN, false negative; FP, false positive; NPV, negative predictive value; NR, not reported; PPV, positive predictive value; TN, true negative; TP, true positive; MRI, magnetic resonance imaging.
The MRI machines used in the studies were from different manufacturers: 14 studies used 3.0-T MRI (DWI b=750–2,000 s/mm2) (16-25,28-31), and 2 studies used 1.5-T MRI (DWI b=1,400 s/mm2) (26,27). Only 2 studies mentioned the use of endorectal coils during examinations (19,27), and 7 articles did not report specific coil types (21,24-26,28,30,31). Readers’ experience in the studies ranged from 3 to 25 years, with 6 studies not specifying the radiologists’ experience (19,23,24,26,28,30). With regard to scoring systems, 10 studies (16,20-22,24-27,29,30) used the PI-RADS score (32), 3 studies (17,18,31) used a Likert scale (33), and the remaining 3 studies (19,23,28) used other methods, such as screen-positive lesions or a custom questionnaire.
The pathological methods used as the gold standard for diagnosis also differed across studies. Radical prostatectomy (RP) was the gold standard in only 1 study (18). The pathological results of the other 15 studies (16,17,19-31) were from needle biopsies performed with different techniques, including TUPB, MRI-guided biopsy, and MRI/ultrasound fusion-guided targeted biopsy (Table 2).
Assessment of study quality
The methodological quality of all included studies was evaluated by 2 independent reviewers, W Wang and L Jiang, using the QUADAS-2 tool (14), with the summary results shown in Figure 2. The potential sources of bias included sample size, differences in the ethnicity of the participants, the reliability of pathological techniques, the research design type, and the parameter settings of the MRI machines.
Regarding the patient selection domain, 4 studies (18,19,28,29) had an unclear risk of bias because consecutive or random enrollment was not specified, whereas the remaining 12 studies had a low risk of bias. Risk of bias due to the index test was unclear in 1 study because of insufficient information regarding image interpretation (19), but the risk of bias was low in the remaining 15 studies. Five papers had high risk of bias related to the reference standard because the results were interpreted without blinding (23,26-28,31). Regarding the flow and timing, 2 studies (27,28) had a high risk of bias because they used various reference standards (i.e., biopsies or prostatectomy results), 1 study (19) had an unclear risk of bias because it did not specify the type of biopsy used as a reference standard, and the remaining 13 studies had a low risk of bias.
As for applicability concerns, 1 study exclusively assessed patients who received RP, possibly leading to sampling error with a high proportion of IHPC (18). In addition, 7 studies raised applicability concerns because they did not report MRI acquisition parameters (18-22,27,30).
Diagnostic accuracy
The sensitivity of bpMRI ranged from 54% to 100%, and the specificity ranged from 15% to 92%. Across all studies (n=16), the pooled sensitivity was 0.91 (95% CI: 0.87–0.93), the pooled specificity was 0.67 (95% CI: 0.58–0.76), the positive likelihood ratio (LR) was 2.8 (95% CI: 2.2–3.6), the negative LR was 0.14 (95% CI: 0.11–0.18), and the DOR was 20 (95% CI: 15–27). Because different diagnostic cutoff values lead to variations in sensitivity and specificity among studies, we created separate forest plots for each cutoff. When the cutoff PI-RADS score was ≥3, the sensitivity, specificity, positive LR, negative LR, and DOR of bpMRI for the detection of IHPC were 0.93 (95% CI: 0.90–0.96; I2=93.8%), 0.59 (95% CI: 0.46–0.71; I2=97.5%), 2.3 (95% CI: 1.7–3.1), 0.11 (95% CI: 0.08–0.16), and 21 (95% CI: 13–23), respectively. With a PI-RADS score cutoff of ≥4, the sensitivity, specificity, positive LR, negative LR, and DOR of bpMRI for the detection of IHPC were 0.85 (95% CI: 0.80–0.89; I2=79.9%), 0.79 (95% CI: 0.74–0.84; I2=90.3%), 4.1 (95% CI: 3.3–5.1), 0.19 (95% CI: 0.15–0.25), and 21 (95% CI: 16–28), respectively (34). Forest plots of the sensitivity and specificity of bpMRI for the diagnosis of IHPC using cutoff values of ≥3 and ≥4 are shown in Figure 3.
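The relationship between the pooled sensitivity and specificity and the LRs and DOR reported above can be sketched in a few lines of Python. This is only a point calculation under the standard definitions; the function name is illustrative, and the published LRs and DOR come from the bivariate model, so small rounding differences from this direct calculation are to be expected.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR, DOR) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Pooled estimates across all 16 studies, as reported in the text.
lr_pos, lr_neg, dor = likelihood_ratios(0.91, 0.67)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

The positive LR and DOR obtained this way round to the reported 2.8 and approximately 20, while the negative LR comes out marginally below the reported 0.14, which illustrates the effect of deriving the summary values from the model rather than from the rounded point estimates.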
The SROC and summary cutoff point of the included studies for cutoff values of ≥3 and ≥4 are shown in Figure 4. The AUC with a cutoff of ≥3 (0.90; 95% CI: 0.87–0.92) was slightly higher than the AUC with a cutoff of ≥4 (0.89; 95% CI: 0.86–0.92), but the values indicate that both cutoff values have high diagnostic accuracy.
The estimated LRs were used to simulate 3 clinical decision scenarios with different pretest probabilities of IHPC. The corresponding posttest probabilities were calculated with a Fagan nomogram (Figure 5). When the clinical suspicion of IHPC was low (pretest probability 25%) and bpMRI was negative, the posttest probability of IHPC was 4%, which is a relatively reliable negative prediction (Figure 5A). When the clinical suspicion of IHPC was high (pretest probability 75%) and bpMRI was positive, the posttest probability of IHPC was 89%, indicating a high positive predictive value (Figure 5C). When the readers were uncertain whether IHPC was present (pretest probability 50%), the posttest probability was 73% if bpMRI was positive and 12% if bpMRI was negative (Figure 5B). These results indicate that bpMRI has a high diagnostic accuracy for IHPC.
The Higgins I2 statistic indicated high heterogeneity in sensitivity (I2=90%) and specificity (I2=97%) among all studies. Subgroup analysis was performed for studies that interpreted bpMRI only with PI-RADS. The pooled sensitivity, specificity, positive LR, negative LR, and DOR of these studies for diagnosing IHPC were 0.91 (95% CI: 0.87–0.94), 0.69 (95% CI: 0.60–0.77), 2.9 (95% CI: 2.3–3.7), 0.13 (95% CI: 0.09–0.18), and 23 (95% CI: 17–31), respectively. The SROC had an AUC of 0.89 (95% CI: 0.86–0.92). The results of Deeks' funnel plot and asymmetry test showed that there was no significant publication bias (P=0.52; Figure 6).
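The Fagan nomogram is a graphical form of Bayes' theorem on the odds scale: posttest odds equal pretest odds multiplied by the LR. A minimal Python sketch of that calculation follows, using the rounded pooled LRs reported above (2.8 and 0.14); the function name is illustrative. Because the inputs are rounded, a result may differ from the nomogram reading by about 1 percentage point.

```python
def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


LR_POSITIVE = 2.8   # pooled positive LR reported in the text
LR_NEGATIVE = 0.14  # pooled negative LR reported in the text

scenarios = [
    ("low suspicion, negative bpMRI", 0.25, LR_NEGATIVE),
    ("uncertain, positive bpMRI", 0.50, LR_POSITIVE),
    ("uncertain, negative bpMRI", 0.50, LR_NEGATIVE),
    ("high suspicion, positive bpMRI", 0.75, LR_POSITIVE),
]
for label, pretest, lr in scenarios:
    print(f"{label}: {100 * posttest_probability(pretest, lr):.1f}%")
```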
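For readers unfamiliar with the asymmetry test mentioned above, the quantities behind Deeks' funnel plot can be sketched in Python. Each study contributes the natural log of its DOR, plotted against the inverse square root of its effective sample size (ESS), and the regression is weighted by ESS. This sketch only computes the plotted points and the weighted slope; it does not compute the P value, and the function names and the 0.5 continuity correction for zero cells are assumptions for illustration rather than the Stata implementation used in the analysis.

```python
import math


def deeks_points(studies, correction=0.5):
    """Return (x, y, weight) per study for Deeks' funnel plot.

    x = 1/sqrt(ESS), y = ln(DOR), weight = ESS, where
    ESS = 4 * n_diseased * n_nondiseased / (n_diseased + n_nondiseased).
    Each study is a (TP, FP, FN, TN) tuple.
    """
    points = []
    for tp, fp, fn, tn in studies:
        n_dis, n_non = tp + fn, fp + tn
        ess = 4.0 * n_dis * n_non / (n_dis + n_non)
        if 0 in (tp, fp, fn, tn):
            # Assumption: continuity correction keeps ln(DOR) finite.
            tp, fp, fn, tn = (c + correction for c in (tp, fp, fn, tn))
        ln_dor = math.log((tp * tn) / (fp * fn))
        points.append((1.0 / math.sqrt(ess), ln_dor, ess))
    return points


def weighted_slope(points):
    """Slope of the weighted least-squares line through (x, y, weight) points."""
    total = sum(w for _, _, w in points)
    mean_x = sum(w * x for x, _, w in points) / total
    mean_y = sum(w * y for _, y, w in points) / total
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in points)
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in points)
    return sxy / sxx


# Three readings from Table 3, as (TP, FP, FN, TN), for illustration only.
example = [(138, 49, 9, 346), (180, 137, 10, 299), (33, 19, 1, 6)]
print(f"slope = {weighted_slope(deeks_points(example)):.2f}")
```

A slope close to zero suggests a symmetric funnel; the significance of the slope is what the reported P value summarizes.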
Meta-regression analysis
We speculated that the heterogeneity in diagnostic performance could be attributed to variability among studies with regard to pathology, study design, and MRI parameters. Therefore, we performed a meta-regression analysis stratified by 7 relevant covariates. In these analyses, scoring system (PI-RADS vs. others), score cutoff (≥3 vs. ≥4), T2-weighted planes (multiplanar vs. axial), and study design (prospective vs. retrospective) were significantly associated with sensitivity (all P values <0.05). Reader experience (≥5 vs. <5 years), score cutoff (≥3 vs. ≥4), T2-weighted planes (multiplanar vs. axial), and a high b value (≥1,400 vs. <1,400 s/mm2) resulted in significant heterogeneity in the joint model (all P values <0.05). Table 4 provides details of the meta-regression analysis; Figure 7 shows forest plots of the sensitivity and specificity for the detection of IHPC using bpMRI.
Table 4
Subgroup (covariate) | No. of studies | Sensitivity, value (95% CI) | Sensitivity, P value | Specificity, value (95% CI) | Specificity, P value | Chi-squared (LRT) | P value (joint model) |
---|---|---|---|---|---|---|---|
Reader experience (years) | | | 0.35 | | 0.06 | 174.45 | <0.05 |
≥5 | 16 | 0.91 (0.88–0.94) | | 0.75 (0.69–0.82) | | | |
<5 | 2 | 0.69 (0.46–0.91) | | 0.82 (0.68–0.96) | | | |
Scoring system | | | <0.05 | | 0.71 | 0.81 | 0.67 |
PI-RADS | 18 | 0.91 (0.87–0.95) | | 0.69 (0.58–0.80) | | | |
Other | 10 | 0.91 (0.86–0.97) | | 0.63 (0.47–0.79) | | | |
Scoring cutoff | | | <0.05 | | 0.63 | 53.37 | <0.05 |
≥4 | 9 | 0.85 (0.77–0.93) | | 0.80 (0.71–0.89) | | | |
≥3 | 16 | 0.93 (0.90–0.96) | | 0.66 (0.56–0.75) | | | |
T2-weighted planes | | | <0.05 | | 0.37 | 7.80 | <0.05 |
Multiplanar | 22 | 0.88 (0.85–0.92) | | 0.71 (0.62–0.80) | | | |
Axial | 6 | 0.96 (0.94–0.99) | | 0.50 (0.30–0.71) | | | |
b values (s/mm2) | | | 0.41 | | 0.44 | 69.53 | <0.05 |
≥1,400 | 20 | 0.91 (0.88–0.94) | | 0.69 (0.58–0.79) | | | |
<1,400 | 4 | 0.85 (0.72–0.98) | | 0.72 (0.50–0.94) | | | |
Field strength (T) | | | 0.10 | | 0.47 | 0.31 | 0.86 |
3 | 24 | 0.91 (0.88–0.95) | | 0.67 (0.57–0.76) | | | |
1.5 | 4 | 0.89 (0.80–0.98) | | 0.70 (0.47–0.92) | | | |
Study design | | | <0.05 | | 0.48 | 3.55 | 0.17 |
Retrospective | 23 | 0.90 (0.86–0.94) | | 0.67 (0.57–0.77) | | | |
Prospective | 5 | 0.94 (0.90–0.99) | | 0.68 (0.47–0.89) | | | |
CI, confidence interval; LRT, likelihood ratio test; PI-RADS, Prostate Imaging Reporting and Data System.
Discussion
In this systematic review and meta-analysis, we evaluated the accuracy of bpMRI in the diagnosis of IHPC (16 studies, 6,174 patients) and found that the overall sensitivity and specificity were 0.91 and 0.67, respectively. In subgroup analysis, the overall sensitivity and specificity were 0.91 and 0.69, respectively, for studies based only on the PI-RADS scoring system. Sensitivity and specificity clearly varied among studies with the scoring cutoff used (≥3 or ≥4). A higher cutoff (≥4) generally yields a higher specificity and lower sensitivity than does a lower cutoff (≥3). However, we chose not to exclude any studies on this basis because opinions in this field are not yet consistent, and the purpose of our review was to present this situation. From our perspective, a scoring cutoff of 3 may be more appropriate because it reduces the proportion of missed diagnoses among patients with malignant disease so that they can be treated in a timely manner; this position was consistent with the majority of studies, with approximately two-thirds of the included literature (19/28) using 3 as the cutoff value. However, more robust studies with larger numbers of patients are needed in the future to provide more convincing data.
The accuracy of the individual included studies ranged from 41% to 89%, which is similar to the result recently published by Zhen et al. (35). There was also a considerable degree of heterogeneity in the sensitivity and specificity of the included studies, which is worth further discussion.
First, the emergence of heterogeneity may be related to the number of sequences in the bpMRI protocol. All the included studies used both axial DWI and apparent diffusion coefficient images, but the number of T2WI used differed. Most studies (17-21,23-27,30,31) included at least 2 orientations of T2WI (axial, coronal, or sagittal), but some studies (16,22,28,29) only used transverse T2WI, which might have led to a deviation in the qualitative diagnosis of lesions.
Another reason for the heterogeneity may be the scoring system used in MRI diagnosis. In 10 (16,20-22,24-27,29,30) of the 16 articles in this review, PI-RADS v.2 was used to interpret bpMRI images (32), the Likert scale (33) was used in 3 studies (17,18,31), and special diagnostic scoring criteria were used in another 3 studies (19,23,28). Because the scoring criteria of individual studies were inconsistent, the heterogeneity among studies was further amplified. Although PI-RADS was used to evaluate the possibility of PCa in most studies, the diagnostic thresholds used in these studies were not uniform. Most of the included studies used a threshold of PI-RADS ≥3, but several studies used PI-RADS ≥4. According to PI-RADS v.2 (36), a score of 3 means that the presence of clinically significant cancer is equivocal, whereas a score of 4 means that it is likely. In addition, we showed that the diagnostic sensitivity of PI-RADS ≥3 (93%; 95% CI: 90–96%) was higher than that of PI-RADS ≥4 (85%; 95% CI: 77–93%). Therefore, we suggest that standardized operating schemes, including image acquisition and interpretation protocols, be established for the implementation of bpMRI to reduce heterogeneity.
Heterogeneity was also evident in the reference standard among the included studies. However, it is worth noting that the diagnosis, treatment, and pathology acquisition processes of PCa are different from those of other organs. In the present meta-analysis, only 1 study exclusively included patients who received RP; in the remaining 15 studies, biopsies were used. It is impossible to conduct RP in all patients in whom PCa is suspected. In prostate biopsy, clinicians usually take 12 cores, so there is a risk that the tumor will lie outside the sampled area, which increases the number of false negatives. Although biopsy may localize lesions imprecisely, it has advantages in terms of the number of studies and sample size. Prostate diseases (hyperplasia of the prostate or PCa) tend to have multiple lesions. Therefore, regardless of whether a biopsy or RP is performed, the most reliable comparison is a head-to-head or lesion-to-lesion comparison between MRI and pathology. We believe more comparative studies of this kind should be conducted in the future to increase diagnostic accuracy.
Our study shows that the sensitivity of bpMRI in diagnosing IHPC was higher with a high b value (≥1,400 s/mm2) than with a low b value (<1,400 s/mm2; 0.91 vs. 0.85), and the joint model evaluating b values reached statistical significance (Table 4). There is some evidence suggesting that high (≥1,400 s/mm2) b values are better at detecting IHPC than are standard (<1,400 s/mm2) b values (37). However, with increasing b values, the signal-to-noise ratio of DWI decreases, and the parameters need to be adjusted to improve image quality (38). The best b value for each MRI scanner needs to be set according to the magnetic field strength, quality control, and other factors.
Among the included studies, only 2 were performed based on the lesions (not the patients), representing the current situation of the bpMRI in diagnosing IHPC. Most studies chose to analyze on a patient basis rather than by lesion, and this decision was affected by the research methods. There is thus far no evidence from studies confirming that evaluations by lesions are more exact, and these studies may be needed in the future.
In 2018, Kang et al. (9) compared the performance of bpMRI and mpMRI for diagnosing PCa and reported sensitivity and specificity values of 0.79 and 0.88 for bpMRI, respectively, and 0.79 and 0.89 for mpMRI, respectively. However, by using different selection criteria, we were able to focus on IHPC specifically. Tumors with a GS <7 are considered to be indolent and not clinically significant, and they can be treated conservatively. In contrast, IHPC is more malignant and has a worse prognosis, so more radical treatment methods, including RP, are needed. Therefore, we believe that this paper is important for guiding clinical practice and will help improve urologists’ and radiologists’ understanding of the MRI manifestations of IHPC. In 2019, Woo et al. (39) performed a head-to-head comparison of bpMRI and mpMRI for the diagnosis of clinically significant PCa and found similar sensitivities (0.74 and 0.76, respectively) and specificities (0.90 and 0.89, respectively). However, their study design limited their analysis to studies that directly compared bpMRI and mpMRI (39).
In the present study, we found that bpMRI is a simple and effective imaging protocol. Previous comparisons suggest that its accuracy is similar to that of mpMRI (9,39), and it is faster, cheaper, and safer. bpMRI is therefore worth popularizing and may even replace mpMRI in the future. Although our findings suggest that DCE may be dispensable for this purpose, it still has a role in other areas (e.g., posttreatment imaging). We believe that it is necessary and worthwhile to biopsy lesions with a PI-RADS score ≥3. This study also found that radiologists with ≥5 years of reading experience had a higher diagnostic accuracy. Compared with the previous meta-analysis (8), our slightly lower sensitivity and specificity may be due to the inclusion of an analysis of reader experience.
There are some limitations of our meta-analysis that need to be addressed. First, the included studies were heterogeneous in terms of the methods used, which affected the general applicability of the summary estimates. Second, it was impossible to ensure uniformity in terms of patient selection because both retrospective and prospective studies were included in the meta-analysis. In the future, determining the diagnostic accuracy of bpMRI could be improved by only including prospective randomized controlled trials in the analysis.
Conclusions
This study shows that bpMRI has a high negative predictive value and diagnostic accuracy in the diagnosis of IHPC, which can help urologists identify PCa with a higher risk and worse prognosis so that the most appropriate treatment can be implemented in a timely manner. Meanwhile, standardization of the bpMRI protocol for PCa will help improve its wider applicability and reliability.
Acknowledgments
Funding: None.
Footnote
Reporting Checklist: The authors have completed the PRISMA-DTA reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-22-1024/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-22-1024/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J Clin 2022;72:7-33.
- Hugosson J, Månsson M, Wallström J, Axcrona U, Carlsson SV, Egevad L, Geterud K, Khatami A, Kohestani K, Pihl CG, Socratous A, Stranne J, Godtman RA, Hellström M; GÖTEBORG-2 Trial Investigators. Prostate Cancer Screening with PSA and MRI Followed by Targeted Biopsy Only. N Engl J Med 2022;387:2126-37.
- Ahmed HU, El-Shater Bosaily A, Brown LC, Gabe R, Kaplan R, Parmar MK, Collaco-Moraes Y, Ward K, Hindley RG, Freeman A, Kirkham AP, Oldroyd R, Parker C, Emberton M; PROMIS study group. Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet 2017;389:815-22.
- Abraham NE, Mendhiratta N, Taneja SS. Patterns of repeat prostate biopsy in contemporary clinical practice. J Urol 2015;193:1178-84.
- Asuncion A, Walker PM, Bertaut A, Blanc J, Labarre M, Martin E, Bardet F, Cassin J, Cormier L, Crehange G, Loffroy R, Cochet A. Prediction of prostate cancer recurrence after radiation therapy using multiparametric magnetic resonance imaging and spectroscopy: assessment of prognostic factors on pretreatment imaging. Quant Imaging Med Surg 2022;12:5309-25.
- Getaneh AM, Heijnsdijk EA, de Koning HJ. Cost-effectiveness of multiparametric magnetic resonance imaging and MRI-guided biopsy in a population-based prostate cancer screening setting using a micro-simulation model. Cancer Med 2021;10:4046-53.
- Beomonte Zobel B, Quattrocchi CC, Errante Y, Grasso RF. Gadolinium-based contrast agents: did we miss something in the last 25 years? Radiol Med 2016;121:478-81.
- Alabousi M, Salameh JP, Gusenbauer K, Samoilov L, Jafri A, Yu H, Alabousi A. Biparametric vs multiparametric prostate magnetic resonance imaging for the detection of prostate cancer in treatment-naïve patients: a diagnostic test accuracy systematic review and meta-analysis. BJU Int 2019;124:209-20.
- Kang Z, Min X, Weinreb J, Li Q, Feng Z, Wang L. Abbreviated Biparametric Versus Standard Multiparametric MRI for Diagnosis of Prostate Cancer: A Systematic Review and Meta-Analysis. AJR Am J Roentgenol 2019;212:357-65.
- Thaiss WM, Moser S, Hepp T, Kruck S, Rausch S, Scharpf M, Nikolaou K, Stenzl A, Bedke J, Kaufmann S. Head-to-head comparison of biparametric versus multiparametric MRI of the prostate before robot-assisted transperineal fusion prostate biopsy. World J Urol 2022;40:2431-8.
- Tamada T, Kido A, Yamamoto A, Takeuchi M, Miyaji Y, Moriya T, Sone T. Comparison of Biparametric and Multiparametric MRI for Clinically Significant Prostate Cancer Detection With PI-RADS Version 2.1. J Magn Reson Imaging 2021;53:283-91.
- Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol 2005;58:882-93.
- Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:
- Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529-36.
- Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327:557-60.
- Boesen L, Nørgaard N, Løgager V, Balslev I, Bisbjerg R, Thestrup KC, Winther MD, Jakobsen H, Thomsen HS. Assessment of the Diagnostic Accuracy of Biparametric Magnetic Resonance Imaging for Prostate Cancer in Biopsy-Naive Men: The Biparametric MRI for Detection of Prostate Cancer (BIDOC) Study. JAMA Netw Open 2018;1:e180219.
- Cuocolo R, Stanzione A, Rusconi G, Petretta M, Ponsiglione A, Fusco F, Longo N, Persico F, Cocozza S, Brunetti A, Imbriaco M. PSA-density does not improve bi-parametric prostate MR detection of prostate cancer in a biopsy naïve patient population. Eur J Radiol 2018;104:64-70.
- Doo KW, Sung DJ, Park BJ, Kim MJ, Cho SB, Oh YW, Ko YH, Yang KS. Detectability of low and intermediate or high risk prostate cancer with combined T2-weighted and diffusion-weighted MRI. Eur Radiol 2012;22:1812-9.
- Fascelli M, Rais-Bahrami S, Sankineni S, Brown AM, George AK, Ho R, Frye T, Kilchevsky A, Chelluri R, Abboud S, Siddiqui MM, Merino MJ, Wood BJ, Choyke PL, Pinto PA, Turkbey B. Combined Biparametric Prostate Magnetic Resonance Imaging and Prostate-specific Antigen in the Detection of Prostate Cancer: A Validation Study in a Biopsy-naive Patient Population. Urology 2016;88:125-34.
- Han C, Liu S, Qin XB, Ma S, Zhu LN, Wang XY. MRI combined with PSA density in detecting clinically significant prostate cancer in patients with PSA serum levels of 4~10ng/mL: Biparametric versus multiparametric MRI. Diagn Interv Imaging 2020;101:235-44.
- Kim MJ, Park SY. Biparametric Magnetic Resonance Imaging-Derived Nomogram to Detect Clinically Significant Prostate Cancer by Targeted Biopsy for Index Lesion. J Magn Reson Imaging 2022;55:1226-33.
- Kuhl CK, Bruhn R, Krämer N, Nebelung S, Heidenreich A, Schrading S. Abbreviated Biparametric Prostate MR Imaging in Men with Elevated Prostate-specific Antigen. Radiology 2017;285:493-505.
- Lee DH, Nam JK, Lee SS, Han JY, Lee JW, Chung MK, Park SW. Comparison of Multiparametric and Biparametric MRI in First Round Cognitive Targeted Prostate Biopsy in Patients with PSA Levels under 10 ng/mL. Yonsei Med J 2017;58:994-9.
- Noh TI, Tae JH, Kim HK, Shim JS, Kang SG, Sung DJ, Cheon J, Lee JG, Kang SH. Diagnostic Accuracy and Value of Magnetic Resonance Imaging-Ultrasound Fusion Transperineal Targeted and Template Systematic Prostate Biopsy Based on Bi-parametric Magnetic Resonance Imaging. Cancer Res Treat 2020;52:714-21.
- Obmann VC, Pahwa S, Tabayayong W, Jiang Y, O'Connor G, Dastmalchian S, Lu J, Shah S, Herrmann KA, Paspulati R, MacLennan G, Ponsky L, Abouassaly R, Gulani V. Diagnostic Accuracy of a Rapid Biparametric MRI Protocol for Detection of Histologically Proven Prostate Cancer. Urology 2018;122:133-8.
- Pan JF, Su R, Cao JZ, Zhao ZY, Ren DW, Ye SZ, Huang RD, Tao ZL, Yu CL, Jiang JH, Ma Q. Modified Predictive Model and Nomogram by Incorporating Prebiopsy Biparametric Magnetic Resonance Imaging With Clinical Indicators for Prostate Biopsy Decision Making. Front Oncol 2021;11:740868.
- Pesapane F, Acquasanta M, Meo RD, Agazzi GM, Tantrige P, Codari M, Schiaffino S, Patella F, Esseridou A, Sardanelli F. Comparison of Sensitivity and Specificity of Biparametric versus Multiparametric Prostate MRI in the Detection of Prostate Cancer in 431 Men with Elevated Prostate-Specific Antigen Levels. Diagnostics (Basel) 2021.
- Thestrup KC, Logager V, Baslev I, Møller JM, Hansen RH, Thomsen HS. Biparametric versus multiparametric MRI in the diagnosis of prostate cancer. Acta Radiol Open 2016;5:2058460116663046.
- van der Leest M, Israël B, Cornel EB, Zámecnik P, Schoots IG, van der Lelij H, Padhani AR, Rovers M, van Oort I, Sedelaar M, Hulsbergen-van de Kaa C, Hannink G, Veltman J, Barentsz J. High Diagnostic Performance of Short Magnetic Resonance Imaging Protocols for Prostate Cancer Detection in Biopsy-naïve Men: The Next Step in Magnetic Resonance Imaging Accessibility. Eur Urol 2019;76:574-81.
- Wei CG, Chen T, Zhang YY, Pan P, Dai GC, Yu HC, Yang S, Jiang Z, Tu J, Lu ZH, Shen JK, Zhao WL. Biparametric prostate MRI and clinical indicators predict clinically significant prostate cancer in men with "gray zone" PSA levels. Eur J Radiol 2020;127:108977.
- Zawaideh JP, Sala E, Shaida N, Koo B, Warren AY, Carmisciano L, Saeb-Parsy K, Gnanapragasam VJ, Kastner C, Barrett T. Diagnostic accuracy of biparametric versus multiparametric prostate MRI: assessment of contrast benefit in clinical practice. Eur Radiol 2020;30:4039-49.
- Weinreb JC, Barentsz JO, Choyke PL, Cornud F, Haider MA, Macura KJ, Margolis D, Schnall MD, Shtern F, Tempany CM, Thoeny HC, Verma S. PI-RADS Prostate Imaging - Reporting and Data System: 2015, Version 2. Eur Urol 2016;69:16-40.
- Dickinson L, Ahmed HU, Allen C, Barentsz JO, Carey B, Futterer JJ, Heijmink SW, Hoskin P, Kirkham AP, Padhani AR, Persad R, Puech P, Punwani S, Sohaib A, Tombal B, Villers A, Emberton M. Scoring systems used for the interpretation and reporting of multiparametric MRI for prostate cancer detection, localization, and characterization: could standardization lead to improved utilization of imaging within the diagnostic pathway? J Magn Reson Imaging 2013;37:48-58.
- Cronin P, Kelly AM, Altaee D, Foerster B, Petrou M, Dwamena BA. How to Perform a Systematic Review and Meta-analysis of Diagnostic Imaging Studies. Acad Radiol 2018;25:573-93.
- Zhen L, Liu X, Yegang C, Yongjiao Y, Yawei X, Jiaqi K, Xianhao W, Yuxuan S, Rui H, Wei Z, Ningjing O. Accuracy of multiparametric magnetic resonance imaging for diagnosing prostate Cancer: a systematic review and meta-analysis. BMC Cancer 2019;19:1244.
- Ding Z, Song D, Wu H, Tian H, Ye X, Liang W, Xu J, Dong F. Development and validation of a nomogram based on multiparametric magnetic resonance imaging and elastography-derived data for the stratification of patients with prostate cancer. Quant Imaging Med Surg 2021;11:3252-62.
- Ueno Y, Kitajima K, Sugimura K, Kawakami F, Miyake H, Obara M, Takahashi S. Ultra-high b-value diffusion-weighted MRI for the detection of prostate cancer with 3-T MRI. J Magn Reson Imaging 2013;38:154-60.
- Xi Y, Liu A, Olumba F, Lawson P, Costa DN, Yuan Q, Khatri G, Yokoo T, Pedrosa I, Lenkinski RE. Low-to-high b value DWI ratio approaches in multiparametric MRI of the prostate: feasibility, optimal combination of b values, and comparison with ADC maps for the visual presentation of prostate cancer. Quant Imaging Med Surg 2018;8:557-67.
- Woo S, Suh CH, Kim SY, Cho JY, Kim SH, Moon MH. Head-to-Head Comparison Between Biparametric and Multiparametric MRI for the Diagnosis of Prostate Cancer: A Systematic Review and Meta-Analysis. AJR Am J Roentgenol 2018;211:W226-41.