A classifier-combined method for grading breast cancer based on Dempster-Shafer evidence theory
Introduction
Breast cancer is a serious, life-threatening malignancy in women which accounts for 15.4% of all cancer-related deaths (1,2). Due to the underlying tumor heterogeneity, breast cancers of the same clinical stage can have completely different treatment responses and prognoses. Therefore, in this new era of precision medicine, there is a pressing clinical need for a noninvasive method that enables this tumor heterogeneity to be quantified so that individualized treatment can be tailored, and the treatment response can be promptly evaluated.
Among all the sequences in a routine breast magnetic resonance imaging (MRI) protocol, dynamic contrast-enhanced MRI (DCE-MRI) is the backbone. Not only is it the most sensitive method for detecting breast cancer (3) but, more importantly, it can also reflect the tumor hemodynamic heterogeneity, including micro-vessel density and vascular permeability differences (4).
In the clinical setting, the histologic grade for breast cancer is determined by assessing the glandular and tubular differentiation, nuclear pleomorphism, and mitotic activity, which are hallmark characteristics of tumor aggressiveness. In general, high-grade tumors have poorer differentiation and more dismal prognoses than do low-grade tumors (5). The preoperative histological grading of breast cancer is critical for clinical decision-making, including breast cancer surgery and subsequent treatment planning. Therefore, the ability to noninvasively predict the histological grade of breast cancer is critical.
Texture features acquired from breast DCE-MRI images can reflect the microvascular density and spatial permeability heterogeneity within a region of interest (ROI), which is useful for identifying different types of breast lesions (6). Also, pharmacokinetic parameters obtained by fitting the temporal signal intensity profile of T1-weighted images using a modified Tofts model (such as Ktrans) can reflect the hemodynamic heterogeneity of tumor tissue (7,8). Therefore, both DCE-MRI-based texture features and pharmacokinetic parameters have potential for the noninvasive preoperative histological grading of breast cancers.
With the widespread application of artificial intelligence in the field of medical image processing seen in recent years, texture feature analysis based on machine learning has achieved groundbreaking results in tumor classification tasks, such as outcome prediction (9), neoadjuvant chemotherapy response prediction (10), and molecular subtype classification (11). However, most of the studies to date have used only spatial or temporal features to achieve the best classification effect (9,10). Pharmacokinetic parameters obtained from DCE-MRI can be used to explore permeability and perfusion changes within a tumor, and thus could potentially be integrated into a conventional texture-based machine learning model to further improve the classification of tumors (11).
Dempster-Shafer (D-S) evidence theory is an inference method that can handle uncertain information (12). Different classifiers have different data processing abilities, and thus, varying levels of fault tolerance and applicability. Using D-S evidence theory, the classification effects of multiple classifiers can be combined to obtain more accurate classification results. In the fields of urban planning and natural image classification, D-S evidence theory was found to substantially improve the classification results of a model based on a single classifier by combining multiple classifiers [support vector machine (SVM) and k-nearest neighbor (KNN)], with a 20% to 30% increase in accuracy (13-15). Therefore, this combination could be expected to improve classification accuracy; however, its potential in the medical field has yet to be explored.
We hypothesized that breast cancers of different pathological grades show spatial and hemodynamic heterogeneity on DCE-MRI, and that this spatial and hemodynamic heterogeneity can be quantified using texture features and hemodynamic parameters. Therefore, in this study, we investigated and evaluated the performance of a machine learning model based on the combination of pharmacokinetic parameters and texture features derived from DCE-MRI, using D-S evidence theory, in predicting the preoperative histologic grade of breast cancer. We present the following article in accordance with the TRIPOD reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-22-652/rc).
Methods
Study participants
This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and was approved by the ethics committee of Cancer Hospital Chinese Academy of Medical Sciences, Shenzhen Hospital. The requirement for individual consent was waived for this retrospective analysis.
Thirty-three female patients with a histologically confirmed diagnosis of breast cancer who underwent baseline DCE-MRI between December 2019 and July 2020 were retrospectively included. The inclusion criteria were as follows: (I) the histology of each breast lesion was confirmed as invasive breast carcinoma of no special type by biopsy or surgical specimen; (II) each breast cancer lesion had a histologic grade of I–III, and the maximal diameter of the primary breast lesion exceeded 1 cm; (III) conventional MRI and DCE-MRI were performed within the 2 weeks before treatment; and (IV) in one DCE-MRI scan, ten phases had been acquired. The DCE-MRI images were reviewed by two board-certified radiologists (MW and YR). Images of poor quality, including those with overt signal loss, motion artifacts, or geometric distortion, were excluded from the analysis. The clinical characteristics of the enrolled patients are summarized in Table 1.
Table 1
IBC-NST (n = 33) | Data |
---|---|
Age (years) | 50.6±8.3 |
Histologic subtype | |
IBC-NST | 19 |
Mixed type (IBC-NST + DCIS) | 14 |
Size in the largest diameter (≤2 cm/2–5 cm/>5 cm) | 22/11/0 |
TNM stage | |
T stage (T1/T2/T3/T4) | 13/18/1/1 |
N stage (N0/N1/N2/N3) | 16/5/7/5 |
M stage (M0/M1) | 33/0 |
Histologic grade | |
Grade I/II/III | 13/12/8 |
Proliferation protein (ki-67 index) | |
≤20% | 18 |
>20% | 15 |
Metastatic status of ALNs (positive/negative) | 18/15 |
ER (positive/negative) | 25/8 |
PR (positive/negative) | 23/10 |
HER-2 (positive/negative) | 17/16 |
Molecular subtypes | |
Luminal A | 2 |
Luminal B | 22 |
HER-2 enriched | 9 |
Triple negative | 0 |
Perineural invasion | |
Positive/negative | 7/17 |
Not available | 9 |
Vascular invasion | |
Positive/negative | 15/9 |
Not available | 9 |
The data in the table are presented as mean ± standard deviation or frequency. IBC-NST, invasive breast carcinoma of no special type; DCIS, ductal carcinoma in situ; TNM, tumor, node, metastasis; ALN, axillary lymph node; ER, estrogen receptor; PR, progesterone receptor; HER-2, human epidermal growth factor receptor 2.
MRI protocols
The MRI examinations were performed on a 3T MR scanner (Discovery MR 750w, General Electric Healthcare, Waukesha, WI, USA) using an eight-channel phased-array breast coil. All the participants underwent DCE-MRI using a three-dimensional T1-weighted fast spoiled gradient-echo sequence with the imaging parameters set as follows: repetition time/echo time = 4.5/2.1 ms, field of view = 360 mm × 360 mm, image matrix = 320 × 320, slice thickness = 1.4 mm with no gap, and flip angle = 12°. A total of 10 phases (35–55 s per phase), with one pre-contrast and nine post-contrast dynamic phases, were obtained. For dynamic MRI, gadoterate meglumine (Gd-DOTA, Dotarem, Guerbet, Roissy CdG Cedex, France) was injected into the antecubital vein at a rate of 2 mL/s (total dose, 0.1 mmol/kg of body weight) using a power injector (Ulrich, Germany) via a 20-gauge needle, followed by a 20 mL saline flush.
Image analysis
All the MRI images were transferred to an in-house MATLAB platform developed by X.W. for analysis. A total of 489 slices with breast cancer lesions (171 grade I, 140 grade II, and 178 grade III) were used for analysis. All the lesions were initially segmented manually on the third phase of the DCE-MR images by a radiologist (M.W. or Y.R., each with 5 years of experience in breast imaging) and then reviewed by a senior radiologist (J.W., with more than 10 years of experience in breast imaging). All three radiologists were blinded to the histology of the lesions. Any discrepancies were resolved through discussion. For each slice, 78 original texture features were extracted using the first-order histogram (FH; 18 features), gray-level co-occurrence matrix (GLCM; 23 features), neighborhood gray-tone difference matrix (NGTDM; 5 features), gray-level run-length matrix (GLRLM; 13 features), gray-level size zone matrix (GLSZM; 13 features), and 6 shape features. The data processing flowchart is shown in Figure 1.
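As an illustration of one of the texture families listed above, the following minimal sketch builds a gray-level co-occurrence matrix for a small quantized patch and derives two classic GLCM statistics (contrast and energy). The patch values, the horizontal one-pixel offset, the number of gray levels, and all function names are assumptions made for illustration; the settings of the in-house MATLAB platform are not described in the text.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1  # pair (a, b)
                counts[b][a] += 1  # mirrored pair, making the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weights each co-occurrence probability by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared probabilities; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 patch already quantized to 4 gray levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
P = glcm(patch, levels=4)
print(round(contrast(P), 4), round(energy(P), 4))
```

In practice, such statistics would be computed inside the segmented ROI and for several offsets; this sketch only shows the mechanics for one offset.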
A total of 390 (78×5) texture features, including the original features and the features after two-dimensional discrete wavelet transform in four directions [components of the approximation (CA), components of the diagonal detail (CD), components of the horizontal detail (CH), components of the vertical detail (CV)], were finally extracted for each slice. Five pharmacokinetic parameters, including the volume transfer constant of contrast agent leaked into the extravascular extracellular space from the plasma (Ktrans), the rate constant of contrast agent reflux to the plasma (Kep), the fractional extravascular extracellular space volume (Ve), the fractional plasma volume (Vp), and the area under the time-intensity curve (AUC), were calculated using the modified Tofts model (7). The time-transformed contrast concentration profile was calculated first using the temporal signal intensity profile (see Eq. [1]), after which the kinetic parameters (Ktrans and Ve) were calculated by fitting the temporal concentration profile using a modified two-compartment kinetic model (see Eq. [2]).
where TR is the repetition time (msec), T10 is the so-called native relaxation time, is the tissue contrast agent concentration, r1 (mM−1s−1) is the longitudinal contrast agent relaxation coefficient, and α is the flip angle.
where kel is the excretion rate and represents the loss of contrast agent in the system, denotes the contrast agent concentration in the vascular space at time t=0, and t0 is the time before the injection of the contrast agent.
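For orientation, the modified Tofts model cited above is commonly written as Ct(t) = Vp·Cp(t) + Ktrans·∫ Cp(τ)·exp(−Kep·(t − τ)) dτ, with Kep = Ktrans/Ve. The sketch below evaluates this forward model numerically with a trapezoidal rule. It is only an illustration under stated assumptions: the plasma input curve, the 45-second sampling interval, and the parameter values are hypothetical, and the notation may differ from that of Eq. [2].

```python
import math


def modified_tofts(times, cp, ktrans, ve, vp):
    """Tissue concentration at each sample time (times in minutes).

    cp is the plasma concentration sampled at the same times;
    ktrans is in 1/min, ve and vp are volume fractions.
    """
    kep = ktrans / ve
    ct = []
    for i, t in enumerate(times):
        integral = 0.0
        # Trapezoidal approximation of the convolution integral up to time t.
        for k in range(1, i + 1):
            dt = times[k] - times[k - 1]
            f_prev = cp[k - 1] * math.exp(-kep * (t - times[k - 1]))
            f_curr = cp[k] * math.exp(-kep * (t - times[k]))
            integral += 0.5 * (f_prev + f_curr) * dt
        ct.append(vp * cp[i] + ktrans * integral)
    return ct


# Hypothetical example: 10 phases, 45 s apart, first phase pre-contrast.
times = [0.75 * i for i in range(10)]
cp = [0.0 if t == 0 else 5.0 * math.exp(-0.5 * t) for t in times]
ct = modified_tofts(times, cp, ktrans=0.25, ve=0.3, vp=0.05)
```

Fitting would then adjust Ktrans, Ve, and Vp until the modeled curve matches the measured concentration curve, for example by nonlinear least squares.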
The data set was first divided into a training set and a test set at a ratio of 7:3. Next, to reduce the dimensionality, principal component analysis (PCA) was performed on all pharmacokinetic parameters and texture features. The reduced-dimension features were then used to train three different types of classifiers (random forest, SVM, and KNN). Two-fold cross-validation was employed in the training phase.
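The 7:3 division can be sketched as follows. This is a minimal illustration that assumes a slice-level split stratified by grade; the text does not state whether the split was stratified or performed per slice or per patient, and the function name and random seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices), keeping the class mix in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        cut = round(train_fraction * len(idxs))
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)


# Slice counts per grade as reported in the text: 171 / 140 / 178.
labels = [1] * 171 + [2] * 140 + [3] * 178
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Dimensionality reduction and classifier training would then be fitted on the training indices only and applied unchanged to the test indices.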
Study method
In D-S evidence theory, all possible answers to a question are placed in the frame of discernment (identification framework). The answers within the frame are mutually exclusive, and the question has exactly one correct answer. Each answer can be regarded as a proposition, and each proposition is assigned a degree of confidence called the basic probability assignment (BPA, also known as the mass function m). The value m(A) reflects the degree of belief committed to proposition A.
For the same problem, different evidence sources will yield different BPA values, and the BPA values of different evidence sources can be processed using the orthogonal sum to obtain a new basic probability distribution function (16). For the ith classifier, first, a sample was selected for training, the best recognition rate of the classifier was found, and then the probability distribution function was designed according to the distribution of the sample category. The BPA estimation formula is as follows:
where is the number of test samples that actually belong to the tth class when they are judged to be the jth class by the ith classifier, and α is the uncertainty factor.
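One common way to turn such counts into a BPA is sketched below. It is an assumed construction, not necessarily the estimation formula used here: for each predicted class j, the counts over the true classes are normalized, scaled by (1 − α), and the remaining mass α is assigned to the whole frame (labelled "theta" in the code). The confusion counts and the value of α are hypothetical.

```python
def bpa_from_confusion(confusion, alpha=0.1):
    """Build one BPA per predicted class.

    confusion[j][t] = number of samples judged to be class j that truly belong to class t.
    Returns a list where entry j maps each true class t to its mass,
    plus the key "theta" holding the uncertainty mass.
    """
    table = []
    for row in confusion:
        total = sum(row)
        if total == 0:
            # No evidence for this predicted class: everything stays uncertain.
            masses = {t: 0.0 for t in range(len(row))}
            masses["theta"] = 1.0
        else:
            masses = {t: (1.0 - alpha) * n / total for t, n in enumerate(row)}
            masses["theta"] = alpha
        table.append(masses)
    return table


# Hypothetical counts for one classifier (rows: predicted grade, columns: true grade).
confusion = [
    [40, 5, 1],
    [3, 30, 6],
    [0, 4, 45],
]
bpa = bpa_from_confusion(confusion, alpha=0.1)
```

When the classifier predicts grade j for a new slice, row j of this table would serve as that classifier's evidence in the combination step.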
The D-S evidence theory synthesis rule can be interpreted as follows:
where ; and m1(l), m2(j), and m3(h) represent the basic probability values that the classification results of the first, second, and third classifiers are pathology grade l, grade j, and grade h, respectively.
The BPAs of the three classifiers, namely m1, m2, and m3, were calculated. Each classifier distinguishes three grades, so its correct and incorrect predictions can be represented by a 3×3 matrix whose elements are the BPA values for that classifier. The BPA matrices obtained for the three classifiers are denoted by M1, M2, and M3, respectively. M1, M2, and M3 are input into Eq. [4] to obtain the combined result G. Denoting the combined BPA value by G(k), the decision rule is as follows:
which satisfies , where j is the decision-making category. According to the recognition rate of SVM, Random Forest, and KNN, this paper considered the distribution of training samples to construct the BPA and conducted evidence synthesis and decision-making based on the synthesis rules of D-S evidence theory. Receiver operating characteristic curve analysis was performed for each model built. The overall algorithm flowchart is shown in Figure 2.
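The synthesis and decision steps described above can be sketched with the standard form of Dempster's rule, in which the products of masses from agreeing hypotheses are summed and then divided by one minus the total conflicting mass. The sketch assumes that each classifier assigns mass only to single grades and to the whole frame (labelled THETA), and that the decision is the grade with the largest combined mass. The three input BPAs are hypothetical values, not results from this study.

```python
from functools import reduce

THETA = "theta"  # assumed label for the whole frame of discernment (uncertainty)


def combine(m1, m2):
    """Dempster's rule for BPAs whose focal elements are single grades or THETA."""
    fused, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            if a == THETA:
                target = b          # THETA intersected with b is b
            elif b == THETA or a == b:
                target = a          # a intersected with THETA (or with itself) is a
            else:
                conflict += pa * pb  # two different grades: empty intersection
                continue
            fused[target] = fused.get(target, 0.0) + pa * pb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict and cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


def decide(fused):
    """Pick the grade with the largest combined mass, ignoring THETA."""
    grades = {k: v for k, v in fused.items() if k != THETA}
    return max(grades, key=grades.get)


# Hypothetical BPAs from three classifiers for one slice.
m_svm = {1: 0.6, 2: 0.2, 3: 0.1, THETA: 0.1}
m_rf = {1: 0.5, 2: 0.3, 3: 0.1, THETA: 0.1}
m_knn = {1: 0.2, 2: 0.6, 3: 0.1, THETA: 0.1}

fused = reduce(combine, [m_svm, m_rf, m_knn])
grade = decide(fused)
print(grade, {k: round(v, 3) for k, v in fused.items()})
```

Because Dempster's rule is commutative and associative, the order in which the three classifiers are combined does not change the fused result.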
Results
Clinical characteristics of the participants
A total of 489 slices with breast cancer lesions (171 grade I, 140 grade II, and 178 grade III) were used for analysis. Figure 3 shows representative MRI images of the first, third, fifth, seventh, and ninth phases acquired in the corresponding acquisition times for grade I–III breast cancer.
Comparison between the single classifier results and the classification results based on D-S evidence theory
After dimensionality reduction of the 390 texture features extracted from each slice, 113 new features were obtained. The data were divided into training and testing sets at a ratio of 7:3 and then passed to three different classifiers (SVM, random forest, and KNN) for training and testing. The performance of the machine learning techniques was evaluated in terms of accuracy, sensitivity, and specificity (Table 2).
Table 2
Histologic grade | SVM specificity | SVM sensitivity | Random Forest specificity | Random Forest sensitivity | KNN specificity | KNN sensitivity |
---|---|---|---|---|---|---|
Grade I | 0.716 | 0.990 | 0.761 | 0.836 | 0.894 | 0.967 |
Grade II | 1.000 | 0.629 | 0.816 | 0.738 | 0.791 | 0.810 |
Grade III | 0.934 | 0.798 | 0.804 | 0.774 | 0.936 | 0.830 |
Accuracy | 0.828 (SVM) | | 0.789 (Random Forest) | | 0.878 (KNN) | |
SVM, support vector machine; KNN, K-nearest neighbor.
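The per-grade indices reported in these tables follow from a multi-class confusion matrix by treating each grade in turn as the positive class. The sketch below shows that calculation; the confusion counts are hypothetical and are not the study's data.

```python
def per_class_metrics(confusion):
    """confusion[t][p] = number of samples with true class t predicted as class p."""
    n = len(confusion)
    total = sum(map(sum, confusion))
    metrics = {}
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # class k samples missed
        fp = sum(confusion[t][k] for t in range(n)) - tp     # other classes called k
        tn = total - tp - fn - fp
        metrics[k] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    return metrics, accuracy


# Hypothetical test-set confusion matrix for grades I, II, and III.
confusion = [
    [50, 2, 0],
    [5, 30, 7],
    [1, 4, 48],
]
metrics, accuracy = per_class_metrics(confusion)
```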
Among the single classifiers (Table 2), KNN had the highest accuracy. However, KNN did not always have the best specificity and sensitivity for each pathological grade. For example, for pathology grade II, the specificity of the SVM classifier was 1.000, which was much higher than that of KNN (0.791) or Random Forest (0.816); however, for pathology grade I, the specificity of the SVM classifier (0.716) was lower than that of KNN (0.894) and Random Forest (0.761). These observations show that the different classifiers have different advantages in classifying slices of different pathological grades. Table 3 shows the classification evaluation indices of the D-S evidence theory combined with multiple classifiers. The accuracy of the D-S evidence theory-based method was 0.929; its specificity was 0.896 for grade I, 0.976 for grade II, and 0.814 for grade III; and its sensitivity was 0.972 for grade I, 0.796 for grade II, and 0.833 for grade III.
Table 3
Evaluation indicator | Grade I | Grade II | Grade III |
---|---|---|---|
Specificity | 0.896 | 0.976 | 0.814 |
Sensitivity | 0.972 | 0.796 | 0.833 |
Accuracy | 0.929 |
D-S, Dempster-Shafer.
According to the experimental results, the receiver operating characteristic curve for each grade was drawn, and the average area under the curve (AUC) was calculated (Figure 4). The average AUC of the D-S-based method reached 0.896, which was higher than that of the methods using a single classifier (SVM: 0.829, Random Forest: 0.727, and KNN: 0.835).
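An average AUC over grades can be computed in several ways; the sketch below assumes a one-vs-rest macro average, which the text does not confirm. Each per-grade AUC is obtained as the probability that a slice of that grade receives a higher score than a slice of another grade, with ties counted as one half. The scores and labels are hypothetical.

```python
def binary_auc(scores, positives):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, positives) if y]
    neg = [s for s, y in zip(scores, positives) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(score_rows, labels, classes):
    """Average of one-vs-rest AUCs; score_rows[i][c] is the score of class c for sample i."""
    aucs = [
        binary_auc([row[c] for row in score_rows], [y == c for y in labels])
        for c in classes
    ]
    return sum(aucs) / len(aucs)


# Hypothetical per-grade scores for six slices.
labels = [1, 1, 2, 2, 3, 3]
score_rows = [
    {1: 0.8, 2: 0.1, 3: 0.1},
    {1: 0.5, 2: 0.4, 3: 0.1},
    {1: 0.3, 2: 0.6, 3: 0.1},
    {1: 0.4, 2: 0.3, 3: 0.3},
    {1: 0.1, 2: 0.2, 3: 0.7},
    {1: 0.2, 2: 0.3, 3: 0.5},
]
average_auc = macro_auc(score_rows, labels, classes=[1, 2, 3])
```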
Discussion
In this study, we investigated the performance of a combination of multiple classifiers based on D-S evidence theory in classifying the histologic grade of breast cancer and compared it with the performances of three classical single classifiers. Our proposed method outperformed the single classical classifiers, which indicates that D-S evidence theory can effectively combine the effects of multiple classifiers to obtain higher classification accuracy.
The three classifiers included in this study performed differently across categories, and each had its own advantages for particular histologic grades. Using D-S evidence theory to combine multiple classifiers compensated for the blind spots of the single classifiers at certain pathology grades by drawing on the recognition advantages of each classifier for those grades, thereby achieving a balanced recognition of all pathology grades (17,18). According to the training results and the sample size, the BPA values of the three classifiers were calculated, the evidence was combined, and the decision was made from the combined evidence. Because the KNN method determines the class of a sample mainly from a limited number of neighboring samples, rather than by discriminating class domains, it is more advantageous for classifying sample sets in which the class domains intersect or overlap. Moreover, KNN involves no assumptions about the data, is highly accurate, and is insensitive to outliers (19). These factors might explain our observation that KNN had the best overall performance among the three classifiers.
Regarding the SVM classifier, K binary SVM classifiers need to be trained for sample data with K categories. For the ith binary classifier, the samples of category i are marked as the positive class, while the samples not belonging to category i are marked as the negative class. Therefore, for each pathological grade, the negative-class samples far outnumber the positive-class samples. Consequently, class imbalance appears, and this situation tends to become more serious as the amount of training data increases. For example, in histologic grade II recognition, the specificity of the SVM classifier reached 1.000, but its sensitivity was only 0.629.
As previously reported, using D-S evidence theory combined with multiple classifiers can achieve a balanced recognition of classification tasks. Consistently, we observed that the accuracy and average AUC of the fusion strategy proposed in this paper were higher than those of single classifiers for breast histologic grade classification, indicating that our fusion strategy can integrate the advantages of different classifiers to improve the classification performance. Therefore, the results of this study demonstrate that based on D-S evidence theory, the knowledge of multiple classifiers can be fused at the same time using the Dempster synthesis rules, and the prediction results of multiple predictors (SVM, KNN, and Random Forest) can be fused to improve the classification accuracy.
This study had a number of limitations. First, the sample size in this study was relatively limited. However, our study mainly focused on verifying the proof of concept of using D-S evidence theory to merge and take advantage of multiple classical classifiers. To augment our dataset, two-dimensional slices were used instead of three-dimensional volume to extract texture features and pharmacokinetic parameters. Second, the PCA method was applied in this study to solve the “curse of dimensionality” problem. However, PCA is a linear reduction method, which might have resulted in the loss of some valuable information. In the future, the appropriateness of PCA should be evaluated and more feature selection methods should be explored. Third, this paper applied the D-S formula to the SVM, Random Forest, and KNN classifiers without analyzing the independence between the different classifier-based models; this aspect should be further investigated in the future. Fourth, this paper did not apply bootstrapping techniques to obtain confidence intervals of the classification performance. Finally, we arbitrarily chose KNN, SVM, and Random Forest, as three of the most commonly used classifiers. Many other commonly used classifiers, such as naïve Bayes, logistic regression, and decision tree classifiers, were not explored in our study. However, the primary aim of this work was to explore the potential benefit of D-S evidence theory in taking advantage of multiple classifiers, not to explore the performance of specific classifiers themselves.
Conclusions
The results of this study suggest that the effective combination of multiple classifiers under D-S evidence theory can improve histological grade prediction in breast cancer.
Acknowledgments
Funding: The study was partially supported by the Key Laboratory for Magnetic Resonance and Multimodality Imaging of Guangdong Province (grant No. 2020B1212060051); the Key Technology and Equipment R&D Program of Major Science and Technology Infrastructure of Shenzhen (grant No. 202100102 and 202100104); the Guangdong Innovation Platform of Translational Research for Cerebrovascular Diseases, the Shenzhen Basic Research Program (grant No. KCXFZ202002011010360); Shenzhen Clinical Research Center for Cancer (grant No. [2021] 287); and the Shenzhen High-level Hospital Construction Fund.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-22-652/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-22-652/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and was approved by the ethics committee of Cancer Hospital Chinese Academy of Medical Sciences, Shenzhen Hospital. The requirement for individual consent for this retrospective analysis was waived.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Momenimovahed Z, Salehiniya H. Epidemiological characteristics of and risk factors for breast cancer in the world. Breast Cancer (Dove Med Press) 2019;11:151-64.
- Chen H, Wu K, Wang M, Wang F, Zhang M, Zhang P. A standard mastectomy should not be the only recommended breast surgical treatment for non-metastatic inflammatory breast cancer: A large population-based study in the Surveillance, Epidemiology, and End Results database 18. Breast 2017;35:48-54.
- Abramson RG, Li X, Hoyt TL, Su PF, Arlinghaus LR, Wilson KJ, Abramson VG, Chakravarthy AB, Yankeelov TE. Early assessment of breast cancer response to neoadjuvant chemotherapy by semi-quantitative analysis of high-temporal resolution DCE-MRI: preliminary results. Magn Reson Imaging 2013;31:1457-64.
- Holli K, Lääperi AL, Harrison L, Luukkaala T, Toivonen T, Ryymin P, Dastidar P, Soimakallio S, Eskola H. Characterization of breast cancer types by texture analysis of magnetic resonance images. Acad Radiol 2010;17:135-41.
- Yankeelov TE, Lepage M, Chakravarthy A, Broome EE, Niermann KJ, Kelley MC, Meszoely I, Mayer IA, Herman CR, McManus K, Price RR, Gore JC. Integration of quantitative DCE-MRI and ADC mapping to monitor treatment response in human breast cancer: initial results. Magn Reson Imaging 2007;25:1-13.
- Turkki R, Byckhov D, Lundin M, Isola J, Nordling S, Kovanen PE, Verrill C, von Smitten K, Joensuu H, Lundin J, Linder N. Breast cancer outcome prediction with tumour tissue images and machine learning. Breast Cancer Res Treat 2019;177:41-52.
- Tahmassebi A, Wengert GJ, Helbich TH, Bago-Horvath Z, Alaei S, Bartsch R, Dubsky P, Baltzer P, Clauser P, Kapetas P, Morris EA, Meyer-Baese A, Pinker K. Impact of Machine Learning With Multiparametric Magnetic Resonance Imaging of the Breast for Early Prediction of Response to Neoadjuvant Chemotherapy and Survival Outcomes in Breast Cancer Patients. Invest Radiol 2019;54:110-7.
- Wu T, Sultan LR, Tian J, Cary TW, Sehgal CM. Machine learning for diagnostic ultrasound of triple-negative breast cancer. Breast Cancer Res Treat 2019;173:365-73.
- Chitalia RD, Kontos D. Role of texture analysis in breast MRI as a cancer biomarker: A review. J Magn Reson Imaging 2019;49:927-38.
- Henderson S, Purdie C, Michie C, Evans A, Lerski R, Johnston M, Vinnicombe S, Thompson AM. Interim heterogeneity changes measured using entropy texture features on T2-weighted MRI at 3.0 T are associated with pathological response to neoadjuvant chemotherapy in primary breast cancer. Eur Radiol 2017;27:4602-11.
- Machireddy A, Thibault G, Tudorica A, Afzal A, Mishal M, Kemmer K, Naik A, Troxell M, Goranson E, Oh K, Roy N, Jafarian N, Holtorf M, Huang W, Song X. Early Prediction of Breast Cancer Therapy Response using Multiresolution Fractal Analysis of DCE-MRI Parametric Maps. Tomography 2019;5:90-8.
- Chen C, Wang JZ, Chang HY, Li J. Lane Detection of Multi-visual-features Fusion Based on D-S Theory. Chin Contr Conf. 2011:3047-52.
- Tang YC, Wu DD, Liu ZJ. A new approach for generation of generalized basic probability assignment in the evidence theory. Pattern Anal Appl 2021;24:1007-23.
- Liu ZG, Huang LQ, Zhou K, Denoeux T. Combination of Transferable Classification With Multisource Domain Adaptation Based on Evidential Reasoning. IEEE Trans Neural Netw Learn Syst 2021;32:2015-29.
- Zhou R, Fang WP, Wu JS. A risk assessment model of a sewer pipeline in an underground utility tunnel based on a Bayesian network. Tunnelling and Underground Space Technology 2020;103:103473.
- Kumar V, Bhatele M. Proceedings of all India seminar on biomedical engineering 2012 (AISOBE 2012). Springer New Delhi 2012.
- Singh R, Vatsa M, Noore A, Singh SK. DS theory based fingerprint classifier fusion with update rule to minimize training time. IEICE Electronics Express 2006;3:429-35.
- Kisku DR, Tistarelli M, Sing JK, Gupta P. Face Recognition by Fusion of Local and Global Matching Scores using DS Theory: An Evaluation with Uni-classifier and Multi-classifier Paradigm. 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Miami, FL, USA, 2009, pp. 60-65.
- Yan WF, Wu GX, Li CZ, Zhou L. Evidence Theory of One-dimensional Compression KNN Classification Method. Advanced Materials Research 2011;143-144:1337-41.