Original Article
Reproducibility and non-redundancy of radiomic features extracted from arterial phase CT scans in hepatocellular carcinoma patients: impact of tumor segmentation variability
Abstract
Background: Limited reproducibility and high redundancy of radiomic features remain obstacles to the clinical translation of radiomics. In this study, we assessed the robustness and non-redundancy of radiomic features extracted from computed tomography (CT) scans of hepatocellular carcinoma (HCC) patients with respect to different tumor segmentation methods.
Methods: Arterial-phase enhanced CT images were retrospectively and randomly obtained from 106 patients. A training data set of 26 HCC patients was used to assess the reproducibility and redundancy of the features. Another data set (55 HCC patients and 25 healthy volunteers) was used for classification. The GrowCut and GraphCut semiautomatic segmentation methods were applied in 3D Slicer software by two independent observers, and manual delineation was performed by five abdominal radiation oncologists to acquire the gross tumor volume (GTV). Seventy-one radiomic features were extracted from the GTVs using Imaging Biomarker Explorer (IBEX) software, including 17 tumor intensity statistical features, 16 shape features and 38 textural features. For each radiomic feature, the intraclass correlation coefficient (ICC) and hierarchical clustering were used to quantify its reproducibility and redundancy, respectively. Features with ICC values greater than 0.75 were considered reproducible. The R² statistic method was used to determine the number of non-redundant feature subgroups. Then, a classification model was built using a support vector machine (SVM) algorithm with 10-fold cross-validation, and the area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the utility of non-redundant feature extraction by hierarchical clustering.
Results: The percentages of features with excellent reproducibility in the manual delineation, GraphCut and GrowCut segmentation groups were 69% [49], 73% [52] and 79% [56], respectively. Sixty-five percent [46] of the features showed strong robustness across all segmentation methods. The optimal numbers of cluster subgroups were 9, 13 and 11 for manual delineation, GraphCut and GrowCut segmentation, respectively. The optimal number of cluster subgroups was 6 for all groups when only the features that were highly reproducible across all methods were selected for clustering. In the ROC analysis, the radiomics classification model for distinguishing healthy liver from HCC yielded AUC values of 0.857 with feature reduction and 0.721 without it.
Conclusions: Our study demonstrates that the reproducibility of quantitative imaging features varies with the method used to segment the tumor region. The reproducibility and non-redundancy of radiomic features depend strongly on tumor segmentation in HCC CT images. We recommend that the most reliable and consistent radiomic features be selected for the clinical use of radiomics. Classification experiments with feature reduction showed that radiomic features were effective in distinguishing healthy liver from HCC.
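The reproducibility criterion described in the Methods (ICC greater than 0.75) can be illustrated with a minimal sketch. The abstract does not state which form of the ICC was used, so the code below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function names and the feature values are hypothetical and serve only as an illustration; they are not taken from the study.

```python
from statistics import mean

ICC_THRESHOLD = 0.75  # cut-off for "reproducible" stated in the Methods


def icc_2_1(ratings):
    """ICC(2,1) for one feature (assumed form; the abstract does not specify it).

    ratings: list of rows, one row per patient, one value per observer.
    Raises ZeroDivisionError if every value in the table is identical.
    """
    n = len(ratings)           # number of patients
    k = len(ratings[0])        # number of observers (segmentations)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and columns")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    return (ms_rows - ms_err) / denom


def reproducible_features(feature_table, threshold=ICC_THRESHOLD):
    """Return the names of features whose ICC exceeds the threshold."""
    return [name for name, ratings in feature_table.items()
            if icc_2_1(ratings) > threshold]


# Hypothetical values: 4 patients (rows) x 3 observers (columns) per feature
demo_features = {
    "volume": [[10.0, 10.2, 9.9], [20.1, 19.8, 20.0],
               [30.2, 30.0, 29.9], [15.0, 15.1, 14.8]],
    "noisy_texture": [[1.0, 3.0, 2.0], [2.0, 1.0, 3.0],
                      [3.0, 2.0, 1.0], [2.0, 2.5, 1.5]],
}

if __name__ == "__main__":
    for name, ratings in demo_features.items():
        print(name, round(icc_2_1(ratings), 3))
    print("reproducible:", reproducible_features(demo_features))
```

In this sketch a feature whose values agree closely across observers passes the threshold, while a feature whose between-observer scatter is as large as its between-patient spread does not. A different ICC form (for example, a consistency-based one) would give different values for the same data.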