Cervical cancer segmentation based on medical images: a literature review

Xiu Wang; Chaolu Feng; Mingxu Huang; Shiqi Liu; He Ma; Kun Yu

doi:10.21037/qims-24-369

Review Article


Xiu Wang¹, Chaolu Feng²,³, Mingxu Huang³, Shiqi Liu¹, He Ma¹,², Kun Yu¹,²

¹College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; ²Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, China; ³School of Computer Science and Engineering, Northeastern University, Shenyang, China

Contributions: (I) Conception and design: K Yu, C Feng, H Ma; (II) Administrative support: K Yu; (III) Provision of study materials or patients: K Yu; (IV) Collection and assembly of data: X Wang, M Huang, S Liu; (V) Data analysis and interpretation: X Wang, M Huang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Kun Yu, PhD. College of Medicine and Biological Information Engineering, Northeastern University, Chuangxin Street 195, Shenyang 110016, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, No. 3-11, Wenhua Road, Heping District, Shenyang 110819, China. Email: yukun@bmie.neu.edu.cn.

Background and Objective: Cervical cancer clinical target volume (CTV) outlining and organs at risk segmentation are crucial steps in the diagnosis and treatment of cervical cancer. Manual segmentation is inefficient and subjective, which has led to the development of automated and semi-automated methods. However, limitations in image quality, organ motion, and individual differences still pose significant challenges. Despite numerous studies on medical image segmentation, a comprehensive review within this field is lacking. The purpose of this paper is to comprehensively review the literature on the segmentation of different types of medical images for cervical cancer and to discuss the current state of, and challenges in, the segmentation process.

Methods: As of May 31, 2023, we conducted a comprehensive literature search on Google Scholar, PubMed, and Web of Science using the following term combinations: “cervical cancer images”, “segmentation”, and “outline”. The included studies focused on the segmentation of cervical cancer utilizing computed tomography (CT), magnetic resonance (MR), and positron emission tomography (PET) images, with screening for eligibility by two independent investigators.

Key Content and Findings: This paper reviews representative papers on CTV and organs at risk segmentation in cervical cancer and classifies the methods into three categories based on image modality. Both traditional and deep learning methods are comprehensively described. The similarities and differences of related methods are analyzed, and their advantages and limitations are discussed in depth. We also include experimental results on our private datasets to verify the performance of selected methods. The results indicate that residual modules and squeeze-and-excitation blocks can significantly improve the performance of the model. Additionally, the segmentation method based on an improved level set demonstrates better segmentation accuracy than other methods.

Conclusions: The paper provides valuable insights into the current state-of-the-art in cervical cancer CTV outlining and organs at risk segmentation, highlighting areas for future research.

Keywords: Cervical cancer; medical image segmentation; deep learning; review

Submitted Feb 26, 2024. Accepted for publication May 20, 2024. Published online Jun 11, 2024.


Introduction

Cervical cancer is a highly malignant tumor that originates in the cervix and is among the most lethal cancers globally. Epidemiologic studies have shown that almost all cervical cancers are caused by persistent infection with one of approximately 15 high-risk human papillomavirus (HPV) types (1). According to 2020 World Health Organization statistics on 36 malignant tumors in 185 countries, cervical cancer ranks as the fourth most common malignant tumor in women. Each year, there are approximately 600,000 new cases of cervical cancer worldwide, resulting in around 340,000 deaths (2).

The treatment of cervical cancer varies according to its stage. Cervical cancer is categorized into four stages based on the location of the tumor. For patients with stage I B3 and II A2 cervical cancer, which is also referred to as locally advanced cervical cancer, the most common treatment is a combination of radiotherapy and chemotherapy (3). Radiotherapy includes conformal radiotherapy, intensity-modulated radiation therapy (IMRT) and precision radiotherapy. IMRT is the most widely used radiation technique for cervical cancer, as it delivers a high-precision treatment dose to the tumor and reduces the dose to the organs at risk (4). For patients with severe parametrial and paravaginal involvement, the radiotherapy plan includes external radiation therapy and brachytherapy (5). In external radiation therapy, the lesion is irradiated by a radiation source located at a fixed distance from it. Brachytherapy, on the other hand, is a type of precision radiotherapy in which an interstitial needle is inserted into the tumor for irradiation, based on the size, site of invasion, and depth of invasion of the cervical tumor. This provides additional local irradiation to the residual lesion and the cervix, thereby fulfilling the dose requirements of the radiation therapy.

The guiding principle of radiation therapy is to balance delivering a sufficient dose to the tumor against minimizing radiation exposure to surrounding areas and healthy organs. The accurate outlining of the clinical target volume (CTV) and organs at risk is essential for developing, evaluating, and optimizing radiation treatment plans. Generally, the target volume includes gross target volume (GTV), CTV, and planning target volume (PTV). According to consensus guidelines for CTV delineation, the CTV should include cervical tumor, cervical, uterine, parametrial, ovarian, vaginal tissue, and lymph node CTVs (6), while organs at risk for cervical cancer radiation therapy generally include the bladder, vagina, rectum, sigmoid colon, small intestine, and bilateral femoral bones adjacent to the cervix. CTV, PTV and organs at risk outlines are shown in Figure 1. To maximize treatment efficacy, precise contouring of the CTV and adjacent normal organs is crucial in radiation planning for cervical cancer. During radiation therapy, several imaging sessions are needed to verify the tumor’s location and to adjust the radiation plan continually, because of uncertain factors such as tumor regression and the movement of organs in the body, including the effects of bladder filling level and rectal movement (7). The organ movement is shown in Figure 2.

Figure 1 CTV and the location of organs at risk. CTV, clinical target volume; PTV, planning target volume.

Figure 2 Movement of the bladder, colon, intestine, rectum and CTV across different CT scans (bladder: green, rectum: blue, intestines: pink, colon: yellow, CTV: red). CTV, clinical target volume; CT, computed tomography.

Modern imaging techniques, such as magnetic resonance imaging (MRI), computed tomography (CT) and 18F-fluorodeoxyglucose positron emission tomography (FDG-PET), play a crucial role in adjusting the radiation dose to different target volumes and minimizing the impact of radiation on healthy organs. By providing accurate information on tumor location and surrounding structures, image-guided radiotherapy has significantly improved the accuracy of brachytherapy for cervical cancer and the treatment outcomes for patients.

Currently, the segmentation of cervical cancer’s CTV and organs at risk is mainly performed manually by physicians, which is time-consuming and prone to subjective errors. Therefore, accurate, efficient, and objective segmentation methods are in high demand in cervical cancer diagnosis. Ghose et al. (8) first reviewed registration-based segmentation of CT and magnetic resonance (MR) images for cervical cancer in 2015. More recently, Yang et al. (9) surveyed studies on deep learning-based segmentation of CT images, while Zaki et al. (10) reviewed graph-based methods for segmenting cervical cancer in colposcopic images and Pap smears. Despite numerous studies on automatic medical image segmentation, a comprehensive review for cervical cancer is lacking.

Therefore, starting from the three modalities of CT, MR, and PET images, this paper reviews the segmentation methods in the field of cervical cancer. Both traditional and deep learning methods are comprehensively described. The similarities and differences of related methods are analyzed, and their advantages and limitations are discussed in depth. By reviewing the relevant literature and methods of cervical cancer medical image segmentation, we can fully understand the current research status and latest progress in the field, including the segmentation methods for different image modalities, their technical characteristics, and their applications. Through the comparison and analysis of various segmentation methods, their advantages, limitations, and applicable scenarios can be evaluated. This helps researchers and clinicians to choose appropriate segmentation methods and improve the accuracy and stability of segmentation results. The experimental results indicate that residual modules and squeeze-and-excitation blocks can significantly improve the performance of the model. Additionally, the segmentation method based on an improved level set demonstrates better segmentation accuracy than other methods. Our findings provide valuable insights into the current state-of-the-art in cervical cancer CTV outlining and organs at risk segmentation, highlighting areas for future research. We present this article in accordance with the Narrative Review reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-369/rc).

Methods

Literature search method

For this review, the detailed research strategies are shown in Table 1 and 64 eligible papers have been included in our reference list.

Table 1

The search strategy summary

Items	Specification
Date of search	June 10, 2023
Databases and other sources searched	Google Scholar, PubMed, and Web of Science
Search terms used	Use a combination of the following terms: “cervical cancer images”, “segmentation”, and “outline”
Timeframe	January 1, 2000 to May 31, 2023
Inclusion and exclusion criteria	No language restrictions were set during the search, and cervical cancer segmentation literature using colposcopy images, Pap Smear, and cell images were excluded
Selection process	Two researchers independently conducted the literature screening, and in case of any disagreements, a third researcher made the final judgment

Integration of information

At present, the methods for segmenting the cervical cancer CTV and organs at risk in CT, MRI and PET images fall into two main categories: traditional segmentation methods and deep learning segmentation methods. Traditional segmentation methods, including registration, atlas segmentation, level set, region growing, graph cut, and watershed segmentation, have been widely applied. In recent years, deep learning-based segmentation methods, such as UNet, V-Net, and their improved networks, have attracted attention due to their superior performance. The percentage of these methods in the retrieved papers is shown in Figure 3.

Figure 3 Summary of the methods used in the retrieved papers. FCM, fuzzy c-means; KNN, K nearest neighbors; LDA, linear discriminant analysis; GAN, generative adversarial network.

Among the 64 papers retrieved, the earliest was published in 2011, and the number of cervical cancer image segmentation papers has increased substantially since 2019. The statistics of paper publication year and image modality are shown in Figure 4. The image characteristics of the three modalities differ in their ability to present the cervical cancer CTV and organs at risk. CT can obtain continuous thin-slice images by multi-phase scanning and enables observation of tumors from different angles through reconstruction techniques. It can display the size, depth of infiltration, and invasion of the primary tumor, as well as identify whether the cancer has spread to bones and other sites. CT imaging is also not affected by metal. However, CT images have poor resolution and contrast, which can make it difficult to distinguish smaller tumors, and the upper edge of the cervix may not be clearly displayed. Compared with CT images, MRI images have higher tissue resolution and can achieve accurate diagnosis of cervical cancer. MRI can also perform multi-directional and multi-sequence scans to assess the cervical cancer site, parametrial invasion, and lymph node metastasis. Internal pelvic organs and inter-tissue signals can be observed to identify the cervical cancer stage. The high contrast of MRI images is important for cervical cancer patients, because the soft-tissue borders of these geometrically large lesions are otherwise not obvious in the images. However, MRI is susceptible to metal, prone to artifacts, and relatively expensive. PET, in turn, complements MRI in assessing the extent of localized disease and helps to depict the margins of invasive tumors. In cases where the tumor extends upward into the uterine cavity and caudally into the vaginal cuff, PET is particularly helpful (11). FDG-PET is more sensitive in detecting pelvic and para-aortic lymph node metastases. Detection of lymph nodes by PET allows physicians to modify treatment plans and enables radiation oncologists to expand the radiation treatment volume to include metastatic lymph nodes (12).

Figure 4 The statistics of paper publication year and image modality. (A) Summary of the year of publication of the retrieved papers, (B) modal summary of the retrieved papers. CT, computed tomography; MR, magnetic resonance; PET, positron emission tomography.

Current state of cervical cancer segmentation methods for different image techniques

Cervical cancer segmentation based on CT images

CT-based brachytherapy for cervical cancer is widely used in treatment centers for cervical cancer radiation therapy programs. Many automatic methods that rely on CT images to segment target areas and organs at risk have been proposed. Because cervical cancer and normal cervical regions have similar attenuation in CT images and cannot be accurately distinguished, there is little literature on the segmentation of cervical cancer CT images using traditional methods. In our search of Google Scholar, PubMed, and Web of Science, only four articles met the requirements. Putri et al. (13) used the fuzzy c-means method to segment the cervical region and localize cervical cancer in 2015. The other three studies used atlas-based segmentation. With the rapid development of machine learning and deep learning, K nearest neighbors (KNN) (14) and convolutional neural networks (CNNs) have been used to segment cervical cancer CT images. In a recent study, Wang et al. (15) used ResUNet as a segmentation model to compare the learning ability of the model with that of residents. The model was first trained using a gold standard outlined by a senior physician. Meanwhile, residents were mentored by the same senior physician for eight months. The segmentation results produced by the model were then compared with those produced by the residents. There was little difference in segmentation accuracy between the two. However, the model took only 2 minutes to complete the segmentation, whereas the residents required 90 minutes for the same task. This suggests that the model was much more efficient than manual segmentation by residents. Deep learning models therefore hold considerable promise for improving both the efficiency and accuracy of medical image segmentation. UNet, V-Net and their variants have shown state-of-the-art performance in various image segmentation tasks.

Atlas-based segmentation

One commonly used traditional method is atlas-based segmentation. It uses deformable image registration to obtain the contours of the target area and the organs at risk. More specifically, an accurately labeled image is taken as an atlas, and the image to be segmented is then mapped onto the atlas using a registration method. The resulting transformation is used to propagate the atlas labels onto the new image, generating a segmentation result. The whole process is depicted in Figure 5. The segmentation accuracy of the atlas method is closely related to the quality of the atlas library. Typically, the accuracy is higher when the images in the atlas library are more similar to the image being segmented. Langerak et al. (16), Kim et al. (17) and Li et al. (18) have used the atlas method in their studies for the segmentation of cervical cancer. The effect of atlas library size on the accuracy and efficiency of the automatic atlas-based segmentation (ABAS) method has also been explored by Kim et al. (17) and Li et al. (18). However, they reached opposite conclusions regarding the optimal atlas library size. Kim et al. (17) showed experimentally that the worst contouring results were obtained with a small number of patients in the atlas, and that increasing the number of patients in the atlas improved the segmentation accuracy. On the other hand, Li et al. (18) found no significant differences in segmentation accuracy, measured using the Dice similarity coefficient (DSC) and Hausdorff distance (HD), between different atlas library sizes. These differing conclusions highlight the importance of carefully selecting the atlas library size for the specific application at hand. While increasing the size of the atlas library can improve segmentation accuracy, it may not always be necessary, as a small atlas library combined with appropriate registration algorithms can also yield accurate results.
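The label propagation step described above can be illustrated with a minimal sketch. It is not the implementation of any cited study: the function name `propagate_labels`, the toy 5×5 atlas, and the one-pixel shift that stands in for the registration result are all hypothetical, and a real pipeline would estimate a deformable transformation in 3D.

```python
def propagate_labels(atlas_labels, to_atlas, shape, background=0):
    """Carry atlas labels onto a new image grid of size `shape` (rows, cols).

    `to_atlas(r, c)` stands in for the registration result: it maps a pixel
    of the new image to (row, col) coordinates in the atlas.
    """
    rows, cols = shape
    out = [[background] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ar, ac = to_atlas(r, c)
            # Nearest-neighbour lookup keeps the labels integer-valued.
            ar, ac = round(ar), round(ac)
            if 0 <= ar < len(atlas_labels) and 0 <= ac < len(atlas_labels[0]):
                out[r][c] = atlas_labels[ar][ac]
    return out


# Toy atlas: a 2x2 "organ" (label 1) inside a 5x5 label map.
atlas = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Hypothetical registration result: the new patient's anatomy lies one pixel
# lower and one pixel further right than in the atlas.
def shift(r, c):
    return (r - 1, c - 1)

segmentation = propagate_labels(atlas, shift, shape=(5, 5))
for row in segmentation:
    print(row)
```

In this toy case the labeled region simply moves by one pixel in each direction. With a library of several atlases, the propagated label maps would additionally have to be combined, for example by voting, which is one reason the library size matters.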

Figure 5 Atlas segmentation process.

Two-dimensional UNet (2D UNet)

In recent years, the UNet network and its variants have emerged as some of the most popular architectures for this task (19-30). UNet was developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg (31). The network is based on the fully convolutional network, and its architecture has been modified and extended to work with fewer training images and to yield more precise segmentations. A typical UNet consists of a contracting path, which captures context and reduces the spatial resolution of the input image, and an expanding path, which recovers the spatial information and generates the segmentation mask. Segmentation of a 512×512 image using either UNet or its variants can take less than a second on a modern GPU. UNet adopts an encoder-decoder structure, in which the encoder learns features at different scales through multiple convolutional and pooling operations, and the decoder maps global features back to the pixel level. Skip connections pass encoder features directly to the decoder, so that features of objects at different scales and levels are captured, enhancing the model’s understanding of the target (32). Moreover, the encoder-decoder approach can be trained end-to-end, enabling the network to learn feature extraction and pixel-level classification simultaneously, which simplifies model design and training. Additionally, the flexibility to modify the encoder, decoder, and skip connections further enhances the model’s feature extraction and fusion capabilities (33,34). Mohammadi et al. (35) and Liu et al. (36,37) successively used improved UNets to segment cervical cancer and achieved accurate segmentation of the cervical cancer CTV and organs at risk (OARs). To capture the 3D information of CT scans, Liu et al. (6) and Huang et al. (28) designed the model as a 2.5D architecture by assigning different numbers of adjacent slices to the three input channels. The output was the delineation result for the middle slice. To give a comprehensive overview of 2D UNet-based segmentation, summaries of representative methods and their performance on segmenting cervical cancer images are listed in Table 2. The datasets used were unpublished private datasets, and the evaluation metrics were DSC and HD, which are commonly used in the segmentation field.
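The 2.5D input construction mentioned above can be sketched as follows. This is an illustration under assumptions rather than the cited authors' code: it takes exactly one neighbouring slice on each side, repeats the edge slice at the volume boundary, and the function name `make_25d_input` is hypothetical.

```python
def make_25d_input(volume, index):
    """Return [previous, current, next] slices as three input channels.

    `volume` is a list of 2D slices; the network would predict the contour
    of the middle channel only.
    """
    last = len(volume) - 1
    prev_i = max(index - 1, 0)     # assumption: repeat the first slice at the top edge
    next_i = min(index + 1, last)  # assumption: repeat the last slice at the bottom edge
    return [volume[prev_i], volume[index], volume[next_i]]


# Toy volume: 4 slices of 2x2 pixels, each filled with its own slice index.
volume = [[[z, z], [z, z]] for z in range(4)]

channels = make_25d_input(volume, 2)
print([ch[0][0] for ch in channels])
```

Compared with a full 3D network, this design keeps the memory footprint of a 2D model while still giving the network some through-plane context.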

Table 2

Summary list of 2D UNet-based CT image segmentation technique for cervical cancer

Author, year	Method	Number of patients	Data volume	Evaluation metrics	Result	Highlights
Wang J 2022 (21)	U-shaped network	375 patients	NA	DSC	CTV: 77%	The network consists of three encoders and three decoders, with skip connections in the middle
				DSC	Organs at risk: 88–93%
				95% HD	CTV: 5.81
				95% HD	Organs at risk: 1.03–2.96
Wang J 2023 (26)	U-shaped network	60 patients	NA	DSC	HRCTV: 87%	The segmentation results were scored by two experienced radiation oncologists
					Bladder: 94%
					Rectum: 86%
					Sigmoid: 79%
					Small intestine: 92%
				HD	HRCTV: 1.45
					Bladder: 4.52
					Rectum: 2.52
					Sigmoid: 10.92
					Small intestine: 8.83
Mohammadi R 2021 (35)	ResUNet	113 patients	NA	DSC	Bladder: 95.7%	ResUNet deep convolutional neural network architecture is used, which uses long and short jump connections to improve the accuracy of the feature extraction process and segmentation
					Rectum: 96.6%
					Sigmoid colon: 92.2%
				HD	Bladder: 4.05
					Rectum: 1.96
					Sigmoid colon: 3.15
Liu Z 2020 (36)	Improved UNet	105 patients	NA	DSC	Bladder: 92.4%	(I) Formulated this OARs segmentation problem as a binary pixel-level classification problem (II) The convolutional layers in the UNet are replaced by Context Aggregation Blocks (III) Use the Squeeze-Extract block to assign different weights to each channel, thus, to reweight each organ mask importance
					Bone marrow: 85.4%
					Rectum: 79.1%
					Small intestine: 83.3%
					Spinal cord: 82.7%
				HD	Bladder: 5.098
					Bone marrow: 1.993
					Rectum: 5.949
					Small intestine: 5.281
					Spinal cord: 3.269
Liu Z 2021 (37)	Improved DpnUNet	237 patients	NA	DSC	CTV: 88%	(I) A novel adversarial deep-learning-based auto-segmentation model is hence proposed (II) All encoder and decoder components in UNet are replaced with DPN components (III) A three-stage multicenter randomized controlled evaluation was used, including performance metrics, oncologist evaluation and the Turing imitation test
Liu Z 2021 (37)	Improved DpnUNet	237 patients	NA	95% HD	CTV: 3.46

2D UNet, two-dimensional UNet; CT, computed tomography; NA, not applicable; DSC, Dice similarity coefficient; CTV, clinical target volume; HD, Hausdorff distance; HRCTV, high-risk clinical target volume; OARs, organs at risk; DPN, Dual Path Networks.
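For readers unfamiliar with the two metrics reported in the table, the following minimal sketch computes them for two small masks given as sets of pixel coordinates. The toy masks are hypothetical and distances are in pixel units; published studies usually compute HD on contour points in physical units, and the 95% HD replaces the maximum with the 95th percentile to reduce sensitivity to outliers.

```python
import math


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


# Toy masks: two 2x2 squares that overlap in one column.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
predicted = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice(reference, predicted))       # 0.5
print(hausdorff(reference, predicted))  # 1.0
```

A higher DSC indicates better overlap, whereas a lower HD indicates that the worst local disagreement between the two contours is smaller, so the two metrics are complementary.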

Three-dimensional UNet (3D UNet)

Medical images are volumetric data because their slices are spatially continuous, and the anatomical environment of 3D images is much more complex than that of 2D images. 3D CNNs do not need to process the input volume slice by slice in 2D. They can take the whole volume as input to the model. In cervical cancer image segmentation, networks with 3D UNet and V-Net as the basic architecture are commonly used (38-49). The structure of the V-Net network is shown in Figure 6. Ding et al. (44) verified the feasibility of using V-Net to outline the CTV and organs at risk in cervical cancer. In comparison with the 2D UNet, the study found that V-Net performed significantly better in colon segmentation. Table 3 summarizes the papers on segmentation of cervical cancer target areas and organs at risk using V-Net-based architectures.

Figure 6 V-Net network structure. The figure was reproduced from Ding et al. (44) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0).

Table 3

Summary of V-shaped network-based CT image segmentation technique for cervical cancer and its performance

Author, year	Method	Number of patients	Data volume	Highlights
Ding Y 2022 (44)	3D V-net	130	NA	The CT images of the upper and lower layers are integrated into a new input data to form a pseudo-3D image for learning. The data processing time of input terminal is reduced, so the learning effect and efficiency of CT image are improved
Ju Z 2020 (45)	Dense V-Network	190	NA	The isodose lines of 5, 10, 15 and 20 Gy in the radiotherapy scheme were transformed into structures for learning in the neural network to predict a suitable location for ovarian transposition
Ju Z 2021 (46)	Dense V-Net	133	NA	Dense V-Net is a deep learning network that integrates two deep learning models of Dense Net and V-Net
Ju Z 2021 (46)	Dense V-Net	133	NA	The residual connections are used between convolution operations to break network symmetry and enhance the sensitivity of gradient calculations
Ma CY 2022 (47)	VB-Net	535	NA	The convolution, normalization, and activation layers in V-Net are replaced by a bottleneck structure, which is the B in VB-Net
Ma CY 2022 (47)	VB-Net	535	NA	During the network training process, the multi-scale strategy with a 3D network was applied, by which we first trained a coarse-scale network for rapid positioning of target area and then a fine-scale segmentation model for precisely delineating targets’ contours based on previous coarse-scale network output
Ma CY 2022 (48)	Registration and VB-Net	107	NA	The performance of four cervical cancer target volume segmentation methods were compared, which were rigid registration method, deformable registration method, rigid registration and VB-Net combination, and deformable registration and VB-Net combination
Rhee DJ 2020 (49)	3D V-net and 2D FCN-8s	NA	2,254	Registration between the planning CT and the during-treatment CT is applied to align the CT series and the corresponding contour. The aligned planning CT and the corresponding CTV contour are then used as another two channels of input to the network. The output of the network is the estimated CTV contour on the during-treatment CT
Rhee DJ 2020 (49)	3D V-net and 2D FCN-8s	NA		In order to overcome the problem that the GPU memory was not sufficient to train the full-resolution CT images, the size of the input image is adjusted to segment the primary CTV, and the center of mass of the primary CTV is estimated. Then the box that surrounds the primary CTV is cropped and placed on the center of mass predicted in the original CT scan. Finally, the V-Net segmentation model is applied to the cropped 3D image

CT, computed tomography; NA, not applicable; 3D, three-dimensional; CTV, clinical target volume; GPU, graphics processing unit.

Jiang et al. (50) and Xiao et al. (51) validated the feasibility of RefineNet for cervical cancer segmentation and showed that RefineNetPlus3D achieves better performance than 2D-RefineNet in the organ segmentation task. The RefineNet structure is shown in Figure 7. In general, 3D neural networks can capture contextual connections between slices and obtain better segmentation results. Table 4 lists the segmentation performance of representative methods for cervical cancer image segmentation based on 3D U-shaped networks.

Figure 7 RefineNet structure. The figure was reproduced from Jiang et al. (51) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0). RCU, residual convolution unit; CRP, chained residual pooling.

Table 4

Summary of 3D U-shaped network-based CT image segmentation technique for cervical cancer and its performance

Author, year	Method	Number of patients	Data volume	Evaluation metrics	Result
Wang Z 2020 (15)	3D ResUNet	125	NA	DSC	CTV: 86%
					Bladder: 91%
					Femoral head right: 88%
					Femoral head left: 88%
					Small intestine: 86%
					Rectum: 81%
				HD	CTV: 14.84
					Bladder: 7.82
					Femoral head right: 6.18
					Femoral head left: 6.17
					Small intestine: 22.21
					Rectum: 7.04
Sartor H 2020 (19)	3D Full Convolutional Network	75 cases of cervical cancer and 191 cases of anorectal cancer	NA	DSC	Femoral head left: 92%
					Femoral head right: 91%
					Bladder: 83%
					Bowel: 86%
					CTVNs: 81%
Beekman C 2022 (20)	3D UNet	84	NA	DSC	CTV: 87%
Chung SY 2023 (27)	3D EfficientNet-B0	180	NA	DSC	CTV: 80%
				DSC	Bladder: 88%
				HD	CTV: 13
				HD	Bladder: 6.93
Zhang D 2020 (38)	DSD-UNet	91	91	DSC	HRCTV: 82.9%
					Bladder: 86.9%
					Small intestine: 80.3%
					Sigmoid colon: 64.5%
					Rectum: 82.1%
				HD	HRCTV: 8.1
					Bladder: 12.1
					Small intestine: 27.8
					Sigmoid colon: 19.6
					Rectum: 9.2
Shi J 2021 (39)	RA-CTVNet	462	NA	DSC	CTV: 79.2%
Yi H 2021 (40)	GML	87	NA	DSC	CTV: 81.6%
Yi H 2021 (40)	GML	87	NA	HD	CTV: 5.672
Chang Y 2021 (41)	Improved 3D UNet	400	NA	DSC	CTV: 88.2%
Chang Y 2021 (41)	Improved 3D UNet	400	NA	95% HD	CTV: 6.853
Chen A 2022 (42)	UNet	127	127	DSC	OAR: 82–96%
Chen A 2022 (42)	UNet	127	127	95% HD	OAR: 2.30–17.31
Chang JH 2021 (43)	3D UNet and LSTM	51	136	DSC	HRCTV: 87%
					GTV: 72%
					Bowel: 72%
					Foley: 95%
					Bladder: 86%
					Rectum: 77%
					Sigmoid colon: 73%
					Uterus: 93%

3D UNet, three-dimensional UNet; CT, computed tomography; NA, not applicable; DSC, dice similarity coefficient; CTV, clinical target volume; HD, Hausdorff distance; CTVNs, clinical target volume of lymph nodes; HRCTV, high-risk clinical target volume; RA-CTVNet, area-aware reweight strategy and recursive refinement strategy; GML, global multi-level attention; LSTM, long short-term memory; OARs, organs-at risk; GTV, gross target volume.

One of the main drawbacks of the 3D networks is that their memory consumption is very large. A 3D UNet requires a specific amount of input data in the spatial dimension, and many patients’ data may not meet the input requirements of the network. Therefore, Chang et al. (43) proposed a method for segmentation of high-risk clinical target volume (HRCTV) and GTV of cervical cancer that combines 3D UNet with long short-term memory (LSTM) network. This method was developed to address the challenges of segmenting cervical cancer images containing organs at risk such as bladder, bowel, and uterus. The use of LSTM networks with bidirectional convolution allows the network to achieve good performance with only seven consecutive CT images, reducing the limitations associated with spatial dimensionality and memory. The LSTM structure is shown in Figure 8.
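A rough calculation shows why memory becomes a bottleneck and why a seven-slice input helps. The numbers below are assumptions chosen for illustration only (512×512 slices, 32 feature channels in the first layer, 32-bit floats, a 100-slice scan); they are not taken from the cited study, and a real network stores many such feature maps plus gradients.

```python
def activation_bytes(depth, height, width, channels, bytes_per_value=4):
    """Memory for one feature map, stored as 32-bit floats by default."""
    return depth * height * width * channels * bytes_per_value


# Hypothetical sizes: 512x512 CT slices, 32 feature channels in the first layer.
full_volume = activation_bytes(100, 512, 512, 32)  # whole 100-slice scan
seven_slices = activation_bytes(7, 512, 512, 32)   # 7-slice sub-volume

MIB = 2 ** 20
print(f"100 slices: {full_volume // MIB} MiB")
print(f"7 slices:   {seven_slices // MIB} MiB")
```

Under these assumptions a single first-layer feature map costs 32 MiB per slice, that is 3,200 MiB for the full scan against 224 MiB for seven slices, which is why short slice sequences are attractive when GPU memory is limited.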

Figure 8 LSTM structure. C_{t-1} represents the internal state of the memory cell at the previous time step, and h_t represents the hidden state. LSTM, long short-term memory.
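For reference, the symbols in the caption can be placed in context with the standard LSTM update equations. This is the textbook formulation, not necessarily the exact variant used by Chang et al. (43): in a convolutional LSTM the matrix products below are replaced by convolutions, and a bidirectional model runs one such recurrence in each direction along the slice axis. The weight matrices $W$, $U$ and biases $b$ are generic notation introduced here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)}\\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)}\\
\tilde{C}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(memory cell update)}\\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(hidden state)}
\end{aligned}
```

Here $x_t$ is the input at step $t$ (for example, the features of one CT slice), $\sigma$ is the logistic sigmoid, and $\odot$ denotes element-wise multiplication.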

While 3D networks have clear advantages in processing medical images, they often face optimization difficulties such as overfitting and vanishing gradients. These challenges arise from the large size and complexity of 3D images. To address them, researchers have developed various techniques. Ju et al. (45) combined two network models, Dense Net and V-Net, to improve the ability of the model to extract features and to mitigate the problems of vanishing gradients and insufficient training data. To minimize the training time, the two models were trained separately and then fused once the best training results had been achieved. The parameters of the Dense Net and V-Net models were optimized separately, and the fusion layer was fine-tuned once the loss functions of both models had reached their optimal values. This process allowed the Dense V-Net model to achieve the best network fusion in the shortest time and improved the model’s accuracy and robustness. They experimentally demonstrated that the fusion model achieved higher segmentation accuracy than either single model (46).

In addition to fusing two network models to improve segmentation capability, researchers have combined registration with network models. Ma et al. (47) proposed a novel VB-Net network for cervical cancer segmentation. This network reduced the number of model parameters by replacing the convolutional, normalization, and activation layers in V-Net with a bottleneck structure, which made the model easier to generalize. The model's contours were compared with those of primary, secondary, and advanced doctors, and the accuracy of the VB-Net contours was comparable to that of advanced doctors. This suggests that the VB-Net method has great potential for assisting doctors in clinical practice. They further investigated whether combining the VB-Net network with registration improves segmentation accuracy (48). The performance of four cervical cancer target area segmentation methods was compared: rigid registration, deformable registration, rigid registration combined with VB-Net, and deformable registration combined with VB-Net. The experimental results showed that combining the deep learning network with registration yielded more accurate segmentation than registration alone. This highlights the potential of combining deep learning methods with traditional image processing techniques to achieve more accurate medical image segmentation.

Cervical cancer segmentation based on MR images

MRI is one of the most advanced modalities in radiological examination. Compared with the single parameter index of CT, MRI can perform multiple weighted imaging sequences that respond to different characteristics of different tissues. It can also perform multi-directional imaging with high image contrast, resulting in clear soft tissue images that are crucial for cervical cancer image segmentation. The main methods for segmentation of cervical cancer target areas and organs at risk using MR images include atlas registration methods, level sets, region growing, UNet networks, and their variants.

Traditional methods

Before deep learning became popular, atlas-based and registration-based segmentation methods were the primary techniques used for segmentation of cervical cancer organs at risk in MR images. Registration is the process of aligning two images by estimating the geometric coordinate transformation between them. The two images used for registration are called the moving image and the target image, respectively. During the coordinate transformation, an appropriate similarity metric is optimized to align the two images. The quality of the registration depends on the optimization method used to find the best transformation (52). As early as 2011, Berendsen et al. (53) used an atlas and registration method to segment the bladder, one of the organs at risk in cervical cancer. In their study, the registration was divided into two parts. First, a global transformation using the B-spline method produced a roughly aligned image. Then, a local registration used a statistical model combined with prior knowledge to improve the accuracy of the registration. The DSC was 0.5 when only the global transformation was applied and improved to 0.67 after local registration. This value was higher than the accuracy of bladder segmentation reached by other traditional methods. However, when large and complex deformations are encountered, it is common for the registration to end up in a local minimum. To address this issue, Berendsen et al. (54) proposed incorporating prior knowledge-based regularization terms into the registration and using statistical models to improve the accuracy of inter-patient registration. The considerable discrepancy in organ structure and size between the MR images of two patients can cause great difficulties for registration. Nevertheless, the proposed method successfully segmented the CTV and bladder for cervical cancer, and its effectiveness was validated using the leave-one-out method.
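The idea of registration as the optimization of a similarity metric over a transformation can be illustrated with a deliberately small example. The sketch below restricts the transformation to integer translations and uses the sum of squared differences as the metric; both choices, and all function names, are assumptions for illustration and do not reproduce the B-spline or statistical-model registration of Berendsen et al. (53,54).

```python
def shift_image(img, dy, dx, fill=0.0):
    """Translate a 2D image (list of rows) by (dy, dx), padding with fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def ssd(a, b):
    """Sum of squared differences: lower means more similar."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def register_translation(moving, target, max_shift=2):
    """Exhaustively search integer shifts and keep the one with lowest SSD."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = ssd(shift_image(moving, dy, dx), target)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    cost, dy, dx = best
    return dy, dx, cost
```

In an atlas-based setting, the shift found for the atlas image would then be applied to the atlas label map, for example `shift_image(atlas_label, dy, dx, fill=0)`, to propagate the contour to the target image. Exhaustive search is only feasible for such a tiny search space, which is one reason practical methods rely on iterative optimizers that can end up in local minima.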

Since the tumor size changes during radiation treatment, tumor regression between MR images acquired after different doses of radiotherapy poses a challenge for registration. Lu et al. (55) formulated the registration problem as a mixture of two different distributions, one for the tumor category and one for the normal tissue category. They described the statistical image grayscale changes for both categories. These mixture distributions were weighted by the tumor detection map, which assigned an abnormality probability to each voxel. The Jacobian determinant of the transformation was also constrained, which ensures the smoothness of the transformation and simulates the tumor regression process. Their study demonstrated that the method was highly suitable for applications in image-guided radiotherapy and computer-aided diagnosis. To improve the model's performance, a Bayesian framework was utilized for registration, detection, and segmentation of cervical tumors in T2-weighted MR images (56). This method also generated tumor probability maps, which provided a better understanding of the tumor location and extent. The registration aligns the planning-day images to the treatment-day images. Then, the segmentation extends the level set model using shape prior information. One of their innovations is to perform non-rigid registration that considers not only the organ surface but also the intensity. The intensity matching takes into account the distribution of both tumor and normal tissues. They used the tumor probability map to mix and weight the two distributions, thereby linking registration, detection, and segmentation together in an interdependent manner. This approach achieved higher accuracy across all tasks.
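The role of the Jacobian determinant can be shown with a small numerical sketch. For a 2D transformation T(x, y) = (x + u, y + v), the determinant of its Jacobian is (1 + du/dx)(1 + dv/dy) - (du/dy)(dv/dx): values below 1 indicate local shrinkage (as expected in a regressing tumor), values above 1 indicate expansion, and values at or below 0 indicate folding. The finite-difference scheme and function names below are assumptions for illustration, not the formulation used by Lu et al. (55).

```python
def finite_diff(field, y, x, axis):
    """Central difference inside the grid, one-sided at the borders.

    field is a 2D list indexed as field[y][x]; axis 0 is y, axis 1 is x.
    The grid must have at least two samples along each axis.
    """
    h, w = len(field), len(field[0])
    if axis == 0:
        lo, hi = max(y - 1, 0), min(y + 1, h - 1)
        return (field[hi][x] - field[lo][x]) / (hi - lo)
    lo, hi = max(x - 1, 0), min(x + 1, w - 1)
    return (field[y][hi] - field[y][lo]) / (hi - lo)


def jacobian_determinant(u, v):
    """Determinant of the Jacobian of T(x, y) = (x + u, y + v) per pixel.

    u holds displacements along x, v holds displacements along y.
    """
    h, w = len(u), len(u[0])
    det = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            du_dx = finite_diff(u, y, x, axis=1)
            du_dy = finite_diff(u, y, x, axis=0)
            dv_dx = finite_diff(v, y, x, axis=1)
            dv_dy = finite_diff(v, y, x, axis=0)
            det[y][x] = (1 + du_dx) * (1 + dv_dy) - du_dy * dv_dx
    return det
```

For example, a displacement field that pulls every point 10% toward the origin (u = -0.1x, v = -0.1y) gives a determinant of about 0.81 everywhere, that is, a uniform area reduction of roughly 19%.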

In atlas-based segmentation, multi-atlas segmentation is more accurate than single-atlas segmentation. One potential challenge of multi-atlas segmentation is that the computational cost grows significantly as the number of images in the atlas increases. Moreover, because cervical cancer involves many organs at risk, a multi-atlas registration is often required for each of these organs to select the appropriate atlas, leading to a substantial computational demand. To overcome these problems, Daly et al. (52) proposed a multi-atlas segmentation method that automatically selects atlases based on a similarity metric. To reduce the computational effort during registration, a two-step registration was performed. First, a global radiometric registration is performed to obtain preliminary registration results. Then, a local non-rigid registration is applied for fine matching. This approach can significantly reduce the computational time required while still achieving accurate registration results.
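Automatic atlas selection by a similarity metric can be sketched as a simple ranking step that runs before any expensive non-rigid registration. The example below ranks candidate atlases by normalized cross-correlation (NCC) with the target image and keeps the top k. The choice of NCC, the dictionary layout, and the function names are assumptions for illustration; the source does not state which metric Daly et al. (52) used.

```python
import math


def flatten(img):
    """Turn a 2D image (list of rows) into a flat list of pixel values."""
    return [p for row in img for p in row]


def ncc(a, b):
    """Normalized cross-correlation between two same-sized images."""
    xa, xb = flatten(a), flatten(b)
    mean_a, mean_b = sum(xa) / len(xa), sum(xb) / len(xb)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(xa, xb))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in xa)
                    * sum((q - mean_b) ** 2 for q in xb))
    return num / den if den > 0 else 0.0


def select_atlases(target, atlases, k=2):
    """Return the names of the k atlases most similar to the target.

    atlases maps a (hypothetical) atlas name to its image.
    """
    ranked = sorted(atlases,
                    key=lambda name: ncc(atlases[name], target),
                    reverse=True)
    return ranked[:k]
```

Only the selected atlases would then go through the costly local non-rigid registration, which is how a selection step of this kind keeps the computational demand of multi-atlas segmentation manageable.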

Because of the intricate structure of the human abdomen, there is often intensity overlap in the images, and noise can degrade the image quality, making it more difficult to distinguish cervical tumors from other structures in a single image. Kao et al. (57) aligned T2-weighted images with diffusion-weighted MR images using a mutual information method and used the Confederative Maximum a Posterior (CMAP) algorithm to automatically segment cervical tumors. To mitigate the influence of surrounding structures such as the rectum and bladder walls, the authors segmented these structures first and then segmented the cervical tumor within the region of interest.
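Mutual information is attractive for aligning different MR sequences because it rewards statistical dependence between intensities rather than equal intensities. A minimal sketch based on a joint histogram is given below; the number of bins, the assumption that intensities lie in [0, max_val], and the function name are illustrative choices and are not taken from Kao et al. (57).

```python
import math
from collections import Counter


def mutual_information(a, b, bins=4, max_val=1.0):
    """Mutual information (in nats) between two same-sized 2D images.

    Intensities are assumed to lie in [0, max_val] and are quantized
    into equally wide bins before the joint histogram is built.
    """
    def to_bin(p):
        return min(int(p / max_val * bins), bins - 1)

    pairs = [(to_bin(p), to_bin(q))
             for row_a, row_b in zip(a, b)
             for p, q in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(i for i, _ in pairs)
    marg_b = Counter(j for _, j in pairs)

    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log(p_ij / ((marg_a[i] / n) * (marg_b[j] / n)))
    return mi
```

An image and its contrast-inverted copy have high mutual information even though their intensities differ everywhere, which is the property that makes the metric usable across sequences such as T2-weighted and diffusion-weighted images. A registration procedure would search for the transformation that maximizes this value.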

In addition to atlas and registration segmentation methods, traditional segmentation methods such as level set, region growing, and watershed are also widely used. Garg et al. (58) demonstrated an effective approach for segmenting cervical cancer tumor regions in MR images by combining thresholding and watershed techniques, which improved the segmentation accuracy and successfully identified the tumor regions. Su et al. (59) presented a global adaptive region growing algorithm that combines global image information with region growing to achieve segmentation with automatic threshold initialization. The proposed method sets reasonable thresholds based on the different grayscale features of different images, which reduces human involvement and minimizes subjectivity and uncertainty. The algorithm was compared with traditional methods, such as Region Growing, CV Level Set, and Threshold square, and outperformed them. Khoulqi et al. (60) used a region growing method to segment cervical tumor regions. To obtain more information about the tumor, they used axial and vectorial views of MR images. They preprocessed the images by K-means clustering to enhance the image quality before applying the region growing algorithm for segmentation. The authors also classified the segmented images using the cervical cancer classification criteria described in FIGO (International Federation of Gynecology and Obstetrics). However, this method was overly dependent on the quality of the images after preprocessing. Torheim et al. (61) used linear discriminant analysis (LDA) for voxel classification to achieve segmentation of cervical cancer tumor regions on multimodal MR images.
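Region growing, which several of the methods above build on, can be summarized in a few lines. The sketch below grows a region from a seed pixel over 4-connected neighbours whose intensity lies within a fixed tolerance of the seed intensity. The fixed tolerance and the function name are assumptions for illustration; the adaptive threshold selection of Su et al. (59) and the K-means preprocessing of Khoulqi et al. (60) are not reproduced.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Grow a region from seed = (row, col) on a 2D image.

    A 4-connected neighbour joins the region if its intensity differs
    from the seed intensity by at most tol. Returns a binary mask.
    """
    h, w = len(img), len(img[0])
    seed_val = img[seed[0]][seed[1]]
    mask = [[0] * w for _ in range(h)]
    mask[seed[0]][seed[1]] = 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny][nx]
                    and abs(img[ny][nx] - seed_val) <= tol):
                mask[ny][nx] = 1
                queue.append((ny, nx))
    return mask
```

The result depends directly on the seed position and the tolerance, which illustrates why such methods are sensitive to image quality and why automatic threshold initialization reduces operator subjectivity.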

Deep learning methods

As with CT images, deep learning segmentation methods have been widely applied to the segmentation of cervical cancer CTV and organs at risk on MR images. Bnouni et al. (62) first proposed a dynamic multiscale CNN forest to segment cervical tumors on T2-weighted MR images, which aggregates different CNNs and establishes a bidirectional information flow between two consecutive CNNs. Because of the low contrast between the organs at risk and the actual tumor in MR images, cervical cancer is difficult to segment. Bnouni et al. (63) then proposed a new synergetic multiplex network (SMN) for the segmentation of pelvic multi-organs using multi-view (MV) MRI. This method is based on the multi-stage deep learning architecture of cyclic GAN. The generator and discriminator structures in cyclic GAN are shown in Figures 9,10. The SMN enhances the spatial coherence between adjacent pixels within the same tissue, making it easier to distinguish organs at risk that differ only slightly in contrast. Zabihollahy et al. (64,65) evaluated 2D Attention UNet, 3D UNet and 3D Dense UNet in the segmentation of CTVs and OARs using MR images. Rodríguez Outeiral et al. (29) used nnUNet to segment cervical cancer and evaluated model performance based on FIGO stage and GTV volume. Lin et al. (66) used DeepLab V3+ as the pre-trained model and then adjusted the training data size and fine-tuning layers through transfer learning. Although the selection and combination of models are important for image segmentation, models become less effective in the presence of image noise, intensity inhomogeneity, and other factors. To overcome this, Bnouni et al. (67) proposed a framework for automatic image preprocessing using histogram, smoothing, sharpening, and morphological processing methods. The preprocessed image is segmented using three identical CNN models, and a voting mechanism is used to obtain the final segmented image. The approach was found to improve the segmentation accuracy compared to a baseline CNN model, and the effect of image preprocessing on segmentation was demonstrated.
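A voting mechanism over several model outputs can be as simple as a pixel-wise majority vote. The sketch below assumes binary masks and a strict-majority rule; the source does not specify the voting rule used by Bnouni et al. (67), so this is only one plausible reading, and the function name is hypothetical.

```python
def majority_vote(masks):
    """Fuse binary masks: a pixel is foreground if more than half agree."""
    n = len(masks)
    h, w = len(masks[0]), len(masks[0][0])
    return [[1 if 2 * sum(m[y][x] for m in masks) > n else 0
             for x in range(w)]
            for y in range(h)]
```

With three models, a pixel is kept when at least two of them label it as tumor, so an isolated error made by a single model is suppressed in the fused result.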

Figure 9 Generator structure. ReLU, rectified linear unit.

Figure 10 Discriminator structure. ReLU, rectified linear unit.

Apart from image preprocessing, the use of prior knowledge can also improve the accuracy of cervical cancer segmentation (20). In practice, the definition of the cervical CTV is partially driven by clinical experience, which leads to varying degrees of inclusion of the parametrial and vaginal regions. The boundaries of CTVs do not strictly correspond to the visible boundaries on medical images, which creates great difficulties in segmenting the CTV using CNNs. Incorporating prior knowledge about the expected shape and location of the cervical cancer can improve accuracy by guiding the segmentation process. Since the target area of cervical cancer is located behind the bladder, Zabihollahy et al. (65) used information about the location of the bladder and the target area to improve the segmentation accuracy of the target area.

In addition, researchers have combined 2D models with 3D models on MR images to improve segmentation performance (23), overcoming the limitations of insufficient 3D data and the lack of continuity in 2D data. Another approach is to use 2.5D models. Yoganathan et al. (68) used MV MR images combining axial, coronal and sagittal images to construct a 2.5D model. Their final experimental results showed that the segmentation accuracy of the 2.5D network for the target area of cervical cancer and the organs at risk was higher than that of the 2D model. Gou et al. (69) proposed the MVFA-Net network to segment cervical tumors while overcoming specific challenges such as large MR slice thickness and image intensity inhomogeneity. A multi-view attention (MVA) block was built on the residual network by replacing its convolutional layers with an MV block and a channel attention (C-Att) block. In the proposed MV block, multiple convolutional branches with different convolutional kernel sizes are used to target the problem of inconsistent spatial resolution in different views. In this way, they leveraged contextual information from the MR images to improve accuracy.
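The input of a 2.5D model can be pictured as three orthogonal 2D slices taken through the same voxel. The following sketch extracts them from a volume stored as nested lists indexed `volume[z][y][x]`. The index order, the mapping of axes to anatomical planes, and the function name are assumptions for illustration and do not describe the specific pipeline of Yoganathan et al. (68).

```python
def orthogonal_slices(volume, z, y, x):
    """Return the three orthogonal slices through voxel (z, y, x).

    volume is indexed as volume[z][y][x]. Under the assumed axis
    convention, fixing z gives the axial slice, fixing y the coronal
    slice, and fixing x the sagittal slice.
    """
    axial = [row[:] for row in volume[z]]                        # (y, x)
    coronal = [plane[y][:] for plane in volume]                  # (z, x)
    sagittal = [[row[x] for row in plane] for plane in volume]   # (z, y)
    return axial, coronal, sagittal
```

Feeding such slice triplets to 2D networks gives each prediction some through-plane context at far lower memory cost than a full 3D model, which is the trade-off that motivates 2.5D approaches.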

One of the most significant features of MRI is that it has multiple sequences that provide a rich source of information. However, using a single MRI sequence for segmentation tasks can limit the accuracy of the results. To address this, researchers have explored the use of multiple MRI sequences to improve segmentation accuracy. Huang et al. (70) proposed a modified FuseNet network to segment the organs at risk of cervical cancer and prostate cancer. The network used an attention mechanism to fuse multimodal images from T1-weighted, T2-weighted, and enhanced Dixon T1-weighted sequences. Similarly, Wang et al. (71) proposed a 3D CNN model to segment the full tumor region, core tumor region, and enhanced tumor region in multi-sequence MR cervical images. The model features skip connections and residual connections to address vanishing gradients and a Group Normalization layer for faster convergence. Jin et al. (30) used EfficientNet as an encoder for UNet++ to achieve cervical cancer segmentation in multi-sequence MR images. The use of multiple MRI sequences in segmentation tasks has shown promising results, highlighting the importance of leveraging the full potential of MRI data. Table 5 summarizes the literature on MR image segmentation of cervical cancer using U-shaped networks.
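Group Normalization, mentioned above as a means of faster convergence, normalizes activations over groups of channels within a single sample, so its statistics do not depend on the batch size. This matters for 3D models, where memory often limits batches to one or two volumes. The sketch below shows the normalization itself on flattened channels; the learnable scale and shift parameters are omitted, and the data layout and function name are assumptions for illustration rather than the implementation of Wang et al. (71).

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize a single sample's channels group by group.

    channels is a list of C flat lists of voxel values. Channels are
    split into num_groups consecutive groups, and each group is
    normalized to zero mean and (approximately) unit variance.
    """
    c = len(channels)
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = c // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        values = [v for ch in group for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in ch] for ch in group])
    return out
```

Because the mean and variance are computed per sample, the same function behaves identically whether one volume or many are processed at a time.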

Table 5

Summary of U-shaped network-based MR image segmentation technique for cervical cancer and its performance

Author, year	Method	Number of patients	Data volume	Evaluation metrics	Result
Kano Y 2021 (23)	2DUNet and 3DUNet	98	NA	DSC	Tumor (median value): 83%
Kano Y 2021 (23)	2DUNet and 3DUNet	98	NA	HD	Tumor (median value): 4.7
Lu P 2022 (24)	AugMS-Net	NA	894	DSC	Tumor: 87.21%
Lu P 2022 (24)	AugMS-Net	NA	894	HD	Tumor: 0.7012
Lin YC 2020 (25)	UNet	169	NA	DSC	Tumor: 82%
Rodríguez Outeiral R 2023 (29)	3D nnUNet	195	524	DSC	GTV: 73%
Rodríguez Outeiral R 2023 (29)	3D nnUNet	195	524	95% HD	GTV: 6.8
Jin S 2023 (30)	UNet++	228	NA	DSC	CTV: 78.6%
Jin S 2023 (30)	UNet++	228	NA	95% HD	CTV: 3.779
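Zabihollahy F 2021 (64)	3D UNet and 3D Dense UNet	181	283	DSC	Bladder: 93%
Zabihollahy F 2021 (64)	3D UNet and 3D Dense UNet	181	283	DSC	Rectum: 87%
Zabihollahy F 2021 (64)	3D UNet and 3D Dense UNet	181	283	DSC	Sigmoid colon: 80%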
Zabihollahy F 2022 (65)	2D Attention UNet and 3D UNet	125	213	DSC	CTV: 85%
Zabihollahy F 2022 (65)	2D Attention UNet and 3D UNet	125	213	95% HD	CTV: 3.7
Gou S 2022 (69)	MVFA-Net	160	160	DSC	Tumor: 74.4%
Gou S 2022 (69)	MVFA-Net	160	160	95% HD	Tumor: 11.18
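Huang S 2021 (70)	Improvement of UNet	87	84	DSC	Bladder: 89.8%
Huang S 2021 (70)	Improvement of UNet	87	84	DSC	Rectum: 78.1%
Huang S 2021 (70)	Improvement of UNet	87	84	95% HD	Bladder: 8.738
Huang S 2021 (70)	Improvement of UNet	87	84	95% HD	Rectum: 11.775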

MR, magnetic resonance; NA, not applicable; DSC, dice similarity coefficient; HD, Hausdorff distance; GTV, gross target volume; CTV, clinical target volume.

Cervical cancer segmentation based on PET images

18F-FDG is often used as a radiotracer during PET imaging to reveal metabolic activities in vivo (72). When patients are injected with 18F-FDG, PET scanners can construct images that reflect the distribution of 18F-FDG in the human body. Since tumor cells are more metabolically active than normal cells, tumor areas in the patient’s body will be more easily observed in PET images, reflecting the shape and size of tumors. 18F-FDG has been widely used for the diagnosis and staging of tumors (73). Several studies have used 18F-FDG PET for radiation therapy planning for different types of tumors (74,75). While CT and MR images provide detailed anatomical information, they may not be sufficient for accurately distinguishing between tumors and normal tissues based on metabolic activity. Therefore, PET imaging, either alone or in combination with CT and MR imaging, has been widely used for the segmentation of cervical cancer.

Tumor volumes in PET images are influenced by threshold selection (76), which interferes with the outlining of tumors. Erlich et al. (77) evaluated semi-automatic segmentation algorithms for GTV delineation in cervical cancer patients using PET images. They compared metabolic PET-derived volumes with MR-based anatomical volumes using three different threshold definitions, GTV2SD, GTV40% and GTV50%. GTV2SD was defined as the pixels whose intensity exceeded the mean liver intensity plus 2 standard deviations, while GTV40% and GTV50% were defined as the pixels with at least 40% and 50% of the maximum tumor intensity, respectively. The comparison showed that the GTV2SD method is more accurate for PET tumor segmentation in cervical cancer patients. Monteiro et al. (78) verified the effectiveness of the region growing algorithm for segmentation of cervical tumor regions in PET images. They combined the information from PET, CT, and MR images. First, the MR images were aligned with the PET/CT images by affine registration to ensure the correspondence between each patient's images. Then, a region growing algorithm was used to segment the tumor regions. To improve the accuracy and reliability of the segmentation results, a multi-criteria decision process was implemented by applying four classifiers, KNN, LDA, PROAFTN (79) and Naive Bayes, to different image pattern combinations. Recently, Baydoun et al. (80) implemented automatic segmentation of cervical tumors on small datasets of PET/MR images using shallow UNet networks, invoking the concepts of focus and sequential training to achieve accurate segmentation on small datasets.
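The dependence of PET tumor volume on the threshold can be seen directly in a small sketch of the three threshold rules. Two assumptions are made here and should be read as such: the 2SD rule is interpreted as a cutoff at the mean liver uptake plus two standard deviations (population standard deviation), and the percentage rules take the maximum over the supplied region, which is assumed to be already cropped around the tumor. Function names are hypothetical and do not come from Erlich et al. (77).

```python
import statistics


def threshold_fraction(pet, fraction):
    """Keep pixels with at least `fraction` of the maximum uptake.

    pet is a 2D list of uptake values, assumed cropped to the tumor
    region so that the maximum corresponds to the tumor.
    """
    peak = max(max(row) for row in pet)
    return [[1 if v >= fraction * peak else 0 for v in row] for row in pet]


def threshold_liver_2sd(pet, liver_values):
    """Keep pixels above mean liver uptake plus two standard deviations.

    liver_values is a flat list of uptake values sampled from a liver
    reference region (an assumed interpretation of the 2SD rule).
    """
    cutoff = (statistics.mean(liver_values)
              + 2 * statistics.pstdev(liver_values))
    return [[1 if v > cutoff else 0 for v in row] for row in pet]


def volume(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)
```

On the same image, `threshold_fraction(pet, 0.4)` never yields a smaller region than `threshold_fraction(pet, 0.5)`, which shows how the choice of threshold alone changes the delineated volume.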

PET images are highly sensitive to the metabolic activity of tissues and organs, including the bladder, which can interfere with accurate segmentation of cervical tumors. To overcome this issue, several studies have focused on separating the bladder from the tumor before segmentation. One approach is to use morphological operations, region growing, and other image processing methods to remove the influence of the bladder (81). Another approach is to construct a hyper-image by combining CT and PET images to depict rough tumor regions based on tissue specificity. Additionally, gradient information from the hyper-image can be introduced into the level set function to reduce the influence of the bladder on tumor segmentation (82). Mu et al. (83) segmented cervical tumors by adding intensity and gradient field information to the level set framework after Gaussian filtering of PET images. They also investigated the effect of image texture features on cervical cancer staging and verified that texture features are highly correlated with tumor staging and can provide valuable prognostic information. Overall, their methods offer a more precise and comprehensive approach to cervical tumor segmentation. Chen et al. (84) segmented cervical tumors in PET images with a two-stage method. First, coarse segmentation was performed by measuring voxel local similarity using a graph cut method. Then, the initial segmentation was refined using a similarity-based variational model. This refinement exploited the fact that the tumor shape and location do not vary much between adjacent slices, which helped to improve the accuracy of the final segmentation results. The authors then (85) proposed a prior information constrained (PIC)-spatial information embedded CNN model (S-CNN), which incorporates prior knowledge into the CNN model to separate the bladder and cervical tumor before segmenting the cervical tumor. Since roundness has been shown to be a feature of cervical tumors in previous studies (86-88) and the relative location of the bladder and cervical tumor is fixed, with the bladder always in front of the cervical tumor, adding this prior knowledge to the CNN model reduces the influence of the bladder on subsequent cervical tumor segmentation. Iantsen et al. (22), on the other hand, separated the bladder from the cervical tumor manually. They determined the volume of the cervical tumor using the FLAB algorithm, which was used as the gold standard. Then, they used a modified 3D UNet network to achieve automatic segmentation of the cervical tumor, in which the max pooling operation was replaced with downsampling and a concurrent spatial and channel squeeze and excitation (scSE) module was included in the residual block. The original UNet network was thus optimized, and the model applicability was assessed using five-fold cross-validation.

Current challenges and limitations

The main treatment modality for middle and advanced cervical cancer involves a combination of external pelvic irradiation therapy, intracavitary brachytherapy, and chemotherapy. In brachytherapy, precise positioning of the tumor area and organs at risk is critical for doctors to determine the appropriate radiation dose and develop accurate outlines for the CTV and organs at risk. This process enables reasonable dose planning that can deliver a high radiation dose to the tumor area while minimizing damage to healthy surrounding organs. Failure to achieve precise positioning can result in deviations in plan evaluation and dose projection, negatively impacting the effectiveness of tumor treatment.

The segmentation of cervical tumors and organs at risk is based on image data from three modalities: CT, MR, and PET, and the segmentation methods are broadly classified into two categories: deep learning segmentation methods based on UNet networks and their variants, and traditional methods that rely on registration and atlas for segmentation.

Comparison between the atlas-based and deep learning based automatic segmentation

Berendsen et al. (53) used a single-atlas approach to segment the bladder in cervical cancer patients by combining global rigid registration and local registration. However, the volume and location differences between the target image and the atlas image were too large, which significantly decreased the resulting segmentation accuracy. Multi-atlas registration can solve the problems that arise from single-atlas registration (52,54). It relies more on the topology established between the moving image and the target image (17). Since the soft tissue resolution of CT images is low, atlas-based segmentation is more dependent on the image quality. Rhee et al. (49) compared the results of cervical cancer segmentation using the atlas method with those of the deep learning method. The accuracy of atlas-based segmentation of cervical cancer CT images is relatively low compared with that of deep learning methods, among which U- and V-shaped networks and their variants are the most frequently used networks for cervical cancer segmentation (35-41,43-47). Additionally, UNet can be trained from scratch to achieve accurate segmentation results with very little labeled training data, which is important for medical image segmentation: acquiring medical image datasets may involve patient privacy and other issues, and medical image annotation is time-consuming and laborious, so medical image datasets are generally small and difficult to obtain.

MR imaging can distinguish between normal soft tissue and tumor-infiltrated soft tissue and can characterize deformable structures with excellent visualization. MR images of cervical cancer can be segmented using an atlas-registration method. This process involves a transformation between the two images. In atlas-based automatic segmentation methods, the segmented structures from the atlas library can be propagated to the target image using a deformable image registration algorithm. Deformable image registration methods can be divided into two categories: intensity-based methods, which rely on image gray values, and feature-based methods, which rely on image features. Considering the effect of cervical tumor regression during the treatment process, feature-based methods give better results than intensity-based methods in the registration of cervical cancer images (89). Lu et al. (56) proposed a maximum a posteriori (MAP) framework based on a non-rigid registration that considers both organ surface registration and intensity matching, and the final model segmentation achieved an accuracy comparable to manual segmentation.

Existing segmentation accuracy of CTV and OARs

The most common way to assess cervical cancer image segmentation is to compare the results with ground truth contours using quantitative measures such as DSC and HD. Tables 6-8 list the papers that use DSC/HD as an assessment metric, together with their segmentation results.
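For reference, the two metrics can be computed as follows. The DSC is twice the overlap divided by the sum of the two region sizes, and the HD is the largest distance from any point of one set to the nearest point of the other, taken in both directions. The sketch below operates on 2D binary masks and, as a simplification, uses all foreground pixels rather than only boundary pixels; the pixel spacing parameter and function names are assumptions for illustration.

```python
import math


def mask_points(mask):
    """Coordinates (row, col) of all foreground pixels in a binary mask."""
    return [(y, x) for y, row in enumerate(mask)
            for x, v in enumerate(row) if v]


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    p, t = set(mask_points(pred)), set(mask_points(truth))
    if not p and not t:
        return 1.0  # both empty: treated as perfect agreement
    return 2 * len(p & t) / (len(p) + len(t))


def hausdorff(pred, truth, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between two non-empty binary masks.

    spacing gives the physical pixel size along (row, col), so the
    result is in the same unit as the spacing.
    """
    p = [(y * spacing[0], x * spacing[1]) for y, x in mask_points(pred)]
    t = [(y * spacing[0], x * spacing[1]) for y, x in mask_points(truth)]
    if not p or not t:
        raise ValueError("both masks must contain foreground pixels")

    def directed(a, b):
        return max(min(math.dist(i, j) for j in b) for i in a)

    return max(directed(p, t), directed(t, p))
```

The 95% HD reported in several of the tables replaces the maximum with the 95th percentile of the nearest-point distances, which makes the metric less sensitive to a few outlying pixels. Since the HD is a distance, its value depends on the unit and on the voxel spacing, which should be kept in mind when comparing values across studies.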

Table 6

Summary of performance of papers using DSC/HD as an evaluation metric for segmentation of the bladder

Author, year	Method	Number of patients	Data volume	Mode of image	Evaluation metrics	Result
Liu Z 2020 (6)	DpnUNet	237	22,356	CT	DSC	91%
Liu Z 2020 (6)	DpnUNet	237	22,356	CT	HD	4.05
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	DSC	91%
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	HD	7.82
Kim N 2020 (17)	Atlas	75	NA	CT	DSC	54%
Kim N 2020 (17)	Atlas	75	NA	CT	HD	60.2
Li Y 2022 (18)	Atlas	140	140	CT	DSC	86.6%
Li Y 2022 (18)	Atlas	140	140	CT	HD	1.591
Sartor H 2020 (19)	3D Full Convolutional Network	75 cases of cervical cancer and 191 cases of anorectal	NA	CT	DSC	83%
Wang J 2023 (26)	U-shaped network	60	NA	CT	DSC	94%
Wang J 2023 (26)	U-shaped network	60	NA	CT	HD	4.52
Chung SY 2023 (27)	3D EfficientNet-B0	182	NA	CT	DSC	88%
Chung SY 2023 (27)	3D EfficientNet-B0	182	NA	CT	HD	6.93
Mohammadi R 2021 (35)	ResUNet	113	NA	CT	DSC	95.7%
Mohammadi R 2021 (35)	ResUNet	113	NA	CT	HD	4.05
Liu Z 2020 (36)	Improvement of UNet	105	NA	CT	DSC	92.4%
Liu Z 2020 (36)	Improvement of UNet	105	NA	CT	HD	5.098
Zhang D 2020 (38)	DSD-UNet	91	91	CT	DSC	86.9%
Zhang D 2020 (38)	DSD-UNet	91	91	CT	HD	12.1
Chang JH 2021 (43)	3D UNet and LSTM	51	136	CT	DSC	86%
Ding Y 2022 (44)	3DV-net	130	NA	CT	DSC	94%
Ding Y 2022 (44)	3DV-net	130	NA	CT	HD	4.52
Ju Z 2020 (45)	Dense V-Network	190	NA	CT	DSC	95%
Ju Z 2020 (45)	Dense V-Network	190	NA	CT	HD	0.65
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	DSC	89%
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	HD	1.07
Jiang X 2021 (50)	RefineNet	200	NA	CT	DSC	86.0%
Jiang X 2021 (50)	RefineNet	200	NA	CT	HD	19.981
Xiao C 2022 (51)	RefineNetPlus3D	313	44,222	CT	DSC	97%
Berendsen FF 2011 (53)	Atlas	17	NA	MR	DSC	67%
Berendsen FF 2013 (54)	Registration	17	84	MR	DSC	73%
Berendsen FF 2013 (54)	Registration	17	84	MR	HD	20
Bnouni N 2020 (63)	Synergetic Multiplex Network (SMN)	15	NA	MR	DSC	95.75%
Zabihollahy F 2021 (64)	3D UNet and 3D Dense UNet	181	283	MR	DSC	93%
Huang S 2021 (70)	Improvement of UNet	87	84	MR	DSC	89.8%
Huang S 2021 (70)	Improvement of UNet	87	84	MR	95% HD	8.738

DSC, dice similarity coefficient; HD, Hausdorff distance; CT, computed tomography; NA, not applicable; MR, magnetic resonance.

Table 7

Summary of performance of papers using DSC/HD as an evaluation index for segmentation of CTV in cervical cancer

Author, year	Method	Number of patients	Data volume	Mode of image	Evaluation metrics	Result
Liu Z 2020 (6)	DpnUNet	237	22,356	CT	DSC	CTV: 86%
Liu Z 2020 (6)	DpnUNet	237	22,356	CT	HD	CTV: 5.34
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	DSC	CTV: 86%
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	HD	CTV: 14.84
Kim N 2020 (17)	Atlas	75	NA	CT	DSC	CTV: 79%
Kim N 2020 (17)	Atlas	75	NA	CT	HD	CTV: 19.7
Li Y 2022 (18)	Atlas	140	140	CT	DSC	CTV: 81.6%
Li Y 2022 (18)	Atlas	140	140	CT	HD	CTV: 2.195
Beekman C 2022 (20)	3D UNet	84	NA	CT	DSC	CTV: 87%
Wang J 2022 (21)	U-shaped network	375	NA	CT	DSC	CTV: 77%
Wang J 2022 (21)	U-shaped network	375	NA	CT	95% HD	CTV: 5.81
Wang J 2023 (26)	U-shaped network	60	NA	CT	DSC	HRCTV: 87%
Wang J 2023 (26)	U-shaped network	60	NA	CT	HD	HRCTV: 1.45
Chung SY 2023 (27)	3D EfficientNet-B0	180	NA	CT	DSC	CTV: 80%
Chung SY 2023 (27)	3D EfficientNet-B0	180	NA	CT	HD	CTV: 13
Huang M 2023 (28)	Improved MNet	53	5,438	CT	DSC	CTV: 88.28%
Huang M 2023 (28)	Improved MNet	53	5,438	CT	95% HD	CTV: 3.2013
Jin S 2023 (30)	UNet++	228	NA	MR	DSC	CTV: 78.6%
Jin S 2023 (30)	UNet++	228	NA	MR	95% HD	CTV: 3.779
Liu Z 2021 (37)	Improvement of DpnUNet	237	NA	CT	DSC	CTV: 88%
Liu Z 2021 (37)	Improvement of DpnUNet	237	NA	CT	HD	CTV: 3.46
Zhang D 2020 (38)	DSD-UNet	91	91	CT	DSC	HRCTV: 82.9%
Zhang D 2020 (38)	DSD-UNet	91	91	CT	HD	HRCTV: 8.1
Shi J 2021 (39)	RA-CTVNet	462	NA	CT	DSC	CTV: 79.2%
Yi H 2021 (40)	GML	87	NA	CT	DSC	CTV: 81.6%
Yi H 2021 (40)	GML	87	NA	CT	HD	CTV: 5.672
Chang Y 2021 (41)	Improved 3D UNet	400	NA	CT	DSC	CTV: 88.2%
Chang Y 2021 (41)	Improved 3D UNet	400	NA	CT	95% HD	CTV: 6.853
Chang JH 2021 (43)	3D UNet and LSTM	51	136	CT	DSC	HRCTV: 87%
Ding Y 2022 (44)	3DV-net	130	NA	CT	DSC	CTV: 85%
Ding Y 2022 (44)	3DV-net	130	NA	CT	HD	CTV: 11.2
Ju Z 2021 (46)	Dense V-Net	133	NA	CT	DSC	CTV: 82%
Ju Z 2021 (46)	Dense V-Net	133	NA	CT	HD	CTV: 1.86
Ma CY 2022 (47)	VB-Net	535	NA	CT	DSC	CTV: 70%
Ma CY 2022 (47)	VB-Net	535	NA	CT	HD	CTV: 22.44
Ma CY 2022 (48)	Registration and VB-Net	107	NA	CT	DSC	CTV: 89%
Ma CY 2022 (48)	Registration and VB-Net	107	NA	CT	HD	CTV: 6.14
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	DSC	CTV: 86%
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	HD	CTV: 2.02
Jiang X 2021 (50)	RefineNet	200	NA	CT	DSC	CTV: 86.1%
Jiang X 2021 (50)	RefineNet	200	NA	CT	HD	CTV: 6.005
Xiao C 2022 (51)	RefineNetPlus3D	313	44,222	CT	DSC	CTV: 82%
Berendsen FF 2013 (54)	Registration	17	84	MR	DSC	CTV: 57%
Berendsen FF 2013 (54)	Registration	17	84	MR	HD	CTV: 36
Mu W 2015 (83)	FCM-LSGF	42	NA	PET	DSC	CTV: 91.78%
Mu W 2015 (83)	FCM-LSGF	42	NA	PET	HD	CTV: 7.94

DSC, dice similarity coefficient; HD, Hausdorff distance; CTV, clinical target volume; NA, not applicable; CT, computed tomography; MR, magnetic resonance; HRCTV, high-risk clinical target volume; PET, positron emission tomography.

Table 8

Summary of performance of papers using DSC/HD as an evaluation index for segmentation of rectum in cervical cancer

Author, year	Method	Number of patients	Data volume	Mode of image	Evaluation metrics	Result
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	DSC	81%
Wang Z 2020 (15)	3D ResUNet	125	NA	CT	HD	7.04
Li Y 2022 (18)	Atlas	140	140	CT	DSC	68.5%
Li Y 2022 (18)	Atlas	140	140	CT	HD	2.508
Wang J 2023 (26)	U-shaped network	60	NA	CT	DSC	86%
Wang J 2023 (26)	U-shaped network	60	NA	CT	HD	2.52
Mohammadi R 2021 (35)	ResUNet	113	NA	CT	DSC	96.6%
Mohammadi R 2021 (35)	ResUNet	113	NA	CT	HD	1.96
Liu Z 2020 (36)	Improvement of UNet	105	NA	CT	DSC	79.1%
Liu Z 2020 (36)	Improvement of UNet	105	NA	CT	HD	5.949
Zhang D 2020 (38)	DSD-UNet	91	91	CT	DSC	82.1%
Zhang D 2020 (38)	DSD-UNet	91	91	CT	HD	9.2
Chang JH 2021 (43)	3D UNet and LSTM	51	136	CT	DSC	77%
Ding Y 2022 (44)	3DV-net	130	NA	CT	DSC	85%
Ding Y 2022 (44)	3DV-net	130	NA	CT	HD	4.35
Ju Z 2020 (45)	Dense V-Network	190	NA	CT	DSC	87%
Ju Z 2020 (45)	Dense V-Network	190	NA	CT	HD	0.79
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	DSC	81%
Rhee DJ 2020 (49)	3DV-net and 2D FCN-8s	NA	2,254	CT	HD	1.66
Jiang X 2021 (50)	RefineNet	200	NA	CT	DSC	85.8%
Jiang X 2021 (50)	RefineNet	200	NA	CT	HD	12.273
Xiao C 2022 (51)	RefineNetPlus3D	313	44,222	CT	DSC	91%
Zabihollahy F 2021 (64)	3-D UNet and 3-D Dense UNet	181	283	MR	DSC	87%
Huang S 2021 (70)	Improvement of UNet	87	84	MR	DSC	78.1%
Huang S 2021 (70)	Improvement of UNet	87	84	MR	HD	11.775

DSC, dice similarity coefficient; HD, Hausdorff distance; NA, not applicable; CT, computed tomography; LSTM, long short-term memory; MR, magnetic resonance.

For organs at risk segmentation in cervical cancer, the most widely used methods were improvements on the UNet architecture. Taking the bladder as an example (Table 6), the DSC of most methods exceeded 90%; the lowest DSCs came from software-based, mapping, registration, and fully convolutional methods, which may be related to differences in the size and shape of patients' bladders. The segmentation results for the CTV are somewhat worse than those for the bladder (Table 7), with DSC generally above 80%, which is due to the complex composition and blurred boundary of the CTV. The highest DSC for the CTV among these methods was achieved by Mu et al. (83), who used a level set approach combining intensity and gradient field information and reported a DSC of 91.78%. This is close to their accuracy in segmenting cervical tumors (82), because the CTV is obtained by expanding the GTV, which is the cervical tumor region. The level set method thus showed high accuracy in segmenting cervical cancer. In addition, 3D models generally achieve higher accuracy than 2D models in the segmentation of cervical cancer (20,37,41).

Squeeze-and-excitation (SE) blocks, residual blocks, and dilated convolution

The UNet-based methods used by researchers differ considerably from one another. Some fuse the UNet with another network so that the fused network combines the advantages of both single networks and segments better (45,46); the DSC of the fused model was 0.07 higher than that of the single model (46), although its training time was longer. Others embed components of other networks into the UNet, or replace the convolutional layers in UNet and V-Net, to cope with the unclear boundaries of the cervical cancer CTV and with organs at risk that vary in size, shape, and location (6,35,37,39,41,43,44,47).

Using SE blocks and residual blocks in a UNet can improve the model's ability to extract informative features (15,35,38-41). An SE block learns the importance of each feature channel through a squeeze operation followed by an excitation operation, which allows the network to promote features that are useful for the current task and suppress less informative ones. A residual block helps avoid vanishing gradients, because the residual connection lets gradients flow directly through the block, a path that is otherwise weakened in deep neural networks. Residual blocks also help preserve image detail, especially at the boundaries of abnormal cells, by letting the model learn residual mapping functions that capture fine-grained detail. In addition to SE blocks and residual blocks, dilated convolution is widely used in cervical cancer segmentation to improve feature extraction. During segmentation, the feature maps are progressively downsampled to capture semantic context at different image scales. Downsampling also enlarges the receptive field, which provides the broader context needed to localize structures in the image. However, downsampling reduces image resolution and discards fine detail. Dilated convolution was proposed to address this: it enlarges the receptive field without increasing the number of parameters. When used in bottleneck structures, dilated convolution allows the model to better handle multi-scale information, preserve image resolution, and identify cervical cancer boundaries (36,38,40). Experiments by Mohammadi et al. (35), Shi et al. (39), and others showed that adding residual structures and SE blocks can significantly improve the model's ability to segment the organs at risk in cervical cancer.
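The squeeze, excitation, and scaling steps of an SE block, and the way dilation enlarges the area covered by a kernel without adding weights, can be sketched in a few lines. This is a minimal illustration in plain Python, not the implementation used in any of the cited studies; the function names, the omission of bias terms, and the toy weights are assumptions made for the example.

```python
import math


def se_recalibrate(feature_maps, w1, w2):
    """Squeeze-and-excitation over a list of 2D channel maps.

    feature_maps: list of C channels, each a list of rows of floats.
    w1: hidden x C weight matrix; w2: C x hidden weight matrix.
    Bias terms are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every channel by its gate, a value in (0, 1).
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_maps, gates)]
    return scaled, gates


def effective_kernel_size(kernel_size, dilation):
    """Extent covered by a dilated kernel along one axis.

    The number of weights per axis stays kernel_size; only the spacing grows.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Toy input: two 2x2 channels and hand-picked (hypothetical) weights.
maps = [[[1.0, 1.0], [1.0, 1.0]], [[4.0, 4.0], [4.0, 4.0]]]
scaled, gates = se_recalibrate(maps, w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
print(gates)                         # first channel promoted, second suppressed
print(effective_kernel_size(3, 2))   # a 3-wide kernel with dilation 2 spans 5
```

With these toy weights the first channel receives a gate close to 1 and the second a gate close to 0, which is the "promote useful, suppress less informative" behaviour described above. In a trained network the weights are learned rather than chosen by hand.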

Since there is currently no public dataset in the field of cervical cancer segmentation, we used a private dataset from our lab to verify the generalization of the residual and SE modules and to explore the application of transformer-based UNet networks. The dataset included CT images of 53 cervical cancer patients, divided into training, validation, and test sets at a ratio of 8:1:1. The experimental results are shown in Table 9. We reached conclusions similar to those of previous reports: adding either a residual structure or an SE module to UNet improves the accuracy of cervical cancer CTV segmentation. However, when the residual structure and SE module are added together, the segmentation result is slightly lower than that of ResUNet alone. Therefore, the effective combination of residual structure and SE module should be determined according to the nature of the dataset.
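An 8:1:1 split of 53 patients cannot be exact, so some rounding rule is needed. The sketch below shows one possible patient-level split; the rounding convention (floor for training and validation, remainder to the test set), the fixed seed, and the function name are assumptions for illustration and are not taken from the experiment described above.

```python
import random


def split_patients(patient_ids, ratios=(8, 1, 1), seed=0):
    """Shuffle patient IDs and split them into train/validation/test lists.

    Splitting by patient (not by slice) keeps all images of one patient
    in a single subset. Rounding: floor for train and validation, the
    remainder goes to the test set (an assumed convention).
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_val = len(ids) * ratios[1] // total
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


train_ids, val_ids, test_ids = split_patients(range(53))
print(len(train_ids), len(val_ids), len(test_ids))  # 42 5 6 under this rounding rule
```

With only five or six patients in the validation and test sets, differences in the third decimal place of DSC between models should be read with caution.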

Table 9

Comparison of the CTV segmentation accuracy on our dataset by selected methods (mean)

Model	DSC	HD95
UNet	0.8670	3.9143
Res UNet	0.8759	3.1488
SE UNet	0.8727	3.4188
SE Res UNet	0.8750	3.3923
Swin UNet	0.8466	4.6447

CTV, clinical target volume; DSC, dice similarity coefficient; HD95, 95th percentile Hausdorff distance; SE, squeeze-and-excitation.

High precision of segmentation and low clinical availability

Evaluation metrics such as mean DSC and HD are objective and measure how closely the geometry of a segmentation result matches the ground truth; they are reproducible but do not incorporate physician judgment (37). Because the CTV boundaries of cervical cancer depend on the definition of other tissues and organs in a given region, they are not clear, and the CTV may contain regions such as lymph nodes. These regions are small but clinically important, yet metrics such as DSC and HD weight them the same as any other tissue. A model can therefore achieve good DSC/HD values in experiments while omitting important regions in clinical practice, which makes these metrics much less useful for assessing its accuracy and applicability in real clinical settings (19,37). Sartor et al. (19) reported a DSC of 0.82 for the CTV, but their qualitative results suggested that the CNN segmentation of the CTV was mostly unacceptable in clinical settings. Therefore, physician assessment and Turing tests are important for model evaluation.
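A small numerical example makes the limitation of DSC concrete. The sketch below uses a hypothetical 2D "CTV" made of a large main region and a small nodal region; the shapes, sizes, and names are invented for illustration. It compares a prediction that misses the whole nodal region with one that merely trims a few pixels from the edge of the main region.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two sets of pixel coordinates."""
    if not pred and not truth:
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


# Hypothetical ground truth: a 20x20 main region plus a 2x2 nodal region.
main_region = {(r, c) for r in range(20) for c in range(20)}
node_region = {(r, c) for r in range(30, 32) for c in range(30, 32)}
truth = main_region | node_region

# Prediction A omits the entire nodal region.
pred_missing_node = set(main_region)
# Prediction B keeps the node but trims four edge pixels of the main region.
pred_trimmed_edge = truth - {(0, 0), (0, 1), (0, 2), (0, 3)}

print(round(dice(pred_missing_node, truth), 4))  # 0.995
print(round(dice(pred_trimmed_edge, truth), 4))  # 0.995
```

Both predictions score the same DSC of about 0.995, although only the first leaves a clinically relevant structure entirely uncovered. DSC counts pixels and has no notion of which pixels matter, which is why qualitative review by physicians remains necessary.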

Inaccurate dose calculation

Besides geometric indexes such as DSC and HD, the evaluation indexes for cervical cancer target areas and organs at risk include dose indexes. Wang et al. (21) used dose metrics such as D_mean and V100 to measure the segmentation accuracy of the model, where D_mean was defined as the average dose received by the structure and V100 as the volume of the CTV receiving 100% of the prescribed dose. Their results showed that most of the segmented contours were accurate and the radiotherapy dose met the clinical requirements, but for the CTV the dose did not meet the clinical requirements and needed further correction by radiation oncologists. Chen et al. (42) examined the same question: they optimized treatment plans using automatically and manually segmented organ contours and compared the resulting doses to organs at risk and target areas. The dose distribution in the target area was unaffected when automatically segmented organ contours were used to design the treatment plan, whereas the effect of automatic segmentation on OAR dose was complex and required physician modification where necessary. This suggests that, compared with standard manual contours, DL-based methods do not yet reliably produce accurate dosimetric endpoints in cohorts of cervical cancer patients.
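The two dose metrics defined above can be computed directly from a dose grid and a structure mask. The following sketch uses flattened voxel lists and invented dose values; expressing V100 as the percentage of the structure's voxels receiving at least the prescribed dose is an assumption of this example, as are the function names.

```python
def _doses_inside(dose, mask):
    """Return the dose values of voxels that belong to the structure."""
    values = [d for d, inside in zip(dose, mask) if inside]
    if not values:
        raise ValueError("structure mask is empty")
    return values


def d_mean(dose, mask):
    """Average dose received by the structure."""
    values = _doses_inside(dose, mask)
    return sum(values) / len(values)


def v100(dose, mask, prescribed_dose):
    """Percentage of the structure receiving at least the prescribed dose."""
    values = _doses_inside(dose, mask)
    covered = sum(1 for d in values if d >= prescribed_dose)
    return 100.0 * covered / len(values)


# Hypothetical dose per voxel (Gy) and two contours of the same target.
dose = [50.0, 52.0, 50.0, 44.0, 20.0, 10.0]
manual_mask = [True, True, True, False, False, False]
auto_mask = [True, True, True, True, False, False]  # one extra voxel included

print(v100(dose, manual_mask, prescribed_dose=50.0))  # 100.0
print(v100(dose, auto_mask, prescribed_dose=50.0))    # 75.0
print(d_mean(dose, auto_mask))                        # 49.0
```

In this toy case a single extra voxel in the automatic contour lowers V100 from 100% to 75%, which illustrates why a contour that looks geometrically close can still fail a dosimetric criterion.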

No public dataset

Since there are few studies on brachytherapy for cervical cancer, we found no publicly released imaging datasets acquired during brachytherapy for cervical cancer in the published papers, and all experiments have been performed with private datasets. Moreover, because there is no uniform clinical standard for CTV and organs at risk segmentation in cervical cancer, oncologists rely on personal experience, which leads to different manual segmentation results (90,91). In addition, differences in imaging equipment parameters and image acquisition protocols, together with the diversity of tumor staging across centers, make the data variable. Chang et al. (41) found that data from the same hospital varied less than data from different hospitals and that too much variation in the dataset made it difficult for the model to generalize; consistent segmentation criteria and a common dataset are therefore important for measuring the accuracy of models.

Conclusions

This paper summarizes segmentation methods for cervical cancer and organs at risk based on images from three modalities, and introduces the segmentation methods applicable to each modality and the improvements made between methods. Accurate segmentation of cervical cancer target areas and organs at risk requires extensive clinical experience. In practical applications, computer-aided segmentation that helps doctors obtain accurate contours of the CTV and organs at risk can greatly reduce their annotation workload and variability, making radiotherapy planning more effective and reliable and avoiding irreversible damage to patients caused by overtreatment.

Although many methods have been proposed for CTV segmentation in cervical cancer, the complexity of CTV composition and limited image quality have prevented higher accuracy. Among these methods, the best segmentation results were achieved by those based on improved level sets. In addition, because the datasets were not uniform, the models proposed in different articles could not be meaningfully compared; it is therefore important to establish a standard, uniform public dataset for cervical cancer and organs at risk segmentation in the future.

Acknowledgments

Funding: This work was supported by the National Natural Science Foundation of China (grant No. U20A20373).

Footnote

Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-369/rc

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-369/coif). All authors report that this work was supported by the National Natural Science Foundation of China (grant No. U20A20373). The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

Hellner K, Münger K. Human papillomaviruses as therapeutic targets in human cancer. J Clin Oncol 2011;29:1785-94. [Crossref] [PubMed]
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49. [Crossref] [PubMed]
Karlsson L, Thunberg P, With A, Mordhorst LB, Persliden J. 3D image-based adapted high-dose-rate brachytherapy in cervical cancer with and without interstitial needles: measurement of applicator shift between imaging and dose delivery. J Contemp Brachytherapy 2017;9:52-8. [Crossref] [PubMed]
van de Bunt L, van der Heide UA, Ketelaars M, de Kort GA, Jürgenliemk-Schulz IM. Conventional, conformal, and intensity-modulated radiation therapy treatment planning of external beam radiotherapy for cervical cancer: The impact of tumor regression. Int J Radiat Oncol Biol Phys 2006;64:189-96. [Crossref] [PubMed]
Chargari C, Peignaux K, Escande A, Renard S, Lafond C, Petit A, Lam Cham Kee D, Durdux C, Haie-Méder C. Radiotherapy of cervical cancer. Cancer Radiother 2022;26:298-308. [Crossref] [PubMed]
Liu Z, Liu X, Guan H, Zhen H, Sun Y, Chen Q, Chen Y, Wang S, Qiu J. Development and validation of a deep learning algorithm for auto-delineation of clinical target volume and organs at risk in cervical cancer radiotherapy. Radiother Oncol 2020;153:172-9. [Crossref] [PubMed]
Sun R, Mazeron R, Chargari C, Barillot I. CTV to PTV in cervical cancer: From static margins to adaptive radiotherapy. Cancer Radiother 2016;20:622-8. [Crossref] [PubMed]
Ghose S, Holloway L, Lim K, Chan P, Veera J, Vinod SK, Liney G, Greer PB, Dowling J. A review of segmentation and deformable registration methods applied to adaptive cervical cancer radiation therapy treatment planning. Artif Intell Med 2015;64:75-87. [Crossref] [PubMed]
Yang C, Qin LH, Xie YE, Liao JY. Deep learning in CT image segmentation of cervical cancer: a systematic review and meta-analysis. Radiat Oncol 2022;17:175. [Crossref] [PubMed]
Zaki N, Qin W, Krishnan A. Graph-based methods for cervical cancer segmentation: Advancements, limitations, and future directions. AI Open 2023;4:42-55.
Kusmirek J, Robbins J, Allen H, Barroilhet L, Anderson B, Sadowski EA. PET/CT and MRI in the imaging assessment of cervical cancer. Abdom Imaging 2015;40:2486-511. [Crossref] [PubMed]
Dutta S, Nguyen NP, Vock J, Kerr C, Godinez J, Bose S, Jang S, Chi A, Almeida F, Woods W, Desai A, David R, Karlsson UL, Altdorfer G; International Geriatric Radiotherapy Group. Image-guided radiotherapy and -brachytherapy for cervical cancer. Front Oncol 2015;5:64. [Crossref] [PubMed]
Putri ER, Nasrulloh AV, Fahrudin AE. Coloring of Cervical Cancer’s Ct Images to Localize Cervical Cancer. International Journal of Electrical and Computer Engineering 2015;5:304-10.
Purwono RRPA, Purwanti E, Rulaningtyas R. Segmentation of cervical cancer CT-scan images using K-nearest neighbors method. AIP Conf Proc 2020;2314:040009.
Wang Z, Chang Y, Peng Z, Lv Y, Shi W, Wang F, Pei X, Xu XG. Evaluation of deep learning-based auto-segmentation algorithms for delineating clinical target volume and organs at risk involving data for 125 cervical cancer patients. J Appl Clin Med Phys 2020;21:272-9. [Crossref] [PubMed]
Langerak T, Heijkoop S, Quint S, Mens JW, Heijmen B, Hoogeman M. Towards automatic plan selection for radiotherapy of cervical cancer by fast automatic segmentation of cone beam CT scans. Med Image Comput Comput Assist Interv 2014;17:528-35.
Kim N, Chang JS, Kim YB, Kim JS. Atlas-based auto-segmentation for postoperative radiotherapy planning in endometrial and cervical cancers. Radiat Oncol 2020;15:106. [Crossref] [PubMed]
Li Y, Wu W, Sun Y, Yu D, Zhang Y, Wang L, Wang Y, Zhang X, Lu Y. The clinical evaluation of atlas-based auto-segmentation for automatic contouring during cervical cancer radiotherapy. Front Oncol 2022;12:945053. [Crossref] [PubMed]
Sartor H, Minarik D, Enqvist O, Ulén J, Wittrup A, Bjurberg M, Trägårdh E. Auto-segmentations by convolutional neural network in cervical and anorectal cancer with clinical structure sets as the ground truth. Clin Transl Radiat Oncol 2020;25:37-45. [Crossref] [PubMed]
Beekman C, van Beek S, Stam J, Sonke JJ, Remeijer P. Improving predictive CTV segmentation on CT and CBCT for cervical cancer by diffeomorphic registration of a prior. Med Phys 2022;49:1701-11. [Crossref] [PubMed]
Wang J, Chen Y, Xie H, Luo L, Tang Q. Evaluation of auto-segmentation for EBRT planning structures using deep learning-based workflow on cervical cancer. Sci Rep 2022;12:13650. [Crossref] [PubMed]
Iantsen A, Ferreira M, Lucia F, Jaouen V, Reinhold C, Bonaffini P, Alfieri J, Rovira R, Masson I, Robin P, Mervoyer A, Rousseau C, Kridelka F, Decuypere M, Lovinfosse P, Pradier O, Hustinx R, Schick U, Visvikis D, Hatt M. Convolutional neural networks for PET functional volume fully automatic segmentation: development and validation in a multi-center setting. Eur J Nucl Med Mol Imaging 2021;48:3444-56. [Crossref] [PubMed]
Kano Y, Ikushima H, Sasaki M, Haga A. Automatic contour segmentation of cervical cancer using artificial intelligence. J Radiat Res 2021;62:934-44. [Crossref] [PubMed]
Lu P, Fang F, Zhang H, Ling L, Hua K. AugMS-Net: Augmented multiscale network for small cervical tumor segmentation from MRI volumes. Comput Biol Med 2022;141:104774. [Crossref] [PubMed]
Lin YC, Lin CH, Lu HY, Chiang HJ, Wang HK, Huang YT, Ng SH, Hong JH, Yen TC, Lai CH, Lin G. Deep learning for fully automated tumor segmentation and extraction of magnetic resonance radiomics features in cervical cancer. Eur Radiol 2020;30:1297-305. [Crossref] [PubMed]
Wang J, Chen Y, Tu Y, Xie H, Chen Y, Luo L, Zhou P, Tang Q. Evaluation of auto-segmentation for brachytherapy of postoperative cervical cancer using deep learning-based workflow. Phys Med Biol 2023; [Crossref]
Chung SY, Chang JS, Kim YB. Comprehensive clinical evaluation of deep learning-based auto-segmentation for radiotherapy in patients with cervical cancer. Front Oncol 2023;13:1119008. [Crossref] [PubMed]
Huang M, Feng C, Sun D, Cui M, Zhao D. Segmentation of Clinical Target Volume From CT Images for Cervical Cancer Using Deep Learning. Technol Cancer Res Treat 2023;22:15330338221139164. [Crossref] [PubMed]
Rodríguez Outeiral R, González PJ, Schaake EE, van der Heide UA, Simões R. Deep learning for segmentation of the cervical cancer gross tumor volume on magnetic resonance imaging for brachytherapy. Radiat Oncol 2023;18:91. [Crossref] [PubMed]
Jin S, Xu H, Dong Y, Hao X, Qin F, Xu Q, Zhu Y, Cong F. Automatic cervical cancer segmentation in multimodal magnetic resonance imaging using an EfficientNet encoder in UNet++ architecture. International Journal of Imaging Systems and Technology 2023;33:362-77.
Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells W, Frangi A. editors. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, 2015:234-41.
Verma R, Kumar N, Patil A, Kurian NC, Rane S, Graham S, et al. MoNuSAC2020: A Multi-Organ Nuclei Segmentation and Classification Challenge. IEEE Trans Med Imaging 2021;40:3413-23. [Crossref] [PubMed]
Zunair H, Ben Hamza A. Sharp U-Net: Depthwise convolutional network for biomedical image segmentation. Comput Biol Med 2021;136:104699. [Crossref] [PubMed]
Zunair H, Hamza AB. Masked supervised learning for semantic segmentation. arXiv preprint arXiv: 2210.00923, 2022.
Mohammadi R, Shokatian I, Salehi M, Arabi H, Shiri I, Zaidi H. Deep learning-based auto-segmentation of organs at risk in high-dose rate brachytherapy of cervical cancer. Radiother Oncol 2021;159:231-40. [Crossref] [PubMed]
Liu Z, Liu X, Xiao B, Wang S, Miao Z, Sun Y, Zhang F. Segmentation of organs-at-risk in cervical cancer CT images with a convolutional neural network. Phys Med 2020;69:184-91. [Crossref] [PubMed]
Liu Z, Chen W, Guan H, Zhen H, Shen J, Liu X, Liu A, Li R, Geng J, You J, Wang W, Li Z, Zhang Y, Chen Y, Du J, Chen Q, Chen Y, Wang S, Zhang F, Qiu J. An Adversarial Deep-Learning-Based Model for Cervical Cancer CTV Segmentation With Multicenter Blinded Randomized Controlled Validation. Front Oncol 2021;11:702270. [Crossref] [PubMed]
Zhang D, Yang Z, Jiang S, Zhou Z, Meng M, Wang W. Automatic segmentation and applicator reconstruction for CT-based brachytherapy of cervical cancer using 3D convolutional neural networks. J Appl Clin Med Phys 2020;21:158-69. [Crossref] [PubMed]
Shi J, Ding X, Liu X, Li Y, Liang W, Wu J. Automatic clinical target volume delineation for cervical cancer in CT images using deep learning. Med Phys 2021;48:3968-81. [Crossref] [PubMed]
Yi H, Shi J, Yan B, Xue X, An H, Zhang H. Global Multi-Level Attention Network for the Segmentation of Clinical Target Volume In The Planning CT For Cervical Cancer. 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), Nice, France, 2021:230-3.
Chang Y, Wang Z, Peng Z, Zhou J, Pi Y, Xu XG, Pei X. Clinical application and improvement of a CNN-based autosegmentation model for clinical target volumes in cervical cancer radiotherapy. J Appl Clin Med Phys 2021;22:115-25. [Crossref] [PubMed]
Chen A, Chen F, Li X, Zhang Y, Chen L, Chen L, Zhu J. A Feasibility Study of Deep Learning-Based Auto-Segmentation Directly Used in VMAT Planning Design and Optimization for Cervical Cancer. Front Oncol 2022;12:908903. [Crossref] [PubMed]
Chang JH, Lin KH, Wang TH, Zhou YK, Chung PC. Image segmentation in 3d brachytherapy using convolutional LSTM. Journal of Medical and Biological Engineering 2021;41:636-51.
Ding Y, Chen Z, Wang Z, Wang X, Hu D, Ma P, Ma C, Wei W, Li X, Xue X, Wang X. Three-dimensional deep neural network for automatic delineation of cervical cancer in planning computed tomography images. J Appl Clin Med Phys 2022;23:e13566. [Crossref] [PubMed]
Ju Z, Wu Q, Yang W, Gu S, Guo W, Wang J, Ge R, Quan H, Liu J, Qu B. Automatic segmentation of pelvic organs-at-risk using a fusion network model based on limited training samples. Acta Oncol 2020;59:933-9. [Crossref] [PubMed]
Ju Z, Guo W, Gu S, Zhou J, Yang W. BMC Cancer 2021;21:243. [Crossref] [PubMed]
Ma CY, Zhou JY, Xu XT, Guo J, Han MF, Gao YZ, Du H, Stahl JN, Maltz JS. Deep learning-based auto-segmentation of clinical target volumes for radiotherapy treatment of cervical cancer. J Appl Clin Med Phys 2022;23:e13470. [Crossref] [PubMed]
Ma CY, Zhou JY, Xu XT, Qin SB, Han MF, Cao XH, Gao YZ, Xu L, Zhou JJ, Zhang W, Jia LC. Clinical evaluation of deep learning-based clinical target volume three-channel auto-segmentation algorithm for adaptive radiotherapy in cervical cancer. BMC Med Imaging 2022;22:123. [Crossref] [PubMed]
Rhee DJ, Jhingran A, Rigaud B, Netherton T, Cardenas CE, Zhang L, Vedam S, Kry S, Brock KK, Shaw W, O'Reilly F, Parkes J, Burger H, Fakie N, Trauernicht C, Simonds H, Court LE. Automatic contouring system for cervical cancer using convolutional neural networks. Med Phys 2020;47:5648-58. [Crossref] [PubMed]
Jiang X, Wang F, Chen Y, Yan S. RefineNet-based automatic delineation of the clinical target volume and organs at risk for three-dimensional brachytherapy for cervical cancer. Ann Transl Med 2021;9:1721. [Crossref] [PubMed]
Xiao C, Jin J, Yi J, Han C, Zhou Y, Ai Y, Xie C, Jin X. RefineNet-based 2D and 3D automatic segmentations for clinical target volume and organs at risks for patients with cervical cancer in postoperative radiotherapy. J Appl Clin Med Phys 2022;23:e13631. [Crossref] [PubMed]
Daly A, Yazid H, Solaiman B, Essoukri Ben Amara N. Multiatlas-based segmentation of female pelvic organs: Application for computer-aided diagnosis of cervical cancer. Int J Imaging Syst Technol 2021;31:302-12.
Berendsen FF, van der Heide UA, Langerak TR, Kotte ANTJ, Pluim JPW. Segmentation of Cervical Images by Inter-subject Registration with a Statistical Organ Model. In: Yoshida H, Sakas G, Linguraru MG. editors. Abdominal Imaging. Computational and Clinical Applications. ABD-MICCAI 2011. Lecture Notes in Computer Science, vol 7029. Springer, Berlin, Heidelberg, 2011:240-7.
Berendsen FF, van der Heide UA, Langerak TR, Kotte ANTJ, Pluim JPW. Free-form image registration regularized by a statistical shape model: application to organ segmentation in cervical MR. Comput Vis Image Underst 2013;117:1119-27.
Lu C, Duncan JS. A Non-rigid Registration Framework That Accommodates Pathology Detection. In: Suzuki K, Wang F, Shen D, Yan P. editors. Machine Learning in Medical Imaging. MLMI 2011. Lecture Notes in Computer Science, vol 7009. Springer, Berlin, Heidelberg, 2011:83-90.
Lu C, Chelikani S, Jaffray DA, Milosevic MF, Staib LH, Duncan JS. Simultaneous nonrigid registration, segmentation, and tumor detection in MRI guided cervical cancer radiation therapy. IEEE Trans Med Imaging 2012;31:1213-27. [Crossref] [PubMed]
Kao Y, Li W, Xue H, Ren C, Tian J. An automatic tumor segmentation framework of cervical cancer in T2-weighted and diffusion weighted magnetic resonance images. Medical Imaging 2013. Image Processing. SPIE 2013;8669:905-12.
Garg S, Urooj S, Vijay R. Detection of cervical cancer by using thresholding & watershed segmentation. 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 2015:555-9.
Su Y, Sun W, Shi Y, Han F, Ma H, Kang Y. A Globally Adaptive Region Growing Method for Cervical Tumor Segmentation Based on MR Images. 2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Chongqing, China, 2019:1-6.
Khoulqi I, Idrissi N. Segmentation and classification of cervical cancer. 2020 IEEE 6th International Conference on Optimization and Applications (ICOA), Beni Mellal, Morocco, 2020:1-7.
Torheim T, Malinen E, Hole KH, Lund KV, Indahl UG, Lyng H, Kvaal K, Futsaether CM. Autodelineation of cervical cancers using multiparametric magnetic resonance imaging and machine learning. Acta Oncol 2017;56:806-12. [Crossref] [PubMed]
Bnouni N, Rekik I, Rhim MS, Amara NEB. Dynamic Multi-scale CNN Forest Learning for Automatic Cervical Cancer Segmentation. In: Shi Y, Suk HI, Liu M. editors. Machine Learning in Medical Imaging. MLMI 2018. Lecture Notes in Computer Science, Springer, 2018;11046:19-27.
Bnouni N, Rekik I, Rhim MS, Ben Amara NE. Context-Aware Synergetic Multiplex Network for Multi-organ Segmentation of Cervical Cancer MRI. In: Rekik I, Adeli E, Park SH, Valdés Hernández MdC. editors. Predictive Intelligence in Medicine. PRIME 2020. Lecture Notes in Computer Science, Springer, 2020;12329:1-11.
Zabihollahy F, Viswanathan AN, Schmidt EJ, Morcos M, Lee J. Fully automated multiorgan segmentation of female pelvic magnetic resonance images with coarse-to-fine convolutional neural network. Med Phys 2021;48:7028-42. [Crossref] [PubMed]
Zabihollahy F, Viswanathan AN, Schmidt EJ, Lee J. Fully automated segmentation of clinical target volume in cervical cancer from magnetic resonance imaging with convolutional neural network. J Appl Clin Med Phys 2022;23:e13725. [Crossref] [PubMed]
Lin YC, Lin Y, Huang YL, Ho CY, Chiang HJ, Lu HY, Wang CC, Wang JJ, Ng SH, Lai CH, Lin G. Generalizable transfer learning of automated tumor segmentation from cervical cancers toward a universal model for uterine malignancies in diffusion-weighted MRI. Insights Imaging 2023;14:14. [Crossref] [PubMed]
Bnouni N, Amor HB, Rekik I, Rhim MS, Solaiman B, Amara NEB. Boosting CNN learning by ensemble image preprocessing methods for cervical cancer segmentation. 2021 18th International Multi-Conference on Systems, Signals & Devices (SSD), Monastir, Tunisia, 2021:264-9.
Yoganathan SA, Paul SN, Paloor S, Torfeh T, Chandramouli SH, Hammoud R, Al-Hammadi N. Automatic segmentation of magnetic resonance images for high-dose-rate cervical cancer brachytherapy using deep learning. Med Phys 2022;49:1571-84. [Crossref] [PubMed]
Gou S, Xu Y, Yang H, Tong N, Zhang X, Wei L, Zhao L, Zheng M, Liu W. Automated cervical tumor segmentation on MR images using multi-view feature attention network. Biomed Signal Process Control 2022;77:103832.
Huang S, Cheng Z, Lai L, Zheng W, He M, Li J, Zeng T, Huang X, Yang X. Integrating multiple MRI sequences for pelvic organs segmentation via the attention mechanism. Med Phys 2021;48:7930-45. [Crossref] [PubMed]
Wang B, Zhang Y, Wu C, Wang F. Multimodal MRI Analysis of Cervical Cancer on the Basis of Artificial Intelligence Algorithm. Contrast Media Mol Imaging 2021;2021:1673490. [Crossref] [PubMed]
Ametamey SM, Honer M, Schubiger PA. Molecular imaging with PET. Chem Rev 2008;108:1501-16. [Crossref] [PubMed]
Podoloff DA, Ball DW, Ben-Josef E, Benson AB 3rd, Cohen SJ, Coleman RE, Delbeke D, Ho M, Ilson DH, Kalemkerian GP, Lee RJ, Loeffler JS, Macapinlac HA, Morgan RJ Jr, Siegel BA, Singhal S, Tyler DS, Wong RJ. NCCN task force: clinical utility of PET in a variety of tumor types. J Natl Compr Canc Netw 2009;7:S1-26. [Crossref] [PubMed]
Hong TS, Killoran JH, Mamede M, Mamon HJ. Impact of manual and automated interpretation of fused PET/CT data on esophageal target definitions in radiation planning. Int J Radiat Oncol Biol Phys 2008;72:1612-8. [Crossref] [PubMed]
Faria SL, Menard S, Devic S, Sirois C, Souhami L, Lisbona R, Freeman CR. Impact of FDG-PET/CT on radiotherapy volume delineation in non-small-cell lung cancer and correlation of imaging stage with pathologic findings. Int J Radiat Oncol Biol Phys 2008;70:1035-8. [Crossref] [PubMed]
Ford EC, Kinahan PE, Hanlon L, Alessio A, Rajendran J, Schwartz DL, Phillips M. Tumor delineation using PET in head and neck cancers: threshold contouring and lesion volumes. Med Phys 2006;33:4280-8. [Crossref] [PubMed]
Erlich F, Camisão C, Nogueira-Rodrigues A, Altino S, Ferreira CG, Mamede M. 18F-FDG-PET-based tumor delineation in cervical cancer: threshold contouring and lesion volumes. Rev Esp Med Nucl Imagen Mol 2013;32:162-6. [Crossref] [PubMed]
Monteiro ALR, Machado AMC, Lewer MHM. A Multicriteria Method for Cervical Tumor Segmentation in Positron Emission Tomography. 2014 IEEE 27th International Symposium on Computer-Based Medical Systems, New York, NY, USA, 2014:205-8.
Belacel N. Multicriteria assignment method PROAFTN: Methodology and medical application. Eur J Oper Res 2000;125:175-83.
Baydoun A, Xu K, Bethell LA, Zhou F, Heo JU, Zhao K, Fredman ET, Ellis RJ, Qian P, Muzic RF, Traughber B. Auto-contouring FDG-PET/MR images for cervical cancer radiation therapy: An intelligent sequential approach using focally trained, shallow U-Nets. Intell Based Med 2021;5:100026.
Arbonès DR, Jensen HG, Loft A, Af Rosenschöld PM, Hansen AE, Igel C, Darkner S. Automatic FDG-PET-based tumor and metastatic lymph node segmentation in cervical cancer. Medical Imaging 2014: Image Processing. SPIE 2014;9034:1053-60.
Mu W, Chen Z, Shen W, Yang F, Liang Y, Dai R, Wu N, Tian J. A Segmentation Algorithm for Quantitative Analysis of Heterogeneous Tumors of the Cervix With ¹⁸F-FDG PET/CT. IEEE Trans Biomed Eng 2015;62:2465-79.
Mu W, Chen Z, Liang Y, Shen W, Yang F, Dai R, Wu N, Tian J. Staging of cervical cancer based on tumor heterogeneity characterized by texture features on (18)F-FDG PET images. Phys Med Biol 2015;60:5123-39.
Chen L, Shen C, Zhou Z, Maquilan G, Thomas K, Folkert MR, Albuquerque K, Wang J. Accurate segmenting of cervical tumors in PET imaging based on similarity between adjacent slices. Comput Biol Med 2018;97:30-6.
Chen L, Shen C, Zhou Z, Maquilan G, Albuquerque K, Folkert MR, Wang J. Automatic PET cervical tumor segmentation by combining deep learning and anatomic prior. Phys Med Biol 2019;64:085019.
Yang WT, Lam WW, Yu MY, Cheung TH, Metreweli C. Comparison of dynamic helical CT and dynamic MR imaging in the evaluation of pelvic lymph nodes in cervical carcinoma. AJR Am J Roentgenol 2000;175:759-66.
Liyanage SH, Roberts CA, Rockall AG. MRI and PET scans for primary staging and detection of cervical cancer recurrence. Womens Health (Lond) 2010;6:251-67; quiz 268-9.
Bourgioti C, Chatoupis K, Moulopoulos LA. Current imaging strategies for the evaluation of uterine cervical cancer. World J Radiol 2016;8:342-54.
Staring M, van der Heide UA, Klein S, Viergever MA, Pluim JP. Registration of cervical MRI using multifeature mutual information. IEEE Trans Med Imaging 2009;28:1412-21.
Weiss E, Richter S, Krauss T, Metzelthin SI, Hille A, Pradier O, Siekmeyer B, Vorwerk H, Hess CF. Conformal radiotherapy planning of cervix carcinoma: differences in the delineation of the clinical target volume. A comparison between gynaecologic and radiation oncologists. Radiother Oncol 2003;67:87-95.
Saarnak AE, Boersma M, van Bunningen BN, Wolterink R, Steggerda MJ. Inter-observer variation in delineation of bladder and rectum contours for brachytherapy of cervical cancer. Radiother Oncol 2000;56:37-42.

Cite this article as: Wang X, Feng C, Huang M, Liu S, Ma H, Yu K. Cervical cancer segmentation based on medical images: a literature review. Quant Imaging Med Surg 2024;14(7):5176-5204. doi: 10.21037/qims-24-369
