Robust thoracic CT image registration with environmental adaptability using dynamic Welsch’s function and hierarchical structure-awareness strategy
Introduction
Thoracic computed tomography (CT) image registration serves as a fundamental basis for various clinical applications, including disease progression estimation (1,2), image-guided surgery (3,4), radiation-therapy planning (5), tumor tracking (6), and real-time CT reconstruction (7-9). All these applications rely on precise and robust estimation of spatial anatomical correspondences. For instance, in the field of surgical quantitative imaging, aligning anatomical structures over time is essential for precise monitoring of disease progression and tumor radiotherapy responses. Additionally, in surgical planning, precise registration of preoperative and intraoperative images is critical for accurate navigation and targeting throughout the procedure. Registration methods can generally be classified into two categories: optimization-based and deep learning (DL)-based. DL-based methods initially dominated the field of medical image segmentation (10-13), and in recent years, with the introduction of VoxelMorph (14), they have rapidly expanded into the field of medical image registration (15-17). Although DL-based registration methods have improved efficiency, there is still room for improvement in robustness and generalization compared to optimization-based methods (18).
Within the scope of thoracic CT imaging, different anatomical structures exhibit varied image features. For instance, the air within the lungs and other internal organs possesses relatively homogeneous intensities, whereas intensity variations are observed in the blood vessels within the lungs. Therefore, selecting an appropriate and consistent dissimilarity metric is challenging.
Moreover, it is necessary to consider the different motion patterns in motion-sensitive regions. For example, during CT imaging in the supine position, the spine exhibits rigid motion, whereas other regions, such as the heart, liver, and ribs, demonstrate smooth motions (18). This further complicates the selection of an appropriate regularization method for the displacement field.
Additionally, thoracic CT images often suffer from artifacts and noise due to the inherent invisibility of internal structures and limitations in imaging devices, imaging doses, and scanning conditions. For instance, cone-beam CT (CBCT) imaging generates high noise due to low doses, and the presence of implanted metal objects in the patient’s body can cause noticeable star-shaped or streak artifacts (19). In many scenarios, it is necessary to perform registration of high-noise CT images, such as in image-guided radiation therapy, where the pre-treatment low-noise CT image needs to be registered to the low-dose acquired high-noise real-time CBCT image to clearly visualize the target anatomical structures (3,20). Therefore, these challenges pose significant obstacles to robust thoracic CT image registration. To enhance the safety and reliability of thoracic CT image registration in clinical applications, it is imperative to develop a robust registration algorithm.
Related work
For various image features in different regions, Hu et al. (21) incorporated a structural alignment constraint, specifically the branch landmarks of the vessels and airway, into the lung registration process. Liu et al. (17) further proposed a mechanism to enhance anatomical embedding for calculating global anatomical point correspondences. However, these methods often focus solely on the lung region (22), disregarding or sacrificing the registration accuracy in other regions. Therefore, He et al. (18) proposed a deformation-aware dissimilarity metric, which utilizes dissimilarity metrics tailored to the deformation characteristics of each specific region, resulting in improved global registration accuracy. However, their method does not account for noise commonly caused by low-dose CT.
For various motion patterns in different regions, the commonly used regularization methods (23) aimed at achieving smooth deformations may result in unrealistic deformations at sliding boundaries of organs. To address this issue, Schmidt-Richberg et al. (24) and Delmon et al. (25) divided the sliding boundaries of organs, allowing for discontinuous deformations. Furthermore, to explicitly constrain sliding motion, Pock et al. (26) proposed total variation (TV) regularization utilizing the -norm, assuming sliding motion takes place when the updated deformation exceeds a predetermined threshold. In order to improve registration speed, Vishnevskiy et al. (27) introduced the parametric-TV (pTV) method, which integrates TV regularization into the free-form deformation (FFD) B-spline framework to apply constraints to the control points and achieve a significant reduction in the solution space. Recently, Gong et al. (28) further enhanced the pTV method by introducing local adaptive regularization, where the level of the norm is defined based on the distance to the sliding surface. However, their dissimilarity metric is global and only applied at the coarse level, failing to account for varying degrees of sliding motion that may occur simultaneously.
To enhance the robustness of registration, most methods (29,30) employ a masking strategy to restrict the deformation within certain regions and avoid interference from other regions. Nevertheless, these approaches do not explicitly enhance robustness by attenuating noise. Recently, Zhang et al. introduced the Welsch’s function (31) into rigid point cloud alignment (32), directly mitigating the influence of noise and enhancing robustness. However, their method is only applicable to rigid point cloud alignment, which differs significantly from non-rigid volumetric image registration in terms of the construction of optimization objectives, and there has been limited research on the robustness of methods in recent volumetric image registration, especially those based on DL.
Contribution
In this work, we propose a robust CT image registration algorithm. The key idea of our approach is to leverage Welsch’s function (31) to enforce sparsity, as well as the adoption of a divide-and-conquer strategy to handle deformations in different regions. Experimental results validate the accuracy and robustness of our method.
In summary, our contributions mainly consist of:
- We devised a novel hierarchical anatomical structure-aware thoracic CT image registration scheme that specifically targets intra-patient and intra-session registrations. By leveraging different image features, we designed specific dissimilarity metrics for individual regions. Furthermore, considering various motion patterns, appropriate regularization was tailored for each region.
- We formulated an optimization problem for volumetric image registration, wherein we innovatively used the smooth Welsch’s function with sparse control capability to reconstruct the dissimilarity metric and regularization term in this task. The majorization-minimization (MM) algorithm (33) was utilized to handle the Welsch term. This algorithm constructs a surrogate function of the objective function based on the current variable values and minimizes this surrogate functions to update the variables, thereby reducing the complexity of the optimization process.
- We devised a novel dynamic parameter update scheme for the Welsch’s function to achieve image-environment-adaptive registration by adjusting tolerance for noise and sliding motion.
- Experimental results demonstrated that our proposed method achieves comparable performance to state-of-the-art methods in noise-free scenarios, while demonstrating superior robustness in the presence of noise.
Methods
Overview
As shown in Figure 1, a hierarchical strategy is employed, where different regions are merged onto the control lattice, enabling coarse-to-fine image registration through an elaborate Gaussian pyramid to cope with large-scale deformations.
Problem formulation
Let represent the domain of the CT image, where X, Y, and Z respectively denote the length, width, and depth dimensions. The objective of the registration process is to determine a three-dimensional (3D) displacement field that effectively aligns the moving image with the fixed image . The estimation of the displacement field is commonly approached as an optimization problem, given by:
where is the objective function to be minimized, is an image dissimilarity metric term, and is a displacement regularization term, whereas is the weight of regularization.
Image dissimilarity metric term
The pursuit of a globally unified dissimilarity metric often entails sacrificing local registration accuracy in exchange for overall precision (18). Inspired by (34,35), we adopted an anatomical structure-aware dissimilarity metric framework.
For the bone region, it primarily undergoes rigid displacement or smooth deformation as a whole, thus necessitating a holistic assessment (18). Normalized cross-correlation (NCC), a statistical approach for quantifying the correlation between random variables, emerges as a suitable method for evaluating skeletal regions. NCC can be expressed as:
where is the voxel position in the image, represents the intensity value of the voxel at in the image after undergoing a transformation through the displacement vector field , and and represents the average intensity of and , respectively.
For the lung region, the majority of air within the lungs in CT images exhibits homogeneity, makes it unsuitable for guiding registration. Conversely, the rich vascular structure in the lung can serve as a reliable guide for localization. The normalized gradient field (NGF), based on normalized image gradients, is independent of specific intensity values, making it well-suited for capturing the local structure of blood vessels and an appropriate metric of dissimilarity in lung region. NGF can be expressed as:
where , is the number of voxels in the image, is a small value to avoid numerical error, and are defined the same as they are in , and gradient , is the -th component of the derivative operator.
For the remaining regions, the intensity values within organs and soft tissues are mostly uniform, making these homogeneous regions unsuitable for guiding registration. However, the edges formed by the intensity differences between different soft tissues and organs can be utilized for localization guidance. Therefore, we also employed NGF as a metric of dissimilarity in this region.
Additionally, factors such as motion artifacts and metal artifacts may deteriorate the registration accuracy. Inspired by recent advancements in robust rigid point cloud alignment (32), this study introduces the Welsch’s function (31) as a robust metric to dynamically adjust sensitivity to noise at different optimization stages. The following WelschNGF (WN) formulation is derived:
where is a user-specified parameter, and
As illustrated in Figure 2A, the function exhibits a monotonically increasing behavior on the interval , while being upper-bounded by 1. Moreover, as , Eq. [5] approaches the -norm. Consequently, it can penalize dissimilarity without being overly sensitive to large dissimilarities caused by outliers. Furthermore, compared to the original form, this formulation also distinguishes cases where the angle between the gradients of corresponding point intensities exceeds . Additionally, as demonstrated in the “Numerical optimization” section later, an efficient optimization approach can be designed for the Welsch term.
Displacement field regularization term
Similar to the “Image dissimilarity metric term” section, we also employ anatomical structure-aware regularization to obtain plausible deformations.
For the spine region, due to the minimal movement of the spine, direct penalty is imposed on the displacement itself. Specifically, parameter non-gradient regularization (pNG) based on the convex squared -norm is employed to minimize spinal displacement.
where represents the control lattice on the image, is the total number of control points. corresponds to the displacement vector at .
For the remaining region, the difference between displacement vectors of adjacent control points are typically penalized using the -th power of the -norm. In this study, we incorporate the smooth Welsch’s function into the regularization term to enhance robustness,
where is a user-specified parameter.
Numerical optimization
Efficient optimization of terms utilizing Welsch’ function can be achieved using the MM algorithm (33). Specifically, given the current variable values during iteration, the MM algorithm can construct a surrogate function for the objective function such that
Subsequently, the variables undergo updates through minimizing the surrogate function
This ensures a reduction of the objective function during each iteration, as Eq. [8] and Eq. [9] imply that .
Therefore, the numerical optimization is assured to attain a local minimum (33).
Surrogate function
At point , a convex quadratic surrogate function can be formulated for the Welsch’s function (36) (see Figure 2B):
The function bounds Welsch’s function from above, and the graphs of two functions intersect at . A readily optimized convex quadratic surrogate term is obtained by applying it to in Eq. [7]:
Note that in Eq. [11], certain constant terms that do not affect the optimization were disregarded. Similarly, by applying Eq. [11] to in Eq. [4] and disregarding constant terms, a surrogate term can be obtained:
In the “Ablations studies” section, we present a comparative analysis between this surrogate function and the original function to evaluate its potential in reducing optimization complexity.
Numerical minimization
To address the displacement field folding issue, we propose incorporating dissimilarity metrics and regularization for different regions into a unified objective function by summing up the individual terms. Next, the objective function is minimized using the AMSGrad optimizer (37).
Dynamic update strategy
The parameters and play crucial roles in the robustness of our method. The surrogate functions in Eq. [10] and Eq. [11] indicate that we are minimizing a quadratic error function, where the weights of the WN term and Welsch regularization (WR) term dynamically update as Gaussian functions of the current loss norms, with standard deviations and , respectively. On the one hand, and should be small enough in the final stage of optimization, enabling the Gaussian weights to proficiently attenuate the influence of larger errors caused by noise. On the other hand, fixing and at small values would reduce the effectiveness of optimization because many corresponding points have large loss at the beginning and are disregarded due to their tiny Gaussian weights.
Based on this, we gradually decrease and during the optimization. Specifically, for each iteration of optimization that yields , we set to the mean NGF dissimilarity of all corresponding points transformed by for the next iteration. According to the well-known 3- principle, when the NGF spatial distance is less than , the weight is larger, allowing for a sufficient number of terms in each optimization step, whereas when it exceeds , the weight becomes small enough to exclude noisy corresponding points from registration. As the optimization progresses and the average dissimilarity gradually decreases, also decreases. For , when it is in the first level of the Gaussian pyramid, the number of control points is limited due to the low resolution. Strict control of sliding is necessary to prevent error propagation and accumulation between levels. We set the initial value to , where represents the lattice spacing and represents the average voxel spacing. Since the largest sliding occurs at the lung boundary (18), as shown in Figure 3, where the cubes represent voxels, we assume that the lung boundary (blue) can shift by up to 100 mm along each axis parallel to the sliding plane, while the spine (gray) remains stationary. Consequently, the -norm gradient penalty of the displacement vector of the boundary voxel (i.e., the middle voxel) is . Thus, the initial value of is sufficient to strictly penalize sliding. In the subsequent levels of the Gaussian pyramid, is first updated with the new voxel spacing and then reduced to a quarter of its own value to penalize less sliding.
Thoracic anatomical structure segmentation
Segmentation for dissimilarity metric
The segmentation of lung region (blue), bone region (red), and the remaining region (gray) is performed (see Figure 4A). Specifically, for the lung region, we employ a model based on U-Net (38) proposed by Hofmanninger et al. (39) for automatic segmentation, ensuring that all parameters align with the original open-source code provided by their work. The hyperparameters, architecture, and training methods of the model can be referenced in (39). This model was trained on a large and diverse dataset encompassing extensive visual variations. For the bone region, due to its higher intensity compared to other areas, we can obtain the bone mask rapidly through threshold-based segmentation. Specifically, binarization is performed using a threshold of 1,200 Hounsfield units (HU), followed by applying morphological closing operations. Additionally, small fragments are removed by eliminating connected regions with areas smaller than 10. Finally, the remaining region is obtained by subtracting the lung and bone masks from the entire image.
Segmentation for displacement regularization
As shown in Figure 4B, the segmentation of spine region (blue) and the remaining region (gray and red) is performed for applying different regularization to these regions (see the “Problem formulation” section). To begin with, a morphological dilation operation is applied to the lung mask to obtain the expanded lung (red). Subsequently, the bone and the expanded lung undergo Boolean operations, followed by extraction of the largest connected subregion. This subregion is then merged with the central region of the original bone mask, yielding the generation of a spine mask. Finally, the remaining region is obtained by subtracting the spine mask from the entire image.
Hierarchical strategy
The thoracic CT images exhibit large-scale deformation, and direct registration may lead to undesired local minima. Therefore, as illustrated in Figure 1, based on the given number of levels, we embed the dissimilarity metric and regularization masks into a Gaussian pyramid.
Figure 5 presents the pseudocode of our robust CT image registration algorithm, where L denotes the number of levels in the Gaussian pyramid, and are parameters of the Welsch function in the real-time updated dissimilarity metrics for the lung and remaining regions, and and are binary masks for the spine and remaining regions.
Results
Dataset preparation
Conventional dataset
Experiments were conducted on the publicly available datasets deformable image registration lab four-dimensional CT (DIR-Lab 4DCT) (40) and chronic obstructive pulmonary disease (COPD) (41). Each dataset consisted of 10 cases, with 300 anatomical landmarks per phase. Registration was performed between the end-expiration and end-inspiration phases, and the average TRE was calculated for the 300 anatomical landmarks (ground-truth label). Since landmarks only exist within the lungs, although our method also models registration of bony anatomy outside the lungs and deformation of soft tissues, we evaluated the overall registration performance using root mean square error (RMSE), calculated as follows:
Compared to the 4DCT dataset, the COPD dataset exhibited larger motion range (see Figure 6).
Dataset with noise
Registration robustness is crucial as there are many scenarios where CT images need to be registered to lower-resolution and noisier images, such as in image-guided radiation therapy when preoperative CT images are registered to intraoperative CBCT images. To evaluate the robustness of our method, we applied the algorithm proposed in (19) to add simulated metal artifacts and Poisson noise to the 4DCT and COPD datasets. These two types of noise are the most common in CT imaging and are simulated in various studies involving low-dose CT (19,42). The variance of Poisson noise is set to 12, and for the detailed generation process and parameters of metal artifacts, please refer to (19). An example of noise addition is shown in Figure 7, where Figure 7A represents a slice image of COPD case 10, and Figure 7B is the image after adding noise. In the robustness validation experiment, the original datasets were registered to the noisy datasets. Furthermore, the TRE results on the 4DCT dataset with noise will be tested by reducing resolution and employing a Gaussian filter, used respectively for validating the method’s effectiveness on low-resolution images and comparing with denoising followed by registration. As shown in Figure 7, Figure 7C depicts the outcome of Figure 7B being reduced to one-fourth of the original resolution, whereas Figure 7D shows the result of Figure 7B being subjected to a Gaussian filter with a kernel size of 11.
Implementation details
The entire registration procedure was implemented based on the medical image registration framework provided by AirLab (43). All images were resampled to isotropic 1×1×1 mm3 voxel spacing and subsequently cropped to eliminate redundant background. The image intensities were clipped between 0 and 2,000 HU for handling noise points, then converted to double precision and scaled to the range [0, 1]. Due to the higher intensity values of the images in the dataset compared to typical CT images, the upper and lower bounds of our clip operation were correspondingly increased. Bilinear interpolation was employed for image deformation. The dissimilarity metric weights for lung, bone, and the remaining region were set to 2.5, 0.03, and 0.03, respectively, whereas the displacement field regularization weights for the spine and the remaining region were set to 0.04 and 0.04, respectively. A five-level Gaussian pyramid with an upsampling factor of 0.5 was utilized. The spacing for control points was set to 7×7×7 voxels. The AMSGrad optimizer (37) was employed to optimize the objective function. The initial step size of the optimizer was set to 0.005, and the iteration counts for the five Gaussian pyramid levels were 800, 800, 800, 100, and 50, respectively. The convergence threshold was set to 10−5. Using the PyTorch 1.8.1+cu111 library (https://pytorch.org/) and running on an NVIDIA RTX 3090 GPU (NVIDIA, Santa Clara, CA, USA), the average registration time was 96 seconds.
Parameter sensitivity
As shown in Figure 8, the sensitivities of dissimilarity metric weights, regularization weights, Gaussian pyramid levels, and control point lattice spacing were tested on the 4DCT DIR dataset with 300 landmarks.
Following (18), in the initial setup, the weights for the bone region and the remaining regions in the dissimilarity metric were both assigned a value of 0.03. The regularization weight for each region was set to 0.04, the number of levels in the Gaussian pyramid was 5, and the lattice spacing was 7. Following this, each hyperparameter was individually optimized, and the best values were then fixed for the subsequent hyperparameter search. First, the weight for the dissimilarity metric in the lung region was then tested within the range of [0.25, 4]. The blue curve in Figure 8 shows that initially, a larger weight led to better TRE, but after reaching 2.5, the TRE slightly increased. Therefore, a weight of 2.5 was set for the dissimilarity metric in the lung region. Next, the weights for the regularization in different regions were tested within the range of [0.001, 0.07], following (18) where the weights were set to the same value. The red curve in Figure 8 demonstrates that a weight of 0.04 yielded good results. Therefore, a weight of 0.04 was chosen for the regularization in this study. Then, the green curve in Figure 8 shows that initially, TRE decreased as the number of Gaussian pyramid levels increased, but after reaching five levels, TRE became insensitive to further increases in the number of levels. At this point, the computational cost significantly increased without a significant improvement in TRE. To enhance registration efficiency, five levels were selected. Furthermore, as depicted by the yellow curve in Figure 8, both too few and too many control points could lead to a decrease in registration accuracy. Therefore, this study opted for a relatively optimal control point lattice spacing of seven voxels. Note that we only optimized the hyperparameters on the publicly available dataset to ensure fairness in the comparisons on the non-public datasets.
Comparison methods
For the conventional dataset, we compared our algorithm with state-of-the-art methods, including MJ-CNN (44), One-Shot (45), LRN (46), Hering2021 (47), GRN (22), He (18), LapIRN (48), FlowNet (49), PDDN (50), and VM (11). Most of the comparison methods only reported results on the 4DCT dataset and did not provide results on the COPD dataset, so not all methods were compared on the COPD dataset.
None of the compared methods reported results on datasets with noise. Therefore, for the noisy dataset, to compare with state-of-the-art methods, two optimization-based methods [pTVReg (27) and He (18)] and one DL-based method [GRN (22)] were reproduced using their respective open-source codes [Pytorch 1.8.1+cu111 for He and GRN, Matlab R2021a (MathWorks, Natick, MA, USA) for pTVReg] for fair comparison. DL-based registration models often have limited generalization capabilities (18). To ensure fairness, we conducted unsupervised training of the model using a combination of conventional and noisy mixed datasets, which took around 7 hours on our graphics card.
Ablations studies
NoWelsch
NoWelsch refers to not using the Welsch’s function. In this case, the square of the modified NGF {as modified in Eq. [4]} is used for dissimilarity metric in non-bone regions, and the squared -norm is used for regularization in non-bone regions. NoWelsch is also equivalent to fixing the Welsch parameters and at larger values.
WR
Compared to NoWelsch, the regularization for non-bone regions was replaced with the WR in Eq. [7] to evaluate the effect of WR.
Surrogate WR (SWR)
Compared to WR, the regularization for non-bone region was replaced with the surrogate function in Eq. [11], to evaluate the effect of using a surrogate function in SWR.
WN + SWR
Compared to SWR, the dissimilarity metric in non-bone region was replaced with the WN in Eq. [4] to verify the effectiveness of the WN dissimilarity metric.
Ours
The method was ultimately employed in this work. Compared to WN + SWR, the dissimilarity metric in the non-bone region was replaced with the surrogate function in in Eq. [12] to evaluate the effect of using a surrogate function in WN dissimilarity metric.
Experimental results
TRE
Table 1 presents the TRE on the conventional dataset. In the 4DCT dataset, our method achieved the best average TRE of 1.14 mm and standard deviation of 0.11 mm. In the COPD dataset, our method achieved the second-best average TRE of 1.69 mm and standard deviation of 0.42 mm, and outperformed the recent optimization-based registration method (He with TRE of 2.05 mm and standard deviation of 0.99 mm). The DL-based registration method, GRN, achieved the best TRE (1.35 mm) on the COPD dataset, yet performed poorly on the noisy dataset (see Table 2).
Table 1
Case | Init. | MJ-CNN (44) | One-Shot (45) | LRN (46) | Hering 2021 (47) | GRN (22) | He (18) | Lap IRN (48) | FE+ (49) | PD DN (50) | VM (11) | NoWelsch | WR | SWR | WN + SWR | Ours |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4DCT | ||||||||||||||||
01 | 3.89 | 1.20 | 1.21 | 0.98 | 0.99 | 0.86† | 0.97 | 1.00 | 2.20 | 0.90‡ | 1.46 | 1.05 | 1.03 | 1.05 | 1.05 | 1.04 |
02 | 4.34 | 1.13 | 1.13 | 0.98 | 0.98 | 0.90† | 0.91‡ | 1.28 | 3.89 | 0.91‡ | 1.51 | 0.99 | 0.99 | 1.01 | 1.00 | 0.98 |
03 | 6.94 | 1.30 | 1.32 | 1.14 | 1.11 | 1.06 | 1.04‡ | 2.18 | 2.71 | 1.06 | 2.31 | 1.11 | 1.12 | 1.02† | 1.04 | 1.05 |
04 | 9.83 | 1.55 | 1.84 | 1.39 | 1.37 | 1.45 | 1.37 | 3.05 | 2.95 | 1.66 | 2.72 | 1.41 | 1.40 | 1.31 | 1.30‡ | 1.29† |
05 | 7.48 | 1.72 | 1.80 | 1.43 | 1.32‡ | 1.60 | 1.36 | 2.36 | 3.03 | 1.68 | 2.69 | 1.38 | 1.35 | 1.35 | 1.34 | 1.31† |
06 | 10.89 | 2.02 | 2.30 | 2.26 | 1.15† | 1.59 | 1.16‡ | 1.78 | 3.36 | 1.86 | 3.07 | 1.18 | 1.18 | 1.15† | 1.20 | 1.16‡ |
07 | 11.03 | 1.70 | 1.91 | 1.42 | 1.05† | 1.74 | 1.18 | 2.24 | 3.10 | 1.94 | 3.01 | 1.15‡ | 1.15‡ | 1.15‡ | 1.15‡ | 1.16 |
08 | 14.99 | 2.64 | 3.47 | 3.13 | 1.22 | 1.46 | 1.20‡ | 2.24 | 2.94 | 1.79 | 6.22 | 1.25 | 1.27 | 1.23 | 1.21 | 1.19† |
09 | 7.92 | 1.51 | 1.47 | 1.27 | 1.11† | 1.58 | 1.19 | 2.26 | 2.86 | 1.94 | 2.94 | 1.20 | 1.17 | 1.20 | 1.15‡ | 1.15‡ |
10 | 7.30 | 1.79 | 1.79 | 1.93 | 1.05† | 1.71 | 1.07 | 1.90 | 2.99 | 2.03 | 3.00 | 1.16 | 1.15 | 1.08 | 1.08 | 1.06 |
Avg | 8.46 | 1.66 | 1.83 | 1.59 | 1.14† | 1.40 | 1.14† | 2.03 | 3.00 | 1.57 | 2.89 | 1.19 | 1.18 | 1.16 | 1.15‡ | 1.14† |
Std | 3.33 | 0.42 | 0.65 | 0.64 | 0.13 | 0.31 | 0.14 | 1.25 | 0.55 | 0.41 | 0.42 | 0.13 | 0.12‡ | 0.11† | 0.11† | 0.11† |
COPD | ||||||||||||||||
01 | 26.33 | / | / | / | / | 1.38 | 1.12† | 6.85 | 4.89 | 2.57 | 9.95 | 1.39 | 1.35‡ | 1.36 | 1.39 | 1.39 |
02 | 21.79 | / | / | / | / | 2.09† | 3.20 | 6.90 | 7.30 | 4.01 | 9.96 | 3.75 | 3.30 | 3.25 | 2.50‡ | 2.48 |
03 | 12.64 | / | / | / | / | 1.22‡ | 1.20† | 1.51 | 2.89 | 1.46 | 4.41 | 1.34 | 1.32 | 1.30 | 1.31 | 1.30 |
04 | 29.58 | / | / | / | / | 1.58† | 2.07 | 6.38 | 5.46 | 2.19 | 7.08 | 1.88 | 1.93 | 1.91 | 1.89 | 1.86‡ |
05 | 30.08 | / | / | / | / | 1.37† | 1.50‡ | 6.81 | 5.19 | 2.22 | 9.19 | 1.94 | 1.86 | 1.89 | 1.65 | 1.63 |
06 | 28.46 | / | / | / | / | 1.10† | 2.24 | 4.19 | 5.53 | 1.89 | 8.12 | 1.54 | 1.54 | 1.55 | 1.50 | 1.49‡ |
07 | 21.60 | / | / | / | / | 1.19† | 1.22 | 2.73 | 4.40 | 1.62 | 7.10 | 1.25 | 1.21‡ | 1.25 | 1.22 | 1.19† |
08 | 26.46 | / | / | / | / | 1.19† | 3.77 | 4.32 | 3.94 | 1.72 | 7.92 | 2.46 | 2.31 | 1.74 | 1.55 | 1.52‡ |
09 | 14.89 | / | / | / | / | 0.99† | 1.11‡ | 3.60 | 3.57 | 1.51 | 6.93 | 2.42 | 2.31 | 2.30 | 2.28 | 2.27 |
10 | 21.81 | / | / | / | / | 1.38† | 3.03 | 6.59 | 4.44 | 2.43 | 9.16 | 5.36 | 4.64 | 3.86 | 1.81 | 1.80‡ |
Avg | 23.36 | / | / | / | / | 1.35† | 2.05 | 7.98 | 4.99 | 4.76 | 2.16 | 2.33 | 2.18 | 2.04 | 1.71 | 1.69‡ |
Std | 5.99 | / | / | / | / | 0.31† | 0.99 | 1.70 | 1.98 | 1.23 | 0.76 | 1.30 | 1.07 | 0.87 | 0.42‡ | 0.42‡ |
Metrics of the comparison methods were taken from the literature. †, the best; ‡, the second best. “/” indicates the results on COPD dataset were not reported in their literature. TRE, target registration error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; init., initial result; WR, Welsch regularization; SWR, surrogate Welsch regularization; WN, WelschNGF; Avg, average value; Std, standard deviation.
Table 2
Case | GRN (22) | pTVReg (27) | He (18) | NoWelsch | WR | SWR | WN + SWR | Ours |
---|---|---|---|---|---|---|---|---|
N4 | ||||||||
01 | 2.41 | 2.06 | 1.45 | 1.45 | 1.45 | 1.42‡ | 1.40† | 1.40† |
02 | 2.05 | 1.98 | 1.43‡ | 1.47 | 1.52 | 1.40† | 1.40† | 1.40† |
03 | 3.65 | 2.96 | 1.69 | 1.66 | 1.65 | 1.62‡ | 1.62‡ | 1.60† |
04 | 5.76 | 1.87 | 1.81 | 1.88 | 1.83 | 1.79 | 1.74† | 1.78‡ |
05 | 4.33 | 4.67 | 2.11 | 2.08 | 1.99 | 1.98‡ | 1.98‡ | 1.89† |
06 | 9.01 | 2.19 | 2.37 | 2.28 | 2.05 | 2.04 | 2.01‡ | 1.80† |
07 | 8.62 | 3.90 | 2.36 | 2.31 | 2.16 | 2.13 | 2.08‡ | 2.02† |
08 | 11.10 | 4.21 | 2.33 | 2.40 | 2.30 | 2.21‡ | 2.20† | 2.26 |
09 | 6.30 | 1.89 | 2.28 | 2.20 | 1.85‡ | 1.97 | 2.02 | 1.74† |
10 | 5.51 | 3.98 | 2.21 | 2.24 | 2.00 | 2.02 | 1.99‡ | 1.89† |
Avg | 5.87 | 2.97 | 2.00 | 2.00 | 1.88 | 1.86 | 1.84‡ | 1.78† |
Std | 2.96 | 1.11 | 0.38 | 0.36 | 0.28‡ | 0.29 | 0.29 | 0.27† |
NC | ||||||||
01 | 18.50 | 5.70 | 2.49 | 2.25 | 2.04‡ | 2.15 | 2.05 | 1.97† |
02 | 14.43 | 3.51 | 4.89 | 4.37 | 3.99 | 3.99 | 3.30‡ | 3.05† |
03 | 7.92 | 2.28 | 2.22 | 2.16 | 1.99‡ | 2.00 | 1.99‡ | 1.98† |
04 | 17.45 | 3.18 | 3.63 | 2.59 | 2.51 | 2.49 | 2.38‡ | 2.37† |
05 | 16.60 | 2.61 | 2.81 | 2.67 | 2.80 | 2.57 | 2.38‡ | 2.37† |
06 | 19.53 | 3.94 | 2.77 | 2.51 | 2.51 | 2.45‡ | 2.24† | 2.24† |
07 | 14.46 | 2.06 | 2.08 | 2.19 | 2.01 | 2.07 | 1.81† | 1.89‡ |
08 | 15.87 | 2.81 | 3.51 | 3.48 | 2.67 | 3.10 | 2.28‡ | 2.26† |
09 | 7.94 | 3.63 | 3.41 | 3.22 | 3.15 | 3.13 | 2.96‡ | 2.88† |
10 | 13.93 | 4.47 | 5.27 | 5.87 | 4.28 | 3.33 | 2.78‡ | 2.75† |
Avg | 14.66 | 3.42 | 3.31 | 3.13 | 2.80 | 2.73 | 2.42‡ | 2.38† |
Std | 3.98 | 1.10 | 1.08 | 1.19 | 0.80 | 0.64 | 0.46‡ | 0.40† |
Metrics of the comparison methods were obtained from local experiments. The initial landmark distance is the same as in Table 1. †, the best; ‡, the second best. TRE, target registration error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; WR, Welsch regularization; SWR, surrogate Welsch regularization; WN, WelschNGF; N4, noisy 4DCT; Avg, average value; Std, standard deviation; NC, noisy COPD.
Table 2 presents the TRE on the noisy dataset. The results demonstrate that our method achieved the best average TRE and standard deviation on both the noisy 4DCT and noisy COPD datasets, validating the robustness of our approach. Among the optimization-based methods (our method, He, and pTVReg), He employs similar anatomical-aware dissimilarity metrics and regularization as in this study, whereas pTVReg adopts a global strategy. The results indicate that He outperformed pTVReg, highlighting the positive effect of anatomical awareness on robustness. Furthermore, our method outperformed He, confirming the effectiveness of the Welsch’s function. Although GRN achieved good results on the conventional COPD dataset, it exhibited larger errors on the noisy dataset. To analyze the reasons, we plotted the loss curve during training for GRN, as shown in Figure 9. It can be observed that the curve fluctuated significantly and continuously, possibly due to the failure of feature point extraction in the presence of excessive noise and the absence of a Gaussian pyramid strategy to prevent registration from getting trapped in undesired local minima. Furthermore, the generalization of DL-based methods still needs improvement, requiring more data, longer training time, and better techniques. In contrast, optimization-based methods exhibited more stable results, likely due to their independence from the dataset and typically higher interpretability and controllability.
In ablation studies, we observed that, for each dataset, the average TRE results were as follows: NoWelsch > WR > SWR > WN + SWR > Ours. Analyzing the noisy COPD dataset with significant displacements, after applying Welsch’s function to the regularization term (NoWelsch → WR), the TRE reduced by ~11%. After using surrogate function for WR (WR → SWR), the TRE decreased by ~3%. After applying Welsch’s function to the NGF dissimilarity metric (SWR → WN + SWR), the TRE decreased by ~11%. After using surrogate function for the WN dissimilarity metric (WN + SWR → Ours), the TRE decreased by ~2%. Consistent with our expectations, both Welsch’s function with parameter dynamic updating mechanism and surrogate functions were effective in reducing the TRE in each case. Furthermore, Welsch’s function reduced the TRE more effectively than surrogate functions because Welsch’s function primarily enhances robustness, whereas surrogate functions focus on reducing the optimization complexity. Figure 10 (first and second columns) present the interpolated heatmaps of TRE comparisons between our method (top row) and the NoWelsch method (bottom row) in the noisy COPD case 10. It can be visually observed that our method achieved better TRE in the lower right corner of the axial slice and the upper right corner of the coronal slice.
RMSE
Table 3 presents the RMSE results on both the Traditional and noisy 4DCT and COPD datasets. On average, our method achieved the best performance on the traditional 4DCT dataset and the second-best performance on the traditional COPD dataset, while obtaining the best results on both the noisy 4DCT and noisy COPD datasets. This indicates that our method performs comparably to state-of-the-art methods in terms of RMSE when noise is absent, and demonstrates excellent robustness in the presence of noise. This aligns with the TRE results, which evaluate the accuracy of lung registration, suggesting that our method effectively models the image characteristics and motion patterns of the entire image. Furthermore, consistent with the TRE results, the optimization-based methods (pTVReg and He) exhibited higher stability across different noise levels compared to the DL-based method (GRN).
Table 3
Case | Without noise | With noise | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Init. | GRN (22) | pTVReg (27) | He (18) | Ours | Init. | GRN (22) | pTVReg (27) | He (18) | Ours | ||
4DCT | |||||||||||
01 | 48.61 | 15.87† | 19.24 | 18.36 | 17.82‡ | 108.31 | 61.23 | 56.72 | 52.99‡ | 52.85† | |
02 | 55.25 | 19.87† | 25.22 | 22.34 | 22.48‡ | 120.67 | 64.52 | 63.56 | 58.72† | 59.83‡ | |
03 | 59.32 | 25.85 | 25.78 | 21.15‡ | 21.03† | 115.95 | 78.37 | 73.80 | 64.15‡ | 62.32† | |
04 | 55.81 | 30.85 | 28.39 | 27.65† | 28.14‡ | 119.18 | 85.52 | 71.26 | 69.01‡ | 68.12† | |
05 | 51.56 | 27.98 | 24.36‡ | 24.47 | 24.24† | 108.91 | 74.25 | 72.12 | 63.22‡ | 61.25† | |
06 | 43.44 | 30.65 | 27.92 | 26.78‡ | 25.98† | 94.30 | 84.94 | 69.58‡ | 71.55 | 67.65† | |
07 | 41.61 | 29.89 | 28.47 | 26.55† | 26.58‡ | 93.96 | 85.59 | 77.00 | 72.30‡ | 65.50† | |
08 | 48.48 | 29.88 | 23.69 | 21.89† | 22.04‡ | 94.91 | 89.03 | 78.81 | 69.73‡ | 59.90† | |
09 | 34.87 | 21.78 | 21.25 | 17.78† | 17.87‡ | 89.94 | 78.28 | 55.24‡ | 57.63 | 52.14† | |
10 | 36.81 | 17.89 | 19.69 | 14.83‡ | 14.37† | 91.61 | 73.58 | 64.29 | 54.68‡ | 49.23† | |
Avg | 47.58 | 25.05 | 24.40 | 22.18‡ | 22.06† | 103.77 | 77.53 | 68.24 | 63.40‡ | 59.88† | |
Std | 8.25 | 5.72 | 3.44 | 4.28† | 4.39‡ | 12.12 | 9.30 | 8.06 | 7.13‡ | 6.59† | |
COPD | |||||||||||
01 | 100.43 | 21.44 | 24.78 | 20.89† | 21.15‡ | 139.24 | 103.65 | 79.80 | 71.08‡ | 69.72† | |
02 | 99.64 | 29.37† | 39.23 | 36.57 | 34.25‡ | 152.92 | 124.54 | 81.14† | 87.59 | 81.58‡ | |
03 | 77.93 | 18.78† | 24.05 | 19.05‡ | 19.19 | 121.05 | 98.06 | 54.16 | 53.98‡ | 51.58† | |
04 | 107.41 | 25.12† | 31.65 | 28.32 | 26.87‡ | 142.51 | 120.83 | 71.82‡ | 73.54 | 62.13† | |
05 | 98.83 | 23.87† | 32.98 | 25.98‡ | 27.65 | 146.85 | 118.36 | 69.53‡ | 69.85 | 67.70† | |
06 | 98.04 | 20.78† | 26.85 | 26.11 | 22.54‡ | 132.67 | 104.32 | 83.15 | 78.63‡ | 67.62† | |
07 | 91.97 | 19.89‡ | 19.93 | 20.33 | 19.24† | 129.02 | 94.71 | 65.58 | 65.11‡ | 62.33† | |
08 | 104.34 | 23.87† | 35.51 | 39.78 | 25.78‡ | 144.22 | 117.73 | 81.29‡ | 84.35 | 71.56† | |
09 | 81.36 | 19.23† | 20.54‡ | 21.08 | 27.87 | 127.18 | 93.58 | 75.86 | 72.39‡ | 69.21† | |
10 | 97.71 | 25.37† | 34.15 | 33.12 | 28.78‡ | 145.42 | 120.02 | 77.59‡ | 81.87 | 70.02† | |
Avg | 95.77 | 22.77† | 28.97 | 27.12 | 25.33‡ | 138.11 | 109.58 | 73.99 | 73.84‡ | 67.35† | |
Std | 9.45 | 3.35† | 6.64 | 7.27 | 4.78‡ | 10.16 | 11.90 | 8.99‡ | 9.91 | 7.73† |
†, the best; ‡, the second best. RMSE, root mean square error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; init., initial result; Avg, average value; Std, standard deviation.
Lower resolution validation
Table 4 presents the TRE results of each method after reducing resolution on the noisy 4DCT dataset. Compared to Table 2, there is no significant decrease in accuracy for any of the methods. This might be attributed to GRN utilizing sparse keypoint control for global displacement fields, whereas pTVReg, He, and our method employ parameterized control lattice for global displacement fields. Both of these approaches are capable of preventing registration from getting stuck in local minima, effectively conducting a search for displacement fields in a downsampled space. The impact of resolution reduction on accuracy is thus limited. Consequently, our method can also be extended to registration at lower resolutions.
Table 4
Case | Init. | Lower resolution | Gaussian filter | |||||||
---|---|---|---|---|---|---|---|---|---|---|
GRN (22) | ptvReg (27) | He (18) | Ours | GRN (22) | ptvReg (27) | He (18) | Ours | |||
4DCT 01 | 3.89 | 2.42 | 2.13 | 1.46 | 1.42† | 2.40 | 2.04 | 1.44 | 1.41† | |
4DCT 02 | 4.34 | 2.16 | 2.01 | 1.44 | 1.43† | 2.04 | 1.97 | 1.43† | 1.44 | |
4DCT 03 | 6.94 | 3.64 | 3.03 | 1.68 | 1.62† | 3.64 | 2.93 | 1.65 | 1.61† | |
4DCT 04 | 9.83 | 5.71 | 1.96 | 1.85 | 1.81† | 5.72 | 1.87 | 1.81 | 1.78† | |
4DCT 05 | 7.48 | 4.35 | 4.73 | 2.21 | 1.96† | 4.30 | 4.68 | 2.10 | 1.90† | |
4DCT 06 | 10.89 | 9.02 | 2.17 | 2.38 | 1.92† | 9.06 | 2.18 | 2.38 | 1.79† | |
4DCT 07 | 11.03 | 8.54 | 3.94 | 2.43 | 2.14† | 8.48 | 3.88 | 2.34 | 2.03† | |
4DCT 08 | 14.99 | 11.27 | 4.25 | 2.39 | 2.27† | 11.21 | 4.18 | 2.33 | 2.28† | |
4DCT 09 | 7.92 | 6.16 | 1.92 | 2.31 | 1.79† | 6.31 | 1.93 | 2.27 | 1.75† | |
4DCT 10 | 7.30 | 5.73 | 3.92 | 2.22 | 1.91† | 5.49 | 3.96 | 2.19 | 1.91† | |
Avg | 8.46 | 5.90 | 3.01 | 2.04 | 1.83† | 5.86 | 2.96 | 1.99 | 1.79† | |
Std | 3.33 | 2.97 | 1.10 | 0.39 | 0.28† | 2.98 | 1.10 | 0.38 | 0.26† |
†, the best. TRE, target registration error; 4DCT, four-dimensional computed tomography; init., initial result; Avg, average value; Std, standard deviation.
Comparison with registration after denoising
Table 4 also presents the TRE results of different methods on the noisy 4DCT dataset after denoising with Gaussian filtering before registration. Similar to lower resolution validation, there is no significant improvement in results compared to Table 2. This lack of improvement could be attributed to Gaussian filtering, which, while filtering out high-frequency noise, also removes many image details and cannot eliminate high-intensity non-high-frequency metal artifacts. This leads to denoising effects similar to resolution reduction, as visually shown in Figure 7. Therefore, even when the comparison methods denoise with Gaussian filtering before registration, their accuracy falls short of our method that addresses outliers directly during the optimization process. Another approach for the comparison methods could involve using DL-based CT artifact removal on the noisy images before registration, but this goes beyond the scope of this study.
Displacement field
The third and fourth columns in Figure 10 illustrate the displacement field grids and RGB images of our method (top row) and NoWelsch (bottom row). It can be observed that the displacement field grid of our method is more reasonable compared to NoWelsch. For instance, NoWelsch exhibited several distortions in the left lung, whereas our method exhibited smoother results. In the corresponding RGB image, NoWelsch showed more localized colors. Additionally, the heatmaps in the second column indicates that our method achieved better TRE in the left lung. Figure 11 illustrates the displacement field of our method (Figure 11A,11C) and NoWelsch (Figure 11B,11D) using arrows. In the first comparison group, the dashed boxes in Figure 11A,11B show that our method achieved smoother transitions when the displacement field changed from right to left compared to NoWelsch. Conversely, in the second comparison group, the dashed boxes in Figure 11C,11D demonstrate that our method achieved sharper transitions when the displacement field changes from bottom to top compared to NoWelsch, which adopts a larger height range for the transition. These two comparison groups correspond to the first and second columns of Figure 10, respectively, and our method consistently achieved better TRE. This was achieved by regularizing the displacement field using the Welsch’s function and gradually reducing the Welsch parameter at each level to allow more differences in adjacent displacement vectors. Furthermore, the abnormal displacements in NoWelsch may be related to noise. Due to the excessive penalty of the squared -norm, NoWelsch may produce unrealistic displacement vectors; especially when a normal point corresponds to a correct noise point, the large penalty forces the normal point to move to a wrong target point. In contrast, our approach applies the Welsch’s function to the distance in NGF space and dynamically adjusts the Welsch parameter to gradually reduce sensitivity to noise.
Loss curve
Figure 12 illustrates the loss curves in the ablation studies to investigate the role of surrogate functions in minimizing the objective function. The optimization process was repeated five times for the challenging noisy COPD case 10. To facilitate comparison, the average loss curves of four methods with the same initial point in the first level are shown, and the loss range is transformed into the TRE interval. In Figure 12B, SWR exhibited higher loss reduction and lower standard deviation than WR, confirming the positive effect of employing surrogate functions for WR. In Figure 12C, although Ours had a slightly higher standard deviation than WN + SWR, it showed a significantly faster descent rate, indicating that using surrogate functions for WN can enhance optimization efficiency, possibly due to the increased local convexity of the objective function. Therefore, consistent with the expected results, employing surrogate functions for both WR and WN dissimilarity metric exerts a positive impact on optimization.
Performance analysis
Table 5 presents a comparative analysis of the performance of the replicated three methods and our method using the same dataset (noisy 4DCT) and computational configuration. Although our method exhibits higher graphics processing unit (GPU) resource consumption compared to the other methods, they all can run on a modest 16 GB GPU, ensuring scalability. Additionally, although DL-based methods demonstrate significant advantages in registration speed, they entail long training times and limited generalizability due to potential issues with dataset distribution.
Table 5
Performance metrics | DL-based | Optimization-based | |||
---|---|---|---|---|---|
GRN (22) | ptvReg (27) | He (18) | Ours | ||
Computational configuration | NVIDIA RTX 3090 (24 GB) |
NVIDIA RTX 3090 (24 GB) |
NVIDIA RTX 3090 (24 GB) |
NVIDIA RTX 3090 (24 GB) |
|
GPU memory consumption | 11.25 GB | 6.14 GB | 10.63 GB | 14.92 GB | |
Training time | 7 hours | 0 s | 0 s | 0 s | |
Registration time | 2 s | 113 s | 61 s | 96 s |
DL, deep learning; GPU, graphics processing unit.
Discussion
In this work, a robust image-environment-adaptive thoracic CT image registration method is proposed. We innovatively utilized the smooth Welsch’s function to redesign dynamically-updated dissimilarity metric and regularization within a hierarchical anatomical structure-aware registration framework. Compared to state-of-the-art methods, our approach achieved comparable performance in low-noise environments and outperformed them in high-noise environments, demonstrating the robustness of our method. Additionally, ablation studies were conducted to individually verify the effectiveness of the Welsch’s function, dynamic update strategy, and MM optimization method.
Although our method performed slightly worse than the DL model GRN, which is specifically designed for large-scale deformation optimization, on the conventional COPD dataset with significant deformations, our method still achieved second place. This indicates that our method, which focuses on optimizing registration robustness, is capable of handling large deformations. Furthermore, compared to GRN, our method outperformed in terms of metrics on other conventional/noisy datasets, demonstrating its stronger generalization ability.
Our method relies on segmentation. Thanks to the advancements in DL, it is possible to quickly segment the masks of lung and bone from conventional or noisy CT images. In this study, the segmentation of a single case could be accomplished within a mere 19 seconds. Similar to other segmentation-based registration methods (18,22), one limitation of our approach is that the registration accuracy is influenced by the segmentation precision. However, this limitation diminishes as the accuracy (51-53) and robustness (54,55) of medical image segmentation improve over time.
Although our method utilizes the NCC dissimilarity metric, which may vary in performance across different datasets (56), the divide-and-conquer strategy mitigates this issue by applying NCC only to suitable regions, thereby improving robustness.
Although our method belongs to optimization-based approaches, the registration of a single case takes an average of only about 1.5 minutes and does not require dataset-specific support, making it highly applicable. Despite the fact that our framework exhibits a certain level of complexity, it is important to note that all its steps can be automated. This allows for user-friendly clinical application, where the algorithm can be employed for end-to-end registration without requiring users to be concerned about the intricate implementation details.
This work can be considered as an improvement upon volumetric image registration methods, focusing specifically on the thoracic CT image registration by incorporating anatomical structure-aware techniques and optimizing parameter update strategies. Hence, we believe that the designed dissimilarity metrics and displacement field regularization based on the Welsch’s function can be easily transferred to other types of pixel/voxel image registration, as well as registration of point clouds or meshes, such as magnetic resonance imaging (MRI), ultrasound, camera-captured images, and in-silico medical phantoms (57-59), whether they are optimization-based or DL-based.
The proposed method is based on optimization, in contrast to popular DL-based registration methods. Consistent with the well-known “No Free Lunch Theorem” (60) in the field of optimization, although DL-based methods can reduce registration time to seconds, they require training for tens of hours and suffer from limited generalizability due to dataset constraints. Therefore, caution should be exercised when applying these methods to images with new features, especially when considering safety in clinical applications. Meanwhile, optimization-based methods, although relatively slower, offer higher interpretability and stronger generalization. In practical applications, the choice of method should be based on the clinical scenario. For example, when registering a single case for disease progression analysis or CT reconstruction, an optimization-based method can be selected. In contrast, for real-time continuous registration tasks such as tumor tracking, a DL-based method would be more suitable.
Despite the favorable outcomes obtained with our method, there are still some limitations. First, appropriate initialization of the Welsch parameters is still required to avoid convergence to undesired local minima. Second, the overall objective function is still non-convex. Furthermore, further developments are needed to handle inter-session (i.e., the CT-CBCT registration problem) and inter-patient registrations, such as initial image alignment, variations in field of view/resolution, and changes in anatomy that cannot be represented by deformation. Future plans include introducing image quality assessment and pre-registration for optimal parameter estimation, as well as exploring alternative dissimilarity metrics for constructing a convex quadratic objective function.
Conclusions
In this study, we propose an image-environment-adaptive registration method for thoracic CT images based on a hierarchical anatomical structure-aware framework, utilizing a dynamic Welsch’s function. The ample experimental results validate the robustness of the proposed method.
Acknowledgments
Funding: This work was supported by
Footnote
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-596/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Gorbunova V, Lol P, Ashraf H, Dirksen A, Nielsen M, de Bruijne M. Weight preserving image registration for monitoring disease progression in lung CT. Med Image Comput Comput Assist Interv 2008;11:863-70.
- Chassagnon G, Vakalopoulou M, Régent A, Sahasrabudhe M, Marini R, Hoang-Thi TN, Dinh-Xuan AT, Dunogué B, Mouthon L, Paragios N, Revel MP. Elastic Registration-driven Deep Learning for Longitudinal Assessment of Systemic Sclerosis Interstitial Lung Disease at CT. Radiology 2021;298:189-98. [Crossref] [PubMed]
- Wang T, He T, Zhang Z, Chen Q, Zhang L, Xia G, Yang L, Wang H, Wong STC, Li H. A personalized image-guided intervention system for peripheral lung cancer on patient-specific respiratory motion model. Int J Comput Assist Radiol Surg 2022;17:1751-64. [Crossref] [PubMed]
- Alam F, Rahman SU, Ullah S, Gulati K. Medical image registration in image guided surgery: Issues, challenges and research opportunities. Biocybern Biomed Eng 2018;38:71-89.
- Wang B, Wang DQ, Lin MS, Lu SP, Zhang J, Chen L, Li QW, Cheng ZK, Liu FJ, Guo JY, Liu H, Qiu B. Accumulation of the delivered dose based on cone-beam CT and deformable image registration for non-small cell lung cancer treated with hypofractionated radiotherapy. BMC Cancer 2020;20:1112. [Crossref] [PubMed]
- Fu Y, Lei Y, Wang T, Curran WJ, Liu T, Yang X. Deep learning in medical image registration: a review. Phys Med Biol 2020;65:20TR01. [Crossref] [PubMed]
- Wang T, Xia G, Li H, Qi C, Wang H. Patient specific respiratory motion model using two static CT images. In: Proceedings of the 2nd International Symposium on Artificial Intelligence for Medicine Sciences. 2021:488-92.
- Huang Y, Dong Z, Wu H, Li C, Liu H, Zhang Y. Deep learning-based synthetization of real-time in-treatment 4D images using surface motion and pretreatment images: A proof-of-concept study. Med Phys 2022;49:7016-24. [Crossref] [PubMed]
- Xiao H, Chen K, You T, Liu D, Zhang W, Xue X, Long L, Dang J. Real-Time 4-D-Cone Beam CT Accurate Estimation Based on Single-Angle Projection via Dual-Attention Mechanism Residual Network. IEEE Trans Radiat Plasma Med Sci 2023;7:618-629.
- You C, Zhao R, Liu F, Dong S, Chinchali S, Topcu U, Staib L, Duncan JS. Class-Aware Adversarial Transformers for Medical Image Segmentation. Adv Neural Inf Process Syst 2022;35:29582-96.
- You C, Zhao R, Staib L, Duncan JS. Momentum Contrastive Voxel-wise Representation Learning for Semi-supervised Volumetric Medical Image Segmentation. Med Image Comput Comput Assist Interv 2022;13434:639-52.
- You C, Zhou Y, Zhao R, Staib L, Duncan JS. SimCVD: Simple Contrastive Voxel-Wise Representation Distillation for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2022;41:2228-37. [Crossref] [PubMed]
- You C, Xiang J, Su K, Zhang X, Dong S, Onofrey J, Staib L, Duncan JS. Incremental Learning Meets Transfer Learning: Application to Multi-site Prostate MRI Segmentation. Distrib Collab Fed Learn Afford AI Healthc Resour Div Glob Health (2022) 2022;13573:3-16. [Crossref] [PubMed]
- Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV. VoxelMorph: A Learning Framework for Deformable Medical Image Registration. IEEE Trans Med Imaging 2019; Epub ahead of print. [Crossref]
- Chen L, Cao X, Chen L, Gao Y, Shen D, Wang Q, Xue Z. Semantic hierarchy guided registration networks for intra-subject pulmonary CT image alignment. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2020: 23rd International Conference, Lima, Peru, October 4-8, 2020, Proceedings, Part III 23. Springer International Publishing; 2020:181-9.
- Chen X, Meng Y, Zhao Y, Williams R, Vallabhaneni SR, Zheng Y. Learning unsupervised parameter-specific affine transformation for medical images registration. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part IV 24. Springer International Publishing; 2021:24-34.
- Liu F, Yan K, Harrison AP, Guo D, Lu L, Yuille AL, Huang L, Xie G, Xiao J, Ye X, Jin D. SAME: Deformable image registration based on self-supervised anatomical embeddings. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2021: 24th International Conference, Strasbourg, France, September 27-October 1, 2021, Proceedings, Part IV 24. Springer International Publishing; 2021:87-97.
- He Y, Wang A, Li S, Hao A. Hierarchical anatomical structure-aware based thoracic CT images registration. Comput Biol Med 2022;148:105876. [Crossref] [PubMed]
- Zhang Y, Yu H. Convolutional Neural Network Based Metal Artifact Reduction in X-Ray Computed Tomography. IEEE Trans Med Imaging 2018;37:1370-81. [Crossref] [PubMed]
- Brock KK, Mutic S, McNutt TR, Li H, Kessler ML. Use of image registration and fusion algorithms and techniques in radiotherapy: Report of the AAPM Radiation Therapy Committee Task Group No. 132. Med Phys 2017;44:e43-76. [Crossref] [PubMed]
- Hu R, Wang H, Ristaniemi T, Zhu W, Sun X, Lung CT. Image Registration through Landmark-constrained Learning with Convolutional Neural Network. Annu Int Conf IEEE Eng Med Biol Soc 2020;2020:1368-71. [Crossref] [PubMed]
- Hansen L, Heinrich MP. GraphRegNet: Deep Graph Regularisation Networks on Sparse Keypoints for Dense Registration of 3D Lung CTs. IEEE Trans Med Imaging 2021;40:2246-57. [Crossref] [PubMed]
- Rueckert D, Sonoda LI, Hayes C, Hill DL, Leach MO, Hawkes DJ. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans Med Imaging 1999;18:712-21. [Crossref] [PubMed]
- Schmidt-Richberg A, Ehrhardt J, Werner R, Handels H. Fast explicit diffusion for registration with direction-dependent regularization. In: Biomedical Image Registration: 5th International Workshop, WBIR 2012, Nashville, TN, USA, July 7-8, 2012. Proceedings 5. Springer Berlin Heidelberg; 2012:220-8.
- Delmon V, Rit S, Pinho R, Sarrut D. Registration of sliding objects using direction dependent B-splines decomposition. Phys Med Biol 2013;58:1303-14. [Crossref] [PubMed]
- Pock T, Urschler M, Zach C, Beichel R, Bischof H. A duality based algorithm for TV-L 1-optical-flow image registration. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Heidelberg: Springer Berlin Heidelberg; 2007:511-8.
- Vishnevskiy V, Gass T, Szekely G, Tanner C, Goksel O. Isotropic Total Variation Regularization of Displacements in Parametric Image Registration. IEEE Trans Med Imaging 2017;36:385-95. [Crossref] [PubMed]
- Gong L, Duan L, Dai Y, He Q, Zuo S, Fu T, Yang X, Zheng J. Locally Adaptive Total p-Variation Regularization for Non-Rigid Image Registration With Sliding Motion. IEEE Trans Biomed Eng 2020;67:2560-71. [Crossref] [PubMed]
- Wu Z, Rietzel E, Boldea V, Sarrut D, Sharp GC. Evaluation of deformable registration of patient lung 4DCT with subanatomical region segmentations. Med Phys 2008;35:775-81. [Crossref] [PubMed]
- Rietzel E, Chen GT. Deformable registration of 4D computed tomography data. Med Phys 2006;33:4423-30. [Crossref] [PubMed]
- Holland PW, Welsch RE. Robust regression using iteratively reweighted least-squares. Commun Stat-Theory Methods 1977;6:813-27.
- Zhang J, Yao Y, Deng B. Fast and Robust Iterative Closest Point. IEEE Trans Pattern Anal Mach Intell 2022;44:3450-66. [Crossref] [PubMed]
- Lange K. MM optimization algorithms. Philadelphia: Society for Industrial and Applied Mathematics; 2016.
- You C, Dai W, Min Y, Staib L, Duncan JS. Bootstrapping Semi-supervised Medical Image Segmentation with Anatomical-Aware Contrastive Distillation. Inf Process Med Imaging 2023;13939:641-53. [Crossref] [PubMed]
- You C, Dai W, Min Y, Staib L, Sekhon J, Duncan JS. ACTION++: Improving Semi-supervised Medical Image Segmentation with Adaptive Anatomical Contrast. Med Image Comput Comput Assist Interv 2023;14223:194-205.
- Ham B, Cho M, Ponce J. Robust image filtering using joint static and dynamic guidance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015:4823-31.
- Reddi SJ, Kale S, Kumar S. On the convergence of adam and beyond. arXiv:1904.09237. 2019. Available online: https://arxiv.org/abs/1904.09237
- Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer International Publishing. 2015:234-41.
- Hofmanninger J, Prayer F, Pan J, Röhrich S, Prosch H, Langs G. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. Eur Radiol Exp 2020;4:50. [Crossref] [PubMed]
- Castillo R, Castillo E, Guerra R, Johnson VE, McPhail T, Garg AK, Guerrero T. A framework for evaluation of deformable image registration spatial accuracy using large landmark point sets. Phys Med Biol 2009;54:1849-70. [Crossref] [PubMed]
- Castillo R, Castillo E, Fuentes D, Ahmad M, Wood AM, Ludwig MS, Guerrero T. A reference dataset for deformable image registration spatial accuracy evaluation using the COPDgene study archive. Phys Med Biol 2013;58:2861-77. [Crossref] [PubMed]
- Thanh D, Surya P. A review on CT and X-ray images denoising methods. Informatica 2019;43:151-9.
- Sandkühler R, Jud C, Andermatt S, Cattin PC. AirLab: autograd image registration laboratory. arXiv:1806.09907. 2018. Available online: https://arxiv.org/abs/1806.09907
- Jiang Z, Yin FF, Ge Y, Ren L. A multi-scale framework with unsupervised joint training of convolutional neural networks for pulmonary deformable image registration. Phys Med Biol 2020;65:015011. [Crossref] [PubMed]
- Fechter T, Baltas D. One-Shot Learning for Deformable Medical Image Registration and Periodic Motion Tracking. IEEE Trans Med Imaging 2020;39:2506-17. [Crossref] [PubMed]
- Fu Y, Lei Y, Wang T, Higgins K, Bradley JD, Curran WJ, Liu T, Yang X. LungRegNet: An unsupervised deformable image registration method for 4D-CT lung. Med Phys 2020;47:1763-74. [Crossref] [PubMed]
- Hering A, Häger S, Moltz J, Lessmann N, Heldmann S, van Ginneken B. CNN-based lung CT registration with multiple anatomical constraints. Med Image Anal 2021;72:102139. [Crossref] [PubMed]
- Mok TCW, Chung ACS. Large deformation diffeomorphic image registration with laplacian pyramid networks. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2020: 23rd International Conference, Lima, Peru, October 4-8, 2020, Proceedings, Part III 23. Springer International Publishing. 2020:211-21.
- Liu X, Qi CR, Guibas LJ. Flownet3d: Learning scene flow in 3d point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019:529-37.
- Heinrich MP. Closing the gap between deep and conventional image registration using probabilistic dense displacement networks. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13-17, 2019, Proceedings, Part VI 22. Springer International Publishing. 2019:50-8.
- You C, Yang J, Chapiro J, Duncan JS. Unsupervised wasserstein distance guided domain adaptation for 3d multi-domain liver segmentation. In: Interpretable and Annotation-Efficient Learning for Medical Image Computing: Third International Workshop, iMIMIC 2020, Second International Workshop, MIL3ID 2020, and 5th International Workshop, LABELS 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4-8, 2020, Proceedings 3. Springer International Publishing. 2020:155-63.
- You C, Dai W, Liu F, Min Y, Dvornek NC, Li X, Clifton DA, Staib L, Duncan JS. Mine Your Own Anatomy: Revisiting Medical Image Segmentation With Extremely Limited Labels. IEEE Trans Pattern Anal Mach Intell 2024; Epub ahead of print. [Crossref]
- You C, Dai W, Min Y, Staib L, Duncan JS. Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts. Med Image Comput Comput Assist Interv 2023;14222:561-71.
- You C, Dai W, Min Y, Liu F, Clifton DA, Zhou SK, Staib L, Duncan JS. Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective. Adv Neural Inf Process Syst 2023;36:9984-10021.
- You C, Mint Y, Dai W, Sekhon JS, Staib L, Duncan JS. Calibrating multi-modal representations: A pursuit of group robustness without annotations. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. 2024:26140-50.
- Kaso A. Computation of the normalized cross-correlation by fast Fourier transform. PLoS One 2018;13:e0203434. [Crossref] [PubMed]
- Wissmann L, Santelli C, Segars WP, Kozerke S. MRXCAT: Realistic numerical phantoms for cardiovascular magnetic resonance. J Cardiovasc Magn Reson 2014;16:63. [Crossref] [PubMed]
- Neelakantan S, Xin Y, Gaver DP, Cereda M, Rizi R, Smith BJ, Avazmohammadi R. Computational lung modelling in respiratory medicine. J R Soc Interface 2022;19:20220062. [Crossref] [PubMed]
- Neelakantan S, Mukherjee T, Smith BJ, Myers K, Rizi R, Avazmohammadi R. In-silico CT lung phantom generated from finite-element mesh. Proc SPIE Int Soc Opt Eng 2024;12928:1292829.
- Adam SP, Alexandropoulos SAN, Pardalos PM, Vrahatis MN. No free lunch theorem: A review. In: Demetriou IC, Pardalos PM. editors. Approximation and Optimization: Algorithms, Complexity and Applications. Cham: Springer; 2019:57-82.