Robust thoracic CT image registration with environmental adaptability using dynamic Welsch’s function and hierarchical structure-awareness strategy
Original Article

Robust thoracic CT image registration with environmental adaptability using dynamic Welsch’s function and hierarchical structure-awareness strategy

Ziwen Wei1,2 ORCID logo, Xiaolong Wu1, Ligang Xing3 ORCID logo, Jinming Yu3 ORCID logo, Junchao Qian1,2,3 ORCID logo

1Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Hefei Cancer Hospital, Chinese Academy of Sciences, Hefei, China; 2Science Island Branch, Graduate School of University of Science and Technology of China, Hefei, China; 3Department of Radiation Oncology, School of Medicine, Shandong University, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China

Contributions: (I) Conception and design: Z Wei; (II) Administrative support: J Qian, J Yu, L Xing; (III) Provision of study materials or patients: Z Wei; (IV) Collection and assembly of data: Z Wei, X Wu, L Xing; (V) Data analysis and interpretation: Z Wei, X Wu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Junchao Qian, PhD. Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Hefei Cancer Hospital, Chinese Academy of Sciences, Shushanhu Road 350, Hefei 230031, China; Science Island Branch, Graduate School of University of Science and Technology of China, Hefei, China; Department of Radiation Oncology, School of Medicine, Shandong University, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jiyan Road 440, Jinan 250117, China. Email: qianjunchao@hmfl.ac.cn; Jinming Yu, MD. Department of Radiation Oncology, School of Medicine, Shandong University, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jiyan Road 440, Jinan 250117, China. Email: sdyujinming@163.com.

Background: Robust registration of thoracic computed tomography (CT) images is strongly impacted by motion during acquisition, high-density objects, and noise, particularly in lower-dose acquisitions. Despite the enhanced registration speed achieved by popular deep learning (DL) methods, their robustness is often neglected. This study aimed to develop a robust thoracic CT image registration algorithm to address the aforementioned issues.

Methods: A novel, anatomical structure-aware hierarchical registration. By this method, employing a divide-and-conquer approach, dissimilarity metrics, and regularization terms are selected for different regions based on their distinct image features and motion patterns. These terms are then innovatively reconstructed using the Welsch’s function, which allows control over the penalty distribution on the loss values. Subsequently, a novel Welsch parameter update strategy is designed for the task of thoracic CT image registration, enabling dynamic sparsity in registration from coarse to fine levels to accommodate various levels of noise and sliding motion. Moreover, the majorization-minimization (MM) algorithm is used to handle the Welsch terms by constructing surrogate functions based on the current variable values for variable update, thereby reducing the complexity of optimization.

Results: Experimental results on publicly available deformable image registration lab four-dimensional CT (DIR-Lab 4DCT) and chronic obstructive pulmonary disease (COPD) datasets with and without noise, showed that our proposed method achieves comparable performance to state-of-the-art methods in noise-free scenarios [1.14 and 1.19 mm compared to 1.14 and 1.35 mm target registration errors (TREs)], while demonstrating superior robustness in the presence of noise (1.78 and 2.38 mm compared to 2.00 and 3.31 mm TREs). Ablation studies also validated the effectiveness of each component in the method.

Conclusions: A novel and robust algorithm for thoracic CT image registration has been proposed, which has significant potential for valuable clinical applications, including surgical quantitative imaging.

Keywords: Thoracic computed tomography registration (thoracic CT registration); Welsch’s function; robust registration; majorization-minimization (MM); anatomical structure-aware strategy


Submitted Mar 24, 2024. Accepted for publication Sep 29, 2024. Published online Nov 29, 2024.

doi: 10.21037/qims-24-596


Introduction

Thoracic computed tomography (CT) image registration serves as a fundamental basis for various clinical applications, including disease progression estimation (1,2), image-guided surgery (3,4), radiation-therapy planning (5), tumor tracking (6), and real-time CT reconstruction (7-9). All these applications rely on precise and robust estimation of spatial anatomical correspondences. For instance, in the field of surgical quantitative imaging, aligning anatomical structures over time is essential for precise monitoring of disease progression and tumor radiotherapy responses. Additionally, in surgical planning, precise registration of preoperative and intraoperative images is critical for accurate navigation and targeting throughout the procedure. Registration methods can generally be classified into two categories: optimization-based and deep learning (DL)-based. DL-based methods initially dominated the field of medical image segmentation (10-13), and in recent years, with the introduction of VoxelMorph (14), they have rapidly expanded into the field of medical image registration (15-17). Although DL-based registration methods have improved efficiency, there is still room for improvement in robustness and generalization compared to optimization-based methods (18).

Within the scope of thoracic CT imaging, different anatomical structures exhibit varied image features. For instance, the air within the lungs and other internal organs possesses relatively homogeneous intensities, whereas intensity variations are observed in the blood vessels within the lungs. Therefore, selecting an appropriate and consistent dissimilarity metric is challenging.

Moreover, it is necessary to consider the different motion patterns in motion-sensitive regions. For example, during CT imaging in the supine position, the spine exhibits rigid motion, whereas other regions, such as the heart, liver, and ribs, demonstrate smooth motions (18). This further complicates the selection of an appropriate regularization method for the displacement field.

Additionally, thoracic CT images often suffer from artifacts and noise due to the inherent invisibility of internal structures and limitations in imaging devices, imaging doses, and scanning conditions. For instance, cone-beam CT (CBCT) imaging generates high noise due to low doses, and the presence of implanted metal objects in the patient’s body can cause noticeable star-shaped or streak artifacts (19). In many scenarios, it is necessary to perform registration of high-noise CT images, such as in image-guided radiation therapy, where the pre-treatment low-noise CT image needs to be registered to the low-dose acquired high-noise real-time CBCT image to clearly visualize the target anatomical structures (3,20). Therefore, these challenges pose significant obstacles to robust thoracic CT image registration. To enhance the safety and reliability of thoracic CT image registration in clinical applications, it is imperative to develop a robust registration algorithm.

Related work

For various image features in different regions, Hu et al. (21) incorporated a structural alignment constraint, specifically the branch landmarks of the vessels and airway, into the lung registration process. Liu et al. (17) further proposed a mechanism to enhance anatomical embedding for calculating global anatomical point correspondences. However, these methods often focus solely on the lung region (22), disregarding or sacrificing the registration accuracy in other regions. Therefore, He et al. (18) proposed a deformation-aware dissimilarity metric, which utilizes dissimilarity metrics tailored to the deformation characteristics of each specific region, resulting in improved global registration accuracy. However, their method does not account for noise commonly caused by low-dose CT.

For various motion patterns in different regions, the commonly used regularization methods (23) aimed at achieving smooth deformations may result in unrealistic deformations at sliding boundaries of organs. To address this issue, Schmidt-Richberg et al. (24) and Delmon et al. (25) divided the sliding boundaries of organs, allowing for discontinuous deformations. Furthermore, to explicitly constrain sliding motion, Pock et al. (26) proposed total variation (TV) regularization utilizing the l1-norm, assuming sliding motion takes place when the updated deformation exceeds a predetermined threshold. In order to improve registration speed, Vishnevskiy et al. (27) introduced the parametric-TV (pTV) method, which integrates TV regularization into the free-form deformation (FFD) B-spline framework to apply constraints to the control points and achieve a significant reduction in the solution space. Recently, Gong et al. (28) further enhanced the pTV method by introducing local adaptive regularization, where the level of the norm is defined based on the distance to the sliding surface. However, their dissimilarity metric is global and only applied at the coarse level, failing to account for varying degrees of sliding motion that may occur simultaneously.

To enhance the robustness of registration, most methods (29,30) employ a masking strategy to restrict the deformation within certain regions and avoid interference from other regions. Nevertheless, these approaches do not explicitly enhance robustness by attenuating noise. Recently, Zhang et al. introduced the Welsch’s function (31) into rigid point cloud alignment (32), directly mitigating the influence of noise and enhancing robustness. However, their method is only applicable to rigid point cloud alignment, which differs significantly from non-rigid volumetric image registration in terms of the construction of optimization objectives, and there has been limited research on the robustness of methods in recent volumetric image registration, especially those based on DL.

Contribution

In this work, we propose a robust CT image registration algorithm. The key idea of our approach is to leverage Welsch’s function (31) to enforce sparsity, as well as the adoption of a divide-and-conquer strategy to handle deformations in different regions. Experimental results validate the accuracy and robustness of our method.

In summary, our contributions mainly consist of:

  • We devised a novel hierarchical anatomical structure-aware thoracic CT image registration scheme that specifically targets intra-patient and intra-session registrations. By leveraging different image features, we designed specific dissimilarity metrics for individual regions. Furthermore, considering various motion patterns, appropriate regularization was tailored for each region.
  • We formulated an optimization problem for volumetric image registration, wherein we innovatively used the smooth Welsch’s function with sparse control capability to reconstruct the dissimilarity metric and regularization term in this task. The majorization-minimization (MM) algorithm (33) was utilized to handle the Welsch term. This algorithm constructs a surrogate function of the objective function based on the current variable values and minimizes this surrogate functions to update the variables, thereby reducing the complexity of the optimization process.
  • We devised a novel dynamic parameter update scheme for the Welsch’s function to achieve image-environment-adaptive registration by adjusting tolerance for noise and sliding motion.
  • Experimental results demonstrated that our proposed method achieves comparable performance to state-of-the-art methods in noise-free scenarios, while demonstrating superior robustness in the presence of noise.

Methods

Overview

As shown in Figure 1, a hierarchical strategy is employed, where different regions are merged onto the control lattice, enabling coarse-to-fine image registration through an elaborate Gaussian pyramid to cope with large-scale deformations.

Figure 1 Overview of our FFD-based hierarchical framework for robust thoracic CT image registration. The moving and fixed images are segmented to generate masks for dissimilarity metric and displacement field regularization. A five-level Gaussian pyramid is constructed for all images, with increasing resolution as the pyramid level increases. Registration is performed iteratively, starting from the lower levels and progressing towards higher levels for coarse-to-fine registration. At each level, the control lattice is utilized to govern the displacement field of the entire image, and an objective function using Welsch’s function is constructed based on the dissimilarity metric and regularization. The objective function is optimized using an optimizer, with the Welsch parameters updated after N iterations. This process yields the optimal displacement field for the current level which is subsequently upsampled to serve as the initial displacement field for the next level. Finally, the optimized displacement field obtained from the last level is applied to the moving image to obtain the registered image. FFD, free-form deformation; CT, computed tomography.

Problem formulation

Let Ω={(x,y,z)|0xX,0yY,0zZ} represent the domain of the CT image, where X, Y, and Z respectively denote the length, width, and depth dimensions. The objective of the registration process is to determine a three-dimensional (3D) displacement field d:Ω3 that effectively aligns the moving image IM:Ω with the fixed image IF:Ω. The estimation of the displacement field is commonly approached as an optimization problem, given by:

d*=argmindF(d)=argmindD(Ω;d;IF,IM)+αR(d)

where F is the objective function to be minimized, D is an image dissimilarity metric term, and R is a displacement regularization term, whereas α is the weight of regularization.

Image dissimilarity metric term

The pursuit of a globally unified dissimilarity metric often entails sacrificing local registration accuracy in exchange for overall precision (18). Inspired by (34,35), we adopted an anatomical structure-aware dissimilarity metric framework.

For the bone region, it primarily undergoes rigid displacement or smooth deformation as a whole, thus necessitating a holistic assessment (18). Normalized cross-correlation (NCC), a statistical approach for quantifying the correlation between random variables, emerges as a suitable method for evaluating skeletal regions. NCC can be expressed as:

DNCC(Ω;d;IF,IM)=xΩ(IF(x)IF¯)(T(IM,d)(x)IM¯)xΩ(IF(x)IF¯)2xΩ(T(IM,d)(x)IM¯)2

where x3 is the voxel position in the image, T(IM,d)(x) represents the intensity value of the voxel at x in the image IM  after undergoing a transformation through the displacement vector field d, and IF¯ and IM¯ represents the average intensity of IF and IM, respectively.

For the lung region, the majority of air within the lungs in CT images exhibits homogeneity, makes it unsuitable for guiding registration. Conversely, the rich vascular structure in the lung can serve as a reliable guide for localization. The normalized gradient field (NGF), based on normalized image gradients, is independent of specific intensity values, making it well-suited for capturing the local structure of blood vessels and an appropriate metric of dissimilarity in lung region. NGF can be expressed as:

DNGF(Ω;d;IF,IM)=1|Ω|xΩ(1(IF(x),T(IM,d)(x)ηIF(x)ηT(IM,d)(x)η)2)

where f,gη=j=13fjgj+η2,fη=f,fη, |Ω| is the number of voxels in the image, η is a small value to avoid numerical error, x and T(IM,d)(x) are defined the same as they are in DNCC, and gradient =(1,2,3), i is the i-th component of the derivative operator.

For the remaining regions, the intensity values within organs and soft tissues are mostly uniform, making these homogeneous regions unsuitable for guiding registration. However, the edges formed by the intensity differences between different soft tissues and organs can be utilized for localization guidance. Therefore, we also employed NGF as a metric of dissimilarity in this region.

Additionally, factors such as motion artifacts and metal artifacts may deteriorate the registration accuracy. Inspired by recent advancements in robust rigid point cloud alignment (32), this study introduces the Welsch’s function (31) as a robust metric to dynamically adjust sensitivity to noise at different optimization stages. The following WelschNGF (WN) formulation is derived:

DWNGF(Ω;d;IF,IM)=1|Ω|xΩψνd(1IF(x),T(IM,d)(x)ηIF(x)ηT(IM,d)(x)η)

where νd>0 is a user-specified parameter, and

ψν(x)=1exp(x22ν2)

As illustrated in Figure 2A, the function ψν exhibits a monotonically increasing behavior on the interval [0,+), while being upper-bounded by 1. Moreover, as νd0, Eq. [5] approaches the l0-norm. Consequently, it can penalize dissimilarity without being overly sensitive to large dissimilarities caused by outliers. Furthermore, compared to the original form, this formulation also distinguishes cases where the angle between the gradients of corresponding point intensities exceeds π/2. Additionally, as demonstrated in the “Numerical optimization” section later, an efficient optimization approach can be designed for the Welsch term.

Figure 2 Graph of the Welsch function. (A) Function ψν with different values of the parameter ν. (B) Different surrogate functions for the function ψν with ν=4.

Displacement field regularization term

Similar to the “Image dissimilarity metric term” section, we also employ anatomical structure-aware regularization to obtain plausible deformations.

For the spine region, due to the minimal movement of the spine, direct penalty is imposed on the displacement itself. Specifically, parameter non-gradient regularization (pNG) based on the convex squared l2-norm is employed to minimize spinal displacement.

RpNG(p)=1|Ω|xΩp(x)22

where Ω represents the control lattice on the image, |Ω| is the total number of control points. p(x) corresponds to the displacement vector at x3.

For the remaining region, the difference between displacement vectors of adjacent control points are typically penalized using the q-th power of the lp-norm. In this study, we incorporate the smooth Welsch’s function into the regularization term to enhance robustness,

RWelsch(p)=1|Ω|xΩψνr(p(x)2)

where νr>0 is a user-specified parameter.

Numerical optimization

Efficient optimization of terms utilizing Welsch’ function can be achieved using the MM algorithm (33). Specifically, given the current variable values p(k) during iteration, the MM algorithm can construct a surrogate function F(p|p(k)) for the objective function F such that

F(p(k)|p(k))=F(p(k))F(p|p(k))F(p)pp(k)

Subsequently, the variables undergo updates through minimizing the surrogate function

p(k+1)=argminpF(p|p(k))

This ensures a reduction of the objective function during each iteration, as Eq. [8] and Eq. [9] imply that F(p(k+1))F(p(k+1)|p(k))F(p(k)|p(k))=F(p(k)).

Therefore, the numerical optimization is assured to attain a local minimum (33).

Surrogate function

At point y, a convex quadratic surrogate function F(p|p(k)) can be formulated for the Welsch’s function ψν (36) (see Figure 2B):

ψν(x|y)=ψν(y)+1ψν(y)2ν2(x2y2)

The function bounds Welsch’s function from above, and the graphs of two functions intersect at x=y. A readily optimized convex quadratic surrogate term is obtained by applying it to RWelsch in Eq. [7]:

RSWelsch(p|p(k))=12|Ω|νr2xΩexp(p(k)(x)222νr2)p(x)22

Note that in Eq. [11], certain constant terms that do not affect the optimization were disregarded. Similarly, by applying Eq. [11] to DWNGF in Eq. [4] and disregarding constant terms, a surrogate term can be obtained:

DSWNGF(Ω;d|d(k);IF,IM)=12|Ω|νd2×xΩexp(12νd2(1IF(x),T(IM,d(k))(x)ηIF(x)ηT(IM,d(k))(x)η)2)           ×(1IF(x),T(IM,d)(x)ηIF(x)ηT(IM,d(k))(x)η)2

In the “Ablations studies” section, we present a comparative analysis between this surrogate function and the original function DWNGF to evaluate its potential in reducing optimization complexity.

Numerical minimization

To address the displacement field folding issue, we propose incorporating dissimilarity metrics and regularization for different regions into a unified objective function by summing up the individual terms. Next, the objective function F is minimized using the AMSGrad optimizer (37).

Dynamic update strategy

The parameters νd and νr play crucial roles in the robustness of our method. The surrogate functions in Eq. [10] and Eq. [11] indicate that we are minimizing a quadratic error function, where the weights of the WN term and Welsch regularization (WR) term dynamically update as Gaussian functions of the current loss norms, with standard deviations νd and νr, respectively. On the one hand, νd and νr should be small enough in the final stage of optimization, enabling the Gaussian weights to proficiently attenuate the influence of larger errors caused by noise. On the other hand, fixing νd and νr at small values would reduce the effectiveness of optimization because many corresponding points have large loss at the beginning and are disregarded due to their tiny Gaussian weights.

Based on this, we gradually decrease νd and νr during the optimization. Specifically, for each iteration of optimization that yields d(k), we set νd to the mean NGF dissimilarity of all corresponding points transformed by d(k) for the next iteration. According to the well-known 3-σ principle, when the NGF spatial distance is less than νd, the weight is larger, allowing for a sufficient number of terms in each optimization step, whereas when it exceeds 3νd, the weight becomes small enough to exclude noisy corresponding points from registration. As the optimization progresses and the average dissimilarity gradually decreases, νd also decreases. For νr, when it is in the first level of the Gaussian pyramid, the number of control points is limited due to the low resolution. Strict control of sliding is necessary to prevent error propagation and accumulation between levels. We set the initial value to 100/2sl, where s represents the lattice spacing and l represents the average voxel spacing. Since the largest sliding occurs at the lung boundary (18), as shown in Figure 3, where the cubes represent voxels, we assume that the lung boundary (blue) can shift by up to 100 mm along each axis parallel to the sliding plane, while the spine (gray) remains stationary. Consequently, the l2-norm gradient penalty of the displacement vector of the boundary voxel (i.e., the middle voxel) is 100/2sl. Thus, the initial value of νr is sufficient to strictly penalize sliding. In the subsequent levels of the Gaussian pyramid, νr is first updated with the new voxel spacing l and then reduced to a quarter of its own value to penalize less sliding.

Figure 3 Schematic representation of voxels at the boundary between the lung (blue) and the spine (gray). Assuming that the left, top, bottom, front, and back neighboring voxels of the middle voxel undergo a 100 mm displacement in the axes parallel to the sliding plane (orange arrow), while the right neighboring spine voxel remains stationary. For the middle voxel: x=(0,100/(2sl),100/(2sl)),y=(0,0,0),z=(0,0,0), 2=100/(2sl) where s represents the lattice spacing, represents the voxel spacing.

Thoracic anatomical structure segmentation

Segmentation for dissimilarity metric

The segmentation of lung region (blue), bone region (red), and the remaining region (gray) is performed (see Figure 4A). Specifically, for the lung region, we employ a model based on U-Net (38) proposed by Hofmanninger et al. (39) for automatic segmentation, ensuring that all parameters align with the original open-source code provided by their work. The hyperparameters, architecture, and training methods of the model can be referenced in (39). This model was trained on a large and diverse dataset encompassing extensive visual variations. For the bone region, due to its higher intensity compared to other areas, we can obtain the bone mask rapidly through threshold-based segmentation. Specifically, binarization is performed using a threshold of 1,200 Hounsfield units (HU), followed by applying morphological closing operations. Additionally, small fragments are removed by eliminating connected regions with areas smaller than 10. Finally, the remaining region is obtained by subtracting the lung and bone masks from the entire image.

Figure 4 The segmentation example in the thoracic CT image. (A) The masks for dissimilarity metrics. The blue, red, and gray regions represent the lung, the bone, and the remaining region, respectively. (B) The masks for regularization. The red, blue, and gray regions represent the morphologically dilated lung, the spine, and the remaining region, respectively. CT, computed tomography.

Segmentation for displacement regularization

As shown in Figure 4B, the segmentation of spine region (blue) and the remaining region (gray and red) is performed for applying different regularization to these regions (see the “Problem formulation” section). To begin with, a morphological dilation operation is applied to the lung mask to obtain the expanded lung (red). Subsequently, the bone and the expanded lung undergo Boolean operations, followed by extraction of the largest connected subregion. This subregion is then merged with the central region of the original bone mask, yielding the generation of a spine mask. Finally, the remaining region is obtained by subtracting the spine mask from the entire image.

Hierarchical strategy

The thoracic CT images exhibit large-scale deformation, and direct registration may lead to undesired local minima. Therefore, as illustrated in Figure 1, based on the given number of levels, we embed the dissimilarity metric and regularization masks into a Gaussian pyramid.

Figure 5 presents the pseudocode of our robust CT image registration algorithm, where L denotes the number of levels in the Gaussian pyramid, vdl and vdr are parameters of the Welsch function in the real-time updated dissimilarity metrics for the lung and remaining regions, and Ms and Mr are binary masks for the spine and remaining regions.

Figure 5 Pseudocode of the proposed robust thoracic CT images registration algorithm. CT, computed tomography; NGF, normalized gradient field.

Results

Dataset preparation

Conventional dataset

Experiments were conducted on the publicly available datasets deformable image registration lab four-dimensional CT (DIR-Lab 4DCT) (40) and chronic obstructive pulmonary disease (COPD) (41). Each dataset consisted of 10 cases, with 300 anatomical landmarks per phase. Registration was performed between the end-expiration and end-inspiration phases, and the average TRE was calculated for the 300 anatomical landmarks (ground-truth label). Since landmarks only exist within the lungs, although our method also models registration of bony anatomy outside the lungs and deformation of soft tissues, we evaluated the overall registration performance using root mean square error (RMSE), calculated as follows:

RMSE=1X×Y×ZxIM(x)IF(x)2

Compared to the 4DCT dataset, the COPD dataset exhibited larger motion range (see Figure 6).

Figure 6 Displacement fields for 4DCT case1 (A) and COPD case 10 (B) are color-coded with the magnitude of displacement. 4DCT, four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease.

Dataset with noise

Registration robustness is crucial as there are many scenarios where CT images need to be registered to lower-resolution and noisier images, such as in image-guided radiation therapy when preoperative CT images are registered to intraoperative CBCT images. To evaluate the robustness of our method, we applied the algorithm proposed in (19) to add simulated metal artifacts and Poisson noise to the 4DCT and COPD datasets. These two types of noise are the most common in CT imaging and are simulated in various studies involving low-dose CT (19,42). The variance of Poisson noise is set to 12, and for the detailed generation process and parameters of metal artifacts, please refer to (19). An example of noise addition is shown in Figure 7, where Figure 7A represents a slice image of COPD case 10, and Figure 7B is the image after adding noise. In the robustness validation experiment, the original datasets were registered to the noisy datasets. Furthermore, the TRE results on the 4DCT dataset with noise will be tested by reducing resolution and employing a Gaussian filter, used respectively for validating the method’s effectiveness on low-resolution images and comparing with denoising followed by registration. As shown in Figure 7, Figure 7C depicts the outcome of Figure 7B being reduced to one-fourth of the original resolution, whereas Figure 7D shows the result of Figure 7B being subjected to a Gaussian filter with a kernel size of 11.

Figure 7 Comparison between the original image and the image with added simulated noise and metal artifacts. (B) The image is the result of adding noise to (A). (C) The image is the result of directly resizing both the length and width of (B) to half of the original dimensions. (D) The image showcases the output of applying a Gaussian filter with a kernel size of 11 to (B).

Implementation details

The entire registration procedure was implemented based on the medical image registration framework provided by AirLab (43). All images were resampled to isotropic 1×1×1 mm3 voxel spacing and subsequently cropped to eliminate redundant background. The image intensities were clipped between 0 and 2,000 HU for handling noise points, then converted to double precision and scaled to the range [0, 1]. Due to the higher intensity values of the images in the dataset compared to typical CT images, the upper and lower bounds of our clip operation were correspondingly increased. Bilinear interpolation was employed for image deformation. The dissimilarity metric weights for lung, bone, and the remaining region were set to 2.5, 0.03, and 0.03, respectively, whereas the displacement field regularization weights for the spine and the remaining region were set to 0.04 and 0.04, respectively. A five-level Gaussian pyramid with an upsampling factor of 0.5 was utilized. The spacing for control points was set to 7×7×7 voxels. The AMSGrad optimizer (37) was employed to optimize the objective function. The initial step size of the optimizer was set to 0.005, and the iteration counts for the five Gaussian pyramid levels were 800, 800, 800, 100, and 50, respectively. The convergence threshold ϵ was set to 10−5. Using the PyTorch 1.8.1+cu111 library (https://pytorch.org/) and running on an NVIDIA RTX 3090 GPU (NVIDIA, Santa Clara, CA, USA), the average registration time was 96 seconds.

Parameter sensitivity

As shown in Figure 8, the sensitivities of dissimilarity metric weights, regularization weights, Gaussian pyramid levels, and control point lattice spacing were tested on the 4DCT DIR dataset with 300 landmarks.

Figure 8 Parameter sensitivity analysis for (A) the lung dissimilarity metric weight, (B) the regularization weight, (C) the number of levels in the Gaussian pyramid and (D) the lattice spacing. TRE, target registration error.

Following (18), in the initial setup, the weights for the bone region and the remaining regions in the dissimilarity metric were both assigned a value of 0.03. The regularization weight for each region was set to 0.04, the number of levels in the Gaussian pyramid was 5, and the lattice spacing was 7. Following this, each hyperparameter was individually optimized, and the best values were then fixed for the subsequent hyperparameter search. First, the weight for the dissimilarity metric in the lung region was then tested within the range of [0.25, 4]. The blue curve in Figure 8 shows that initially, a larger weight led to better TRE, but after reaching 2.5, the TRE slightly increased. Therefore, a weight of 2.5 was set for the dissimilarity metric in the lung region. Next, the weights for the regularization in different regions were tested within the range of [0.001, 0.07], following (18) where the weights were set to the same value. The red curve in Figure 8 demonstrates that a weight of 0.04 yielded good results. Therefore, a weight of 0.04 was chosen for the regularization in this study. Then, the green curve in Figure 8 shows that initially, TRE decreased as the number of Gaussian pyramid levels increased, but after reaching five levels, TRE became insensitive to further increases in the number of levels. At this point, the computational cost significantly increased without a significant improvement in TRE. To enhance registration efficiency, five levels were selected. Furthermore, as depicted by the yellow curve in Figure 8, both too few and too many control points could lead to a decrease in registration accuracy. Therefore, this study opted for a relatively optimal control point lattice spacing of seven voxels. Note that we only optimized the hyperparameters on the publicly available dataset to ensure fairness in the comparisons on the non-public datasets.

Comparison methods

For the conventional dataset, we compared our algorithm with state-of-the-art methods, including MJ-CNN (44), One-Shot (45), LRN (46), Hering2021 (47), GRN (22), He (18), LapIRN (48), FlowNet (49), PDDN (50), and VM (11). Most of the comparison methods only reported results on the 4DCT dataset and did not provide results on the COPD dataset, so not all methods were compared on the COPD dataset.

None of the compared methods reported results on datasets with noise. Therefore, for the noisy dataset, to compare with state-of-the-art methods, two optimization-based methods [pTVReg (27) and He (18)] and one DL-based method [GRN (22)] were reproduced using their respective open-source codes [Pytorch 1.8.1+cu111 for He and GRN, Matlab R2021a (MathWorks, Natick, MA, USA) for pTVReg] for fair comparison. DL-based registration models often have limited generalization capabilities (18). To ensure fairness, we conducted unsupervised training of the model using a combination of conventional and noisy mixed datasets, which took around 7 hours on our graphics card.

Ablations studies

NoWelsch

NoWelsch refers to not using the Welsch’s function. In this case, the square of the modified NGF {as modified in Eq. [4]} is used for dissimilarity metric in non-bone regions, and the squared l2-norm is used for regularization in non-bone regions. NoWelsch is also equivalent to fixing the Welsch parameters νd and νr at larger values.

WR

Compared to NoWelsch, the regularization for non-bone regions was replaced with the WR in Eq. [7] to evaluate the effect of WR.

Surrogate WR (SWR)

Compared to WR, the regularization for non-bone region was replaced with the surrogate function in Eq. [11], to evaluate the effect of using a surrogate function in SWR.

WN + SWR

Compared to SWR, the dissimilarity metric in non-bone region was replaced with the WN in Eq. [4] to verify the effectiveness of the WN dissimilarity metric.

Ours

The method was ultimately employed in this work. Compared to WN + SWR, the dissimilarity metric in the non-bone region was replaced with the surrogate function in in Eq. [12] to evaluate the effect of using a surrogate function in WN dissimilarity metric.

Experimental results

TRE

Table 1 presents the TRE on the conventional dataset. In the 4DCT dataset, our method achieved the best average TRE of 1.14 mm and standard deviation of 0.11 mm. In the COPD dataset, our method achieved the second-best average TRE of 1.69 mm and standard deviation of 0.42 mm, and outperformed the recent optimization-based registration method (He with TRE of 2.05 mm and standard deviation of 0.99 mm). The DL-based registration method, GRN, achieved the best TRE (1.35 mm) on the COPD dataset, yet performed poorly on the noisy dataset (see Table 2).

Table 1

TRE results (in mm) of the DIR-Lab 4DCT (40) and COPD datasets (41)

Case Init. MJ-CNN (44) One-Shot (45) LRN (46) Hering 2021 (47) GRN (22) He (18) Lap IRN (48) FE+ (49) PD DN (50) VM (11) NoWelsch WR SWR WN + SWR Ours
4DCT
   01 3.89 1.20 1.21 0.98 0.99 0.86 0.97 1.00 2.20 0.90 1.46 1.05 1.03 1.05 1.05 1.04
   02 4.34 1.13 1.13 0.98 0.98 0.90 0.91 1.28 3.89 0.91 1.51 0.99 0.99 1.01 1.00 0.98
   03 6.94 1.30 1.32 1.14 1.11 1.06 1.04 2.18 2.71 1.06 2.31 1.11 1.12 1.02 1.04 1.05
   04 9.83 1.55 1.84 1.39 1.37 1.45 1.37 3.05 2.95 1.66 2.72 1.41 1.40 1.31 1.30 1.29
   05 7.48 1.72 1.80 1.43 1.32 1.60 1.36 2.36 3.03 1.68 2.69 1.38 1.35 1.35 1.34 1.31
   06 10.89 2.02 2.30 2.26 1.15 1.59 1.16 1.78 3.36 1.86 3.07 1.18 1.18 1.15 1.20 1.16
   07 11.03 1.70 1.91 1.42 1.05 1.74 1.18 2.24 3.10 1.94 3.01 1.15 1.15 1.15 1.15 1.16
   08 14.99 2.64 3.47 3.13 1.22 1.46 1.20 2.24 2.94 1.79 6.22 1.25 1.27 1.23 1.21 1.19
   09 7.92 1.51 1.47 1.27 1.11 1.58 1.19 2.26 2.86 1.94 2.94 1.20 1.17 1.20 1.15 1.15
   10 7.30 1.79 1.79 1.93 1.05 1.71 1.07 1.90 2.99 2.03 3.00 1.16 1.15 1.08 1.08 1.06
   Avg 8.46 1.66 1.83 1.59 1.14 1.40 1.14 2.03 3.00 1.57 2.89 1.19 1.18 1.16 1.15 1.14
   Std 3.33 0.42 0.65 0.64 0.13 0.31 0.14 1.25 0.55 0.41 0.42 0.13 0.12 0.11 0.11 0.11
COPD
   01 26.33 / / / / 1.38 1.12 6.85 4.89 2.57 9.95 1.39 1.35 1.36 1.39 1.39
   02 21.79 / / / / 2.09 3.20 6.90 7.30 4.01 9.96 3.75 3.30 3.25 2.50 2.48
   03 12.64 / / / / 1.22 1.20 1.51 2.89 1.46 4.41 1.34 1.32 1.30 1.31 1.30
   04 29.58 / / / / 1.58 2.07 6.38 5.46 2.19 7.08 1.88 1.93 1.91 1.89 1.86
   05 30.08 / / / / 1.37 1.50 6.81 5.19 2.22 9.19 1.94 1.86 1.89 1.65 1.63
   06 28.46 / / / / 1.10 2.24 4.19 5.53 1.89 8.12 1.54 1.54 1.55 1.50 1.49
   07 21.60 / / / / 1.19 1.22 2.73 4.40 1.62 7.10 1.25 1.21 1.25 1.22 1.19
   08 26.46 / / / / 1.19 3.77 4.32 3.94 1.72 7.92 2.46 2.31 1.74 1.55 1.52
   09 14.89 / / / / 0.99 1.11 3.60 3.57 1.51 6.93 2.42 2.31 2.30 2.28 2.27
   10 21.81 / / / / 1.38 3.03 6.59 4.44 2.43 9.16 5.36 4.64 3.86 1.81 1.80
   Avg 23.36 / / / / 1.35 2.05 7.98 4.99 4.76 2.16 2.33 2.18 2.04 1.71 1.69
   Std 5.99 / / / / 0.31 0.99 1.70 1.98 1.23 0.76 1.30 1.07 0.87 0.42 0.42

Metrics of the comparison methods were taken from the literature. , the best; , the second best. “/” indicates the results on COPD dataset were not reported in their literature. TRE, target registration error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; init., initial result; WR, Welsch regularization; SWR, surrogate Welsch regularization; WN, WelschNGF; Avg, average value; Std, standard deviation.

Table 2

TRE results (in mm) of the DIR-Lab 4DCT (40) and COPD datasets (41) with simulated metal artifacts and noise

Case GRN (22) pTVReg (27) He (18) NoWelsch WR SWR WN + SWR Ours
N4
   01 2.41 2.06 1.45 1.45 1.45 1.42 1.40 1.40
   02 2.05 1.98 1.43 1.47 1.52 1.40 1.40 1.40
   03 3.65 2.96 1.69 1.66 1.65 1.62 1.62 1.60
   04 5.76 1.87 1.81 1.88 1.83 1.79 1.74 1.78
   05 4.33 4.67 2.11 2.08 1.99 1.98 1.98 1.89
   06 9.01 2.19 2.37 2.28 2.05 2.04 2.01 1.80
   07 8.62 3.90 2.36 2.31 2.16 2.13 2.08 2.02
   08 11.10 4.21 2.33 2.40 2.30 2.21 2.20 2.26
   09 6.30 1.89 2.28 2.20 1.85 1.97 2.02 1.74
   10 5.51 3.98 2.21 2.24 2.00 2.02 1.99 1.89
   Avg 5.87 2.97 2.00 2.00 1.88 1.86 1.84 1.78
   Std 2.96 1.11 0.38 0.36 0.28 0.29 0.29 0.27
NC
   01 18.50 5.70 2.49 2.25 2.04 2.15 2.05 1.97
   02 14.43 3.51 4.89 4.37 3.99 3.99 3.30 3.05
   03 7.92 2.28 2.22 2.16 1.99 2.00 1.99 1.98
   04 17.45 3.18 3.63 2.59 2.51 2.49 2.38 2.37
   05 16.60 2.61 2.81 2.67 2.80 2.57 2.38 2.37
   06 19.53 3.94 2.77 2.51 2.51 2.45 2.24 2.24
   07 14.46 2.06 2.08 2.19 2.01 2.07 1.81 1.89
   08 15.87 2.81 3.51 3.48 2.67 3.10 2.28 2.26
   09 7.94 3.63 3.41 3.22 3.15 3.13 2.96 2.88
   10 13.93 4.47 5.27 5.87 4.28 3.33 2.78 2.75
   Avg 14.66 3.42 3.31 3.13 2.80 2.73 2.42 2.38
   Std 3.98 1.10 1.08 1.19 0.80 0.64 0.46 0.40

Metrics of the comparison methods were obtained from local experiments. The initial landmark distance is the same as in Table 1. , the best; , the second best. TRE, target registration error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; WR, Welsch regularization; SWR, surrogate Welsch regularization; WN, WelschNGF; N4, noisy 4DCT; Avg, average value; Std, standard deviation; NC, noisy COPD.

Table 2 presents the TRE on the noisy dataset. The results demonstrate that our method achieved the best average TRE and standard deviation on both the noisy 4DCT and noisy COPD datasets, validating the robustness of our approach. Among the optimization-based methods (our method, He, and pTVReg), He employs similar anatomical-aware dissimilarity metrics and regularization as in this study, whereas pTVReg adopts a global strategy. The results indicate that He outperformed pTVReg, highlighting the positive effect of anatomical awareness on robustness. Furthermore, our method outperformed He, confirming the effectiveness of the Welsch’s function. Although GRN achieved good results on the conventional COPD dataset, it exhibited larger errors on the noisy dataset. To analyze the reasons, we plotted the loss curve during training for GRN, as shown in Figure 9. It can be observed that the curve fluctuated significantly and continuously, possibly due to the failure of feature point extraction in the presence of excessive noise and the absence of a Gaussian pyramid strategy to prevent registration from getting trapped in undesired local minima. Furthermore, the generalization of DL-based methods still needs improvement, requiring more data, longer training time, and better techniques. In contrast, optimization-based methods exhibited more stable results, likely due to their independence from the dataset and typically higher interpretability and controllability.

Figure 9 Loss curve of GraphRegNet (22) on a noisy dataset: dark section represents smoothed curve, light section represents original oscillating curve.

In ablation studies, we observed that, for each dataset, the average TRE results were as follows: NoWelsch > WR > SWR > WN + SWR > Ours. Analyzing the noisy COPD dataset with significant displacements, after applying Welsch’s function to the regularization term (NoWelsch → WR), the TRE reduced by ~11%. After using surrogate function for WR (WR → SWR), the TRE decreased by ~3%. After applying Welsch’s function to the NGF dissimilarity metric (SWR → WN + SWR), the TRE decreased by ~11%. After using surrogate function for the WN dissimilarity metric (WN + SWR → Ours), the TRE decreased by ~2%. Consistent with our expectations, both Welsch’s function with parameter dynamic updating mechanism and surrogate functions were effective in reducing the TRE in each case. Furthermore, Welsch’s function reduced the TRE more effectively than surrogate functions because Welsch’s function primarily enhances robustness, whereas surrogate functions focus on reducing the optimization complexity. Figure 10 (first and second columns) present the interpolated heatmaps of TRE comparisons between our method (top row) and the NoWelsch method (bottom row) in the noisy COPD case 10. It can be visually observed that our method achieved better TRE in the lower right corner of the axial slice and the upper right corner of the coronal slice.

Figure 10 Interpolated TRE and displacement vector field maps for noisy COPD case 10. The top row represents our method, while the bottom, NoWelsch. (A) Axial lung masks with overlaid TRE maps. (B) Coronal lung masks with overlaid TRE maps. (C) Deformed regular grid in the coronal plane after applying the displacement field. (D) Mapping of the 3D displacement field to the RGB space. (E) Color bar representing the mapping between colors and the TRE. TRE, target registration error; COPD, chronic obstructive pulmonary disease; 3D, three-dimensional.

RMSE

Table 3 presents the RMSE results on both the Traditional and noisy 4DCT and COPD datasets. On average, our method achieved the best performance on the traditional 4DCT dataset and the second-best performance on the traditional COPD dataset, while obtaining the best results on both the noisy 4DCT and noisy COPD datasets. This indicates that our method performs comparably to state-of-the-art methods in terms of RMSE when noise is absent, and demonstrates excellent robustness in the presence of noise. This aligns with the TRE results, which evaluate the accuracy of lung registration, suggesting that our method effectively models the image characteristics and motion patterns of the entire image. Furthermore, consistent with the TRE results, the optimization-based methods (pTVReg and He) exhibited higher stability across different noise levels compared to the DL-based method (GRN).

Table 3

RMSE results of the DIR-Lab 4DCT (40) and COPD datasets (41) with and without simulated metal artifacts and noise

Case Without noise With noise
Init. GRN (22) pTVReg (27) He (18) Ours Init. GRN (22) pTVReg (27) He (18) Ours
4DCT
   01 48.61 15.87 19.24 18.36 17.82 108.31 61.23 56.72 52.99 52.85
   02 55.25 19.87 25.22 22.34 22.48 120.67 64.52 63.56 58.72 59.83
   03 59.32 25.85 25.78 21.15 21.03 115.95 78.37 73.80 64.15 62.32
   04 55.81 30.85 28.39 27.65 28.14 119.18 85.52 71.26 69.01 68.12
   05 51.56 27.98 24.36 24.47 24.24 108.91 74.25 72.12 63.22 61.25
   06 43.44 30.65 27.92 26.78 25.98 94.30 84.94 69.58 71.55 67.65
   07 41.61 29.89 28.47 26.55 26.58 93.96 85.59 77.00 72.30 65.50
   08 48.48 29.88 23.69 21.89 22.04 94.91 89.03 78.81 69.73 59.90
   09 34.87 21.78 21.25 17.78 17.87 89.94 78.28 55.24 57.63 52.14
   10 36.81 17.89 19.69 14.83 14.37 91.61 73.58 64.29 54.68 49.23
   Avg 47.58 25.05 24.40 22.18 22.06 103.77 77.53 68.24 63.40 59.88
   Std 8.25 5.72 3.44 4.28 4.39 12.12 9.30 8.06 7.13 6.59
COPD
   01 100.43 21.44 24.78 20.89 21.15 139.24 103.65 79.80 71.08 69.72
   02 99.64 29.37 39.23 36.57 34.25 152.92 124.54 81.14 87.59 81.58
   03 77.93 18.78 24.05 19.05 19.19 121.05 98.06 54.16 53.98 51.58
   04 107.41 25.12 31.65 28.32 26.87 142.51 120.83 71.82 73.54 62.13
   05 98.83 23.87 32.98 25.98 27.65 146.85 118.36 69.53 69.85 67.70
   06 98.04 20.78 26.85 26.11 22.54 132.67 104.32 83.15 78.63 67.62
   07 91.97 19.89 19.93 20.33 19.24 129.02 94.71 65.58 65.11 62.33
   08 104.34 23.87 35.51 39.78 25.78 144.22 117.73 81.29 84.35 71.56
   09 81.36 19.23 20.54 21.08 27.87 127.18 93.58 75.86 72.39 69.21
   10 97.71 25.37 34.15 33.12 28.78 145.42 120.02 77.59 81.87 70.02
   Avg 95.77 22.77 28.97 27.12 25.33 138.11 109.58 73.99 73.84 67.35
   Std 9.45 3.35 6.64 7.27 4.78 10.16 11.90 8.99 9.91 7.73

, the best; , the second best. RMSE, root mean square error; DIR-Lab 4DCT, deformable image registration lab four-dimensional computed tomography; COPD, chronic obstructive pulmonary disease; init., initial result; Avg, average value; Std, standard deviation.

Lower resolution validation

Table 4 presents the TRE results of each method after reducing resolution on the noisy 4DCT dataset. Compared to Table 2, there is no significant decrease in accuracy for any of the methods. This might be attributed to GRN utilizing sparse keypoint control for global displacement fields, whereas pTVReg, He, and our method employ parameterized control lattice for global displacement fields. Both of these approaches are capable of preventing registration from getting stuck in local minima, effectively conducting a search for displacement fields in a downsampled space. The impact of resolution reduction on accuracy is thus limited. Consequently, our method can also be extended to registration at lower resolutions.

Table 4

TRE results (in mm) obtained by downsampling the 4DCT dataset with noise to a lower resolution (half the width and half the height), and applying a Gaussian filter to the original image

Case Init. Lower resolution Gaussian filter
GRN (22) ptvReg (27) He (18) Ours GRN (22) ptvReg (27) He (18) Ours
4DCT 01 3.89 2.42 2.13 1.46 1.42 2.40 2.04 1.44 1.41
4DCT 02 4.34 2.16 2.01 1.44 1.43 2.04 1.97 1.43 1.44
4DCT 03 6.94 3.64 3.03 1.68 1.62 3.64 2.93 1.65 1.61
4DCT 04 9.83 5.71 1.96 1.85 1.81 5.72 1.87 1.81 1.78
4DCT 05 7.48 4.35 4.73 2.21 1.96 4.30 4.68 2.10 1.90
4DCT 06 10.89 9.02 2.17 2.38 1.92 9.06 2.18 2.38 1.79
4DCT 07 11.03 8.54 3.94 2.43 2.14 8.48 3.88 2.34 2.03
4DCT 08 14.99 11.27 4.25 2.39 2.27 11.21 4.18 2.33 2.28
4DCT 09 7.92 6.16 1.92 2.31 1.79 6.31 1.93 2.27 1.75
4DCT 10 7.30 5.73 3.92 2.22 1.91 5.49 3.96 2.19 1.91
Avg 8.46 5.90 3.01 2.04 1.83 5.86 2.96 1.99 1.79
Std 3.33 2.97 1.10 0.39 0.28 2.98 1.10 0.38 0.26

, the best. TRE, target registration error; 4DCT, four-dimensional computed tomography; init., initial result; Avg, average value; Std, standard deviation.

Comparison with registration after denoising

Table 4 also presents the TRE results of different methods on the noisy 4DCT dataset after denoising with Gaussian filtering before registration. Similar to lower resolution validation, there is no significant improvement in results compared to Table 2. This lack of improvement could be attributed to Gaussian filtering, which, while filtering out high-frequency noise, also removes many image details and cannot eliminate high-intensity non-high-frequency metal artifacts. This leads to denoising effects similar to resolution reduction, as visually shown in Figure 7. Therefore, even when the comparison methods denoise with Gaussian filtering before registration, their accuracy falls short of our method that addresses outliers directly during the optimization process. Another approach for the comparison methods could involve using DL-based CT artifact removal on the noisy images before registration, but this goes beyond the scope of this study.

Displacement field

The third and fourth columns in Figure 10 illustrate the displacement field grids and RGB images of our method (top row) and NoWelsch (bottom row). It can be observed that the displacement field grid of our method is more reasonable compared to NoWelsch. For instance, NoWelsch exhibited several distortions in the left lung, whereas our method exhibited smoother results. In the corresponding RGB image, NoWelsch showed more localized colors. Additionally, the heatmaps in the second column indicates that our method achieved better TRE in the left lung. Figure 11 illustrates the displacement field of our method (Figure 11A,11C) and NoWelsch (Figure 11B,11D) using arrows. In the first comparison group, the dashed boxes in Figure 11A,11B show that our method achieved smoother transitions when the displacement field changed from right to left compared to NoWelsch. Conversely, in the second comparison group, the dashed boxes in Figure 11C,11D demonstrate that our method achieved sharper transitions when the displacement field changes from bottom to top compared to NoWelsch, which adopts a larger height range for the transition. These two comparison groups correspond to the first and second columns of Figure 10, respectively, and our method consistently achieved better TRE. This was achieved by regularizing the displacement field using the Welsch’s function and gradually reducing the Welsch parameter νr at each level to allow more differences in adjacent displacement vectors. Furthermore, the abnormal displacements in NoWelsch may be related to noise. Due to the excessive penalty of the squared l2-norm, NoWelsch may produce unrealistic displacement vectors; especially when a normal point corresponds to a correct noise point, the large penalty forces the normal point to move to a wrong target point. In contrast, our approach applies the Welsch’s function to the distance in NGF space and dynamically adjusts the Welsch parameter νd to gradually reduce sensitivity to noise.

Figure 11 Displacement vector field maps for case 10 of the noisy COPD dataset, with each displacement vector represented by an arrow. Blue represents the moving image, while red represents the fixed image. (A,B) show the comparison in the coronal plane. The differences between the two methods are more pronounced within the dashed box. (C,D) depict a comparison of the axial displacement vector fields between our method and the NoWelsch method. In particular, (A) demonstrates smoother transitions compared to (B), whereas (C) exhibits sharper transitions compared to (D), both aiming to achieve a more optimal TRE. COPD, chronic obstructive pulmonary disease; TRE, target registration error.

Loss curve

Figure 12 illustrates the loss curves in the ablation studies to investigate the role of surrogate functions in minimizing the objective function. The optimization process was repeated five times for the challenging noisy COPD case 10. To facilitate comparison, the average loss curves of four methods with the same initial point in the first level are shown, and the loss range is transformed into the TRE interval. In Figure 12B, SWR exhibited higher loss reduction and lower standard deviation than WR, confirming the positive effect of employing surrogate functions for WR. In Figure 12C, although Ours had a slightly higher standard deviation than WN + SWR, it showed a significantly faster descent rate, indicating that using surrogate functions for WN can enhance optimization efficiency, possibly due to the increased local convexity of the objective function. Therefore, consistent with the expected results, employing surrogate functions for both WR and WN dissimilarity metric exerts a positive impact on optimization.

Figure 12 Comparison of loss curves for four different approaches (WR, SWR, WN + SWR, and ours) defined in ablation studies. (A) Comparison of four approaches together; (B) comparison of WR and SWR; (C) comparison of WN + SWR and ours. Light shades represent the standard deviation of the TRE. WR, Welsch regularization; SWR, surrogate Welsch regularization; WN, WelschNGF; TRE, target registration error.

Performance analysis

Table 5 presents a comparative analysis of the performance of the replicated three methods and our method using the same dataset (noisy 4DCT) and computational configuration. Although our method exhibits higher graphics processing unit (GPU) resource consumption compared to the other methods, they all can run on a modest 16 GB GPU, ensuring scalability. Additionally, although DL-based methods demonstrate significant advantages in registration speed, they entail long training times and limited generalizability due to potential issues with dataset distribution.

Table 5

Performance comparison of the locally replicated three methods and our method using the same dataset and computational setup

Performance metrics DL-based Optimization-based
GRN (22) ptvReg (27) He (18) Ours
Computational configuration NVIDIA RTX 3090
(24 GB)
NVIDIA RTX 3090
(24 GB)
NVIDIA RTX 3090
(24 GB)
NVIDIA RTX 3090
(24 GB)
GPU memory consumption 11.25 GB 6.14 GB 10.63 GB 14.92 GB
Training time 7 hours 0 s 0 s 0 s
Registration time 2 s 113 s 61 s 96 s

DL, deep learning; GPU, graphics processing unit.


Discussion

In this work, a robust image-environment-adaptive thoracic CT image registration method is proposed. We innovatively utilized the smooth Welsch’s function to redesign dynamically-updated dissimilarity metric and regularization within a hierarchical anatomical structure-aware registration framework. Compared to state-of-the-art methods, our approach achieved comparable performance in low-noise environments and outperformed them in high-noise environments, demonstrating the robustness of our method. Additionally, ablation studies were conducted to individually verify the effectiveness of the Welsch’s function, dynamic update strategy, and MM optimization method.

Although our method performed slightly worse than the DL model GRN, which is specifically designed for large-scale deformation optimization, on the conventional COPD dataset with significant deformations, our method still achieved second place. This indicates that our method, which focuses on optimizing registration robustness, is capable of handling large deformations. Furthermore, compared to GRN, our method outperformed in terms of metrics on other conventional/noisy datasets, demonstrating its stronger generalization ability.

Our method relies on segmentation. Thanks to the advancements in DL, it is possible to quickly segment the masks of lung and bone from conventional or noisy CT images. In this study, the segmentation of a single case could be accomplished within a mere 19 seconds. Similar to other segmentation-based registration methods (18,22), one limitation of our approach is that the registration accuracy is influenced by the segmentation precision. However, this limitation diminishes as the accuracy (51-53) and robustness (54,55) of medical image segmentation improve over time.

Although our method utilizes the NCC dissimilarity metric, which may vary in performance across different datasets (56), the divide-and-conquer strategy mitigates this issue by applying NCC only to suitable regions, thereby improving robustness.

Although our method belongs to optimization-based approaches, the registration of a single case takes an average of only about 1.5 minutes and does not require dataset-specific support, making it highly applicable. Despite the fact that our framework exhibits a certain level of complexity, it is important to note that all its steps can be automated. This allows for user-friendly clinical application, where the algorithm can be employed for end-to-end registration without requiring users to be concerned about the intricate implementation details.

This work can be considered as an improvement upon volumetric image registration methods, focusing specifically on the thoracic CT image registration by incorporating anatomical structure-aware techniques and optimizing parameter update strategies. Hence, we believe that the designed dissimilarity metrics and displacement field regularization based on the Welsch’s function can be easily transferred to other types of pixel/voxel image registration, as well as registration of point clouds or meshes, such as magnetic resonance imaging (MRI), ultrasound, camera-captured images, and in-silico medical phantoms (57-59), whether they are optimization-based or DL-based.

The proposed method is based on optimization, in contrast to popular DL-based registration methods. Consistent with the well-known “No Free Lunch Theorem” (60) in the field of optimization, although DL-based methods can reduce registration time to seconds, they require training for tens of hours and suffer from limited generalizability due to dataset constraints. Therefore, caution should be exercised when applying these methods to images with new features, especially when considering safety in clinical applications. Meanwhile, optimization-based methods, although relatively slower, offer higher interpretability and stronger generalization. In practical applications, the choice of method should be based on the clinical scenario. For example, when registering a single case for disease progression analysis or CT reconstruction, an optimization-based method can be selected. In contrast, for real-time continuous registration tasks such as tumor tracking, a DL-based method would be more suitable.

Despite the favorable outcomes obtained with our method, there are still some limitations. First, appropriate initialization of the Welsch parameters is still required to avoid convergence to undesired local minima. Second, the overall objective function is still non-convex. Furthermore, further developments are needed to handle inter-session (i.e., the CT-CBCT registration problem) and inter-patient registrations, such as initial image alignment, variations in field of view/resolution, and changes in anatomy that cannot be represented by deformation. Future plans include introducing image quality assessment and pre-registration for optimal parameter estimation, as well as exploring alternative dissimilarity metrics for constructing a convex quadratic objective function.


Conclusions

In this study, we propose an image-environment-adaptive registration method for thoracic CT images based on a hierarchical anatomical structure-aware framework, utilizing a dynamic Welsch’s function. The ample experimental results validate the robustness of the proposed method.


Acknowledgments

Funding: This work was supported by the National Natural Science Foundation of China (Nos. U1932158, 81871085, and 82271519), the Natural Science Foundation of Shandong Province (No. ZR2019LZL018), the Anhui Province Funds for Distinguished Young Scientists (No. 2208085J10), the Collaborative Innovation Program of Hefei Science Center, CAS (No. 2019HSC-CIP003), the China Postdoctoral Science Foundation (No. 2019M652403), and the Project of Postdoctoral Innovation of Shandong Province (No. 202002048).


Footnote

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-596/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Gorbunova V, Lol P, Ashraf H, Dirksen A, Nielsen M, de Bruijne M. Weight preserving image registration for monitoring disease progression in lung CT. Med Image Comput Comput Assist Interv 2008;11:863-70.
  2. Chassagnon G, Vakalopoulou M, Régent A, Sahasrabudhe M, Marini R, Hoang-Thi TN, Dinh-Xuan AT, Dunogué B, Mouthon L, Paragios N, Revel MP. Elastic Registration-driven Deep Learning for Longitudinal Assessment of Systemic Sclerosis Interstitial Lung Disease at CT. Radiology 2021;298:189-98. [Crossref] [PubMed]
  3. Wang T, He T, Zhang Z, Chen Q, Zhang L, Xia G, Yang L, Wang H, Wong STC, Li H. A personalized image-guided intervention system for peripheral lung cancer on patient-specific respiratory motion model. Int J Comput Assist Radiol Surg 2022;17:1751-64. [Crossref] [PubMed]
  4. Alam F, Rahman SU, Ullah S, Gulati K. Medical image registration in image guided surgery: Issues, challenges and research opportunities. Biocybern Biomed Eng 2018;38:71-89.
  5. Wang B, Wang DQ, Lin MS, Lu SP, Zhang J, Chen L, Li QW, Cheng ZK, Liu FJ, Guo JY, Liu H, Qiu B. Accumulation of the delivered dose based on cone-beam CT and deformable image registration for non-small cell lung cancer treated with hypofractionated radiotherapy. BMC Cancer 2020;20:1112. [Crossref] [PubMed]
  6. Fu Y, Lei Y, Wang T, Curran WJ, Liu T, Yang X. Deep learning in medical image registration: a review. Phys Med Biol 2020;65:20TR01. [Crossref] [PubMed]
  7. Wang T, Xia G, Li H, Qi C, Wang H. Patient specific respiratory motion model using two static CT images. In: Proceedings of the 2nd International Symposium on Artificial Intelligence for Medicine Sciences. 2021:488-92.
  8. Huang Y, Dong Z, Wu H, Li C, Liu H, Zhang Y. Deep learning-based synthetization of real-time in-treatment 4D images using surface motion and pretreatment images: A proof-of-concept study. Med Phys 2022;49:7016-24. [Crossref] [PubMed]
  9. Xiao H, Chen K, You T, Liu D, Zhang W, Xue X, Long L, Dang J. Real-Time 4-D-Cone Beam CT Accurate Estimation Based on Single-Angle Projection via Dual-Attention Mechanism Residual Network. IEEE Trans Radiat Plasma Med Sci 2023;7:618-629.
  10. You C, Zhao R, Liu F, Dong S, Chinchali S, Topcu U, Staib L, Duncan JS. Class-Aware Adversarial Transformers for Medical Image Segmentation. Adv Neural Inf Process Syst 2022;35:29582-96.
  11. You C, Zhao R, Staib L, Duncan JS. Momentum Contrastive Voxel-wise Representation Learning for Semi-supervised Volumetric Medical Image Segmentation. Med Image Comput Comput Assist Interv 2022;13434:639-52.
  12. You C, Zhou Y, Zhao R, Staib L, Duncan JS. SimCVD: Simple Contrastive Voxel-Wise Representation Distillation for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2022;41:2228-37. [Crossref] [PubMed]
  13. You C, Xiang J, Su K, Zhang X, Dong S, Onofrey J, Staib L, Duncan JS. Incremental Learning Meets Transfer Learning: Application to Multi-site Prostate MRI Segmentation. Distrib Collab Fed Learn Afford AI Healthc Resour Div Glob Health (2022) 2022;13573:3-16. [Crossref] [PubMed]
  14. Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV. VoxelMorph: A Learning Framework for Deformable Medical Image Registration. IEEE Trans Med Imaging 2019; Epub ahead of print. [Crossref]
  15. Chen L, Cao X, Chen L, Gao Y, Shen D, Wang Q, Xue Z. Semantic hierarchy guided registration networks for intra-subject pulmonary CT image alignment. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2020: 23rd International Conference, Lima, Peru, October 4-8, 2020, Proceedings, Part III 23. Springer International Publishing; 2020:181-9.
  16. Chen X, Meng Y, Zhao Y, Williams R, Vallabhaneni SR, Zheng Y. Learning unsupervised parameter-specific affine transformation for medical images registration. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part IV 24. Springer International Publishing; 2021:24-34.
  17. Liu F, Yan K, Harrison AP, Guo D, Lu L, Yuille AL, Huang L, Xie G, Xiao J, Ye X, Jin D. SAME: Deformable image registration based on self-supervised anatomical embeddings. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2021: 24th International Conference, Strasbourg, France, September 27-October 1, 2021, Proceedings, Part IV 24. Springer International Publishing; 2021:87-97.
  18. He Y, Wang A, Li S, Hao A. Hierarchical anatomical structure-aware based thoracic CT images registration. Comput Biol Med 2022;148:105876. [Crossref] [PubMed]
  19. Zhang Y, Yu H. Convolutional Neural Network Based Metal Artifact Reduction in X-Ray Computed Tomography. IEEE Trans Med Imaging 2018;37:1370-81. [Crossref] [PubMed]
  20. Brock KK, Mutic S, McNutt TR, Li H, Kessler ML. Use of image registration and fusion algorithms and techniques in radiotherapy: Report of the AAPM Radiation Therapy Committee Task Group No. 132. Med Phys 2017;44:e43-76. [Crossref] [PubMed]
  21. Hu R, Wang H, Ristaniemi T, Zhu W, Sun X, Lung CT. Image Registration through Landmark-constrained Learning with Convolutional Neural Network. Annu Int Conf IEEE Eng Med Biol Soc 2020;2020:1368-71. [Crossref] [PubMed]
  22. Hansen L, Heinrich MP. GraphRegNet: Deep Graph Regularisation Networks on Sparse Keypoints for Dense Registration of 3D Lung CTs. IEEE Trans Med Imaging 2021;40:2246-57. [Crossref] [PubMed]
  23. Rueckert D, Sonoda LI, Hayes C, Hill DL, Leach MO, Hawkes DJ. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans Med Imaging 1999;18:712-21. [Crossref] [PubMed]
  24. Schmidt-Richberg A, Ehrhardt J, Werner R, Handels H. Fast explicit diffusion for registration with direction-dependent regularization. In: Biomedical Image Registration: 5th International Workshop, WBIR 2012, Nashville, TN, USA, July 7-8, 2012. Proceedings 5. Springer Berlin Heidelberg; 2012:220-8.
  25. Delmon V, Rit S, Pinho R, Sarrut D. Registration of sliding objects using direction dependent B-splines decomposition. Phys Med Biol 2013;58:1303-14. [Crossref] [PubMed]
  26. Pock T, Urschler M, Zach C, Beichel R, Bischof H. A duality based algorithm for TV-L 1-optical-flow image registration. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Heidelberg: Springer Berlin Heidelberg; 2007:511-8.
  27. Vishnevskiy V, Gass T, Szekely G, Tanner C, Goksel O. Isotropic Total Variation Regularization of Displacements in Parametric Image Registration. IEEE Trans Med Imaging 2017;36:385-95. [Crossref] [PubMed]
  28. Gong L, Duan L, Dai Y, He Q, Zuo S, Fu T, Yang X, Zheng J. Locally Adaptive Total p-Variation Regularization for Non-Rigid Image Registration With Sliding Motion. IEEE Trans Biomed Eng 2020;67:2560-71. [Crossref] [PubMed]
  29. Wu Z, Rietzel E, Boldea V, Sarrut D, Sharp GC. Evaluation of deformable registration of patient lung 4DCT with subanatomical region segmentations. Med Phys 2008;35:775-81. [Crossref] [PubMed]
  30. Rietzel E, Chen GT. Deformable registration of 4D computed tomography data. Med Phys 2006;33:4423-30. [Crossref] [PubMed]
  31. Holland PW, Welsch RE. Robust regression using iteratively reweighted least-squares. Commun Stat-Theory Methods 1977;6:813-27.
  32. Zhang J, Yao Y, Deng B. Fast and Robust Iterative Closest Point. IEEE Trans Pattern Anal Mach Intell 2022;44:3450-66. [Crossref] [PubMed]
  33. Lange K. MM optimization algorithms. Philadelphia: Society for Industrial and Applied Mathematics; 2016.
  34. You C, Dai W, Min Y, Staib L, Duncan JS. Bootstrapping Semi-supervised Medical Image Segmentation with Anatomical-Aware Contrastive Distillation. Inf Process Med Imaging 2023;13939:641-53. [Crossref] [PubMed]
  35. You C, Dai W, Min Y, Staib L, Sekhon J, Duncan JS. ACTION++: Improving Semi-supervised Medical Image Segmentation with Adaptive Anatomical Contrast. Med Image Comput Comput Assist Interv 2023;14223:194-205.
  36. Ham B, Cho M, Ponce J. Robust image filtering using joint static and dynamic guidance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015:4823-31.
  37. Reddi SJ, Kale S, Kumar S. On the convergence of adam and beyond. arXiv:1904.09237. 2019. Available online: https://arxiv.org/abs/1904.09237
  38. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer International Publishing. 2015:234-41.
  39. Hofmanninger J, Prayer F, Pan J, Röhrich S, Prosch H, Langs G. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. Eur Radiol Exp 2020;4:50. [Crossref] [PubMed]
  40. Castillo R, Castillo E, Guerra R, Johnson VE, McPhail T, Garg AK, Guerrero T. A framework for evaluation of deformable image registration spatial accuracy using large landmark point sets. Phys Med Biol 2009;54:1849-70. [Crossref] [PubMed]
  41. Castillo R, Castillo E, Fuentes D, Ahmad M, Wood AM, Ludwig MS, Guerrero T. A reference dataset for deformable image registration spatial accuracy evaluation using the COPDgene study archive. Phys Med Biol 2013;58:2861-77. [Crossref] [PubMed]
  42. Thanh D, Surya P. A review on CT and X-ray images denoising methods. Informatica 2019;43:151-9.
  43. Sandkühler R, Jud C, Andermatt S, Cattin PC. AirLab: autograd image registration laboratory. arXiv:1806.09907. 2018. Available online: https://arxiv.org/abs/1806.09907
  44. Jiang Z, Yin FF, Ge Y, Ren L. A multi-scale framework with unsupervised joint training of convolutional neural networks for pulmonary deformable image registration. Phys Med Biol 2020;65:015011. [Crossref] [PubMed]
  45. Fechter T, Baltas D. One-Shot Learning for Deformable Medical Image Registration and Periodic Motion Tracking. IEEE Trans Med Imaging 2020;39:2506-17. [Crossref] [PubMed]
  46. Fu Y, Lei Y, Wang T, Higgins K, Bradley JD, Curran WJ, Liu T, Yang X. LungRegNet: An unsupervised deformable image registration method for 4D-CT lung. Med Phys 2020;47:1763-74. [Crossref] [PubMed]
  47. Hering A, Häger S, Moltz J, Lessmann N, Heldmann S, van Ginneken B. CNN-based lung CT registration with multiple anatomical constraints. Med Image Anal 2021;72:102139. [Crossref] [PubMed]
  48. Mok TCW, Chung ACS. Large deformation diffeomorphic image registration with laplacian pyramid networks. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2020: 23rd International Conference, Lima, Peru, October 4-8, 2020, Proceedings, Part III 23. Springer International Publishing. 2020:211-21.
  49. Liu X, Qi CR, Guibas LJ. Flownet3d: Learning scene flow in 3d point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019:529-37.
  50. Heinrich MP. Closing the gap between deep and conventional image registration using probabilistic dense displacement networks. In: Medical Image Computing and Computer Assisted Intervention-MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13-17, 2019, Proceedings, Part VI 22. Springer International Publishing. 2019:50-8.
  51. You C, Yang J, Chapiro J, Duncan JS. Unsupervised wasserstein distance guided domain adaptation for 3d multi-domain liver segmentation. In: Interpretable and Annotation-Efficient Learning for Medical Image Computing: Third International Workshop, iMIMIC 2020, Second International Workshop, MIL3ID 2020, and 5th International Workshop, LABELS 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4-8, 2020, Proceedings 3. Springer International Publishing. 2020:155-63.
  52. You C, Dai W, Liu F, Min Y, Dvornek NC, Li X, Clifton DA, Staib L, Duncan JS. Mine Your Own Anatomy: Revisiting Medical Image Segmentation With Extremely Limited Labels. IEEE Trans Pattern Anal Mach Intell 2024; Epub ahead of print. [Crossref]
  53. You C, Dai W, Min Y, Staib L, Duncan JS. Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts. Med Image Comput Comput Assist Interv 2023;14222:561-71.
  54. You C, Dai W, Min Y, Liu F, Clifton DA, Zhou SK, Staib L, Duncan JS. Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective. Adv Neural Inf Process Syst 2023;36:9984-10021.
  55. You C, Mint Y, Dai W, Sekhon JS, Staib L, Duncan JS. Calibrating multi-modal representations: A pursuit of group robustness without annotations. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. 2024:26140-50.
  56. Kaso A. Computation of the normalized cross-correlation by fast Fourier transform. PLoS One 2018;13:e0203434. [Crossref] [PubMed]
  57. Wissmann L, Santelli C, Segars WP, Kozerke S. MRXCAT: Realistic numerical phantoms for cardiovascular magnetic resonance. J Cardiovasc Magn Reson 2014;16:63. [Crossref] [PubMed]
  58. Neelakantan S, Xin Y, Gaver DP, Cereda M, Rizi R, Smith BJ, Avazmohammadi R. Computational lung modelling in respiratory medicine. J R Soc Interface 2022;19:20220062. [Crossref] [PubMed]
  59. Neelakantan S, Mukherjee T, Smith BJ, Myers K, Rizi R, Avazmohammadi R. In-silico CT lung phantom generated from finite-element mesh. Proc SPIE Int Soc Opt Eng 2024;12928:1292829.
  60. Adam SP, Alexandropoulos SAN, Pardalos PM, Vrahatis MN. No free lunch theorem: A review. In: Demetriou IC, Pardalos PM. editors. Approximation and Optimization: Algorithms, Complexity and Applications. Cham: Springer; 2019:57-82.
Cite this article as: Wei Z, Wu X, Xing L, Yu J, Qian J. Robust thoracic CT image registration with environmental adaptability using dynamic Welsch’s function and hierarchical structure-awareness strategy. Quant Imaging Med Surg 2024;14(12):8999-9020. doi: 10.21037/qims-24-596

Download Citation