Spatial-temporal and physical constrained deep learning model for simultaneous T1 and T2 reconstruction and mapping (STEP)

Runyu Yang; Haozhong Sun; Xiaoqi Lin; Haokun Li; Huijun Chen

doi:10.21037/qims-2025-1563

Original Article


Center for Biomedical Imaging Research, Department of Biomedical Engineering, Tsinghua University, Beijing, China

Contributions: (I) Conception and design: R Yang; (II) Administrative support: None; (III) Provision of study materials or patients: H Li; (IV) Collection and assembly of data: H Sun; (V) Data analysis and interpretation: R Yang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Huijun Chen, PhD. Center for Biomedical Imaging Research, Department of Biomedical Engineering, Tsinghua University, Shuangqing Road No. 30, Haidian District, Beijing 100084, China. Email: chenhj_cbir@mail.tsinghua.edu.cn.

Background: Magnetic resonance (MR) parametric maps provide quantitative tissue characteristics that are valuable for medical diagnosis. However, existing T1 and T2 measurement techniques face challenges such as prolonged reconstruction/fitting times and the requirement for multi-sequence image registration, limiting the clinical applicability of quantitative parameter mapping. The development of compressed sensing combined with parallel imaging technology has improved reconstruction efficiency. However, the traditional two-step workflow lacks spatial constraints for parameter mapping and suffers from slow pixel-level fitting. Furthermore, deep learning methods that have been applied to accelerate reconstruction and remove noise depend heavily on training datasets and neglect the inherent low-rank and sparse data constraints. To address these challenges, this study proposes the spatial-temporal and physical constrained deep learning model for simultaneous T1 and T2 reconstruction and mapping (STEP) method.

Methods: The method utilizes physical models for backpropagation of deep learning features, allowing physical priors to explicitly express the spatiotemporal correlations of images. This integration enables mutual enhancement between low-rank/sparse constraints and deep learning priors, thereby improving both constraint-based reconstruction and deep learning performances.

Results: Experimental results from simulated brain data, real phantoms, and healthy volunteers demonstrate that the proposed method produces more accurate T1 and T2 maps than those obtained using conjugate gradient (CG), low-rank plus sparse (LS) matrix factorization least squares fitting, or conventional deep learning-based mapping methods. The method uses only 2–4% (spokes of 1,000 to 2,000) of full k-space data to simultaneously generate quantitative T1 and T2 maps, yet still achieves a structural similarity (SSIM) greater than 0.7 and a normalized root mean square error (nRMSE) lower than 0.1. The proposed method also yielded excellent Pearson correlation coefficients of R²=0.99 for T1 and R²=0.94 for T2. The entire process, from reconstructing three-dimensional isotropic weighted images with high spatiotemporal resolution to fitting approximately 152 corresponding T1 and T2 images, is accelerated by about 150 times compared with traditional methods.

Conclusions: The proposed method extracts feature information through low-rank sparse priors and optimizes backpropagation via physical models, integrating the synergistic advantages of deep learning and low-rank sparse iterative processing to achieve concurrent T1 and T2 quantification.

Keywords: Deep learning; magnetic resonance parametric maps (MR parametric maps); low rank; sparsity; physical model

Submitted Jul 16, 2025. Accepted for publication Mar 26, 2026. Published online May 27, 2026.


Introduction

Magnetic resonance (MR) parametric maps provide quantitative tissue characteristics that are valuable for medical diagnosis. Recently, there has been growing interest in T1/T2 mapping technology research (1-10). Iron uptake in Parkinson’s disease (2), multiple system atrophy, and white matter and gray matter abnormalities (3) can be reflected by T1 mapping, while T2 mapping is essential for tumor detection (4,5), stroke prediction (6,7), myocardial function assessment (8), and neurodegenerative disease diagnosis (9,10). However, existing T1 and T2 measurement techniques face several challenges, including prolonged reconstruction and fitting times (1,11) and the requirement for multi-sequence image registration, which limit the clinical applicability of quantitative MR parameter mapping. Complex scanning protocols are susceptible to motion artifacts, image misalignment, and patient discomfort (11). Previous studies (11-13) have attempted to acquire multiple contrast images in a single scan to address registration issues across different scanning sequences and reduce scanning time. However, these approaches are still limited by a narrow field of view, low inter-slice resolution (12) and computationally intensive data processing (13-17).

In recent years, the development of compressed sensing combined with parallel imaging technology has greatly improved the reconstruction efficiency. The weighted image to be reconstructed is formulated as a Casorati matrix, with the rows corresponding to temporal-domain information and the columns corresponding to spatial-domain information. Due to the temporal and spatial correlations between different TI images, the matrix possesses both low-rank (18,19) and sparse (20,21) properties. On the one hand, the images are represented as the product of temporal and spatial information through a partially separable model (22,23), using the conjugate gradient (CG) algorithm (24) for weighted image reconstruction. On the other hand, these images can be expressed as the sum of low-rank and sparse components using the low-rank plus sparse (LS) method (25). The low-rank components and sparse components are then iteratively solved and reconstructed using singular value thresholding (SVT) (26) and soft-thresholding (ST) operators (27), respectively. However, these methods involve a two-step workflow: first, regularization constraints are applied to reconstruct the weighted images, followed by pixel-by-pixel parameter fitting.

The limitations of the two-step workflow include its reliance on time-consuming pixel-level fitting during parameter fitting and its inability to utilize spatial constraints from weighted images. Deep learning (28,29) methods have been widely applied in MR reconstruction and parameter mapping. Specifically, some end-to-end neural network models have been proposed for parameter mapping. Pei et al. implemented a U-net with double residual connections for three-dimensional (3D) T1 mapping (30), while Shao et al. employed a deep convolutional neural network (CNN) to learn the signal changes of the myocardial acquisition sequence to obtain quantitative values for myocardial T1 and T2 (31). Nevertheless, these methods exhibit heavy dependence on training datasets and fail to incorporate the inherent low-rank and sparse data constraints. Therefore, specialized model-based deep learning methods have been proposed for reconstruction and mapping. Jun et al. cascaded a CNN-based initial parameter mapping network estimated from undersampled k-space data, with a CNN-based MR parameter mapping reconstruction network to remove artifacts for rapid reconstruction (32). However, the cascade network construction is complex, with interactions between the networks that result in overly smooth quantitative maps. Meng et al. applied a deep learning prior with low-rank and sparse modeling for T2 mapping (33). Although this method discovers subject-dependent novel features and produces accurate T2 maps, its pixel-wise T2 fitting significantly prolongs computation time. The model-based deep learning methods mentioned above generally struggle to effectively leverage spatiotemporal information in weighted images as constraints during quantitative estimation. Consequently, the quantitative results still leave measurable room for improvement.

In response to these challenges, the spatial-temporal and physical constrained deep learning model for simultaneous T1 and T2 reconstruction and mapping (STEP) is proposed in this study. Unlike traditional constrained reconstruction methods, STEP leverages deep neural networks to achieve T1 and T2 mapping with effective denoising and artifact reduction. In contrast to end-to-end deep learning frameworks, the proposed method utilizes physical models for backpropagation of deep learning features, allowing physical priors to explicitly express the spatiotemporal correlations of images. This integration enables mutual compensation between low-rank/sparse constraints and deep learning priors, enhancing both constraint-based reconstruction and deep learning performance. Experimental results from simulated brain data, a real phantom, and healthy volunteers demonstrate that the STEP method generates more accurate T1 and T2 maps than those obtained using CG (22,23) and LS least squares (25) T1 and T2 fitting methods, as well as conventional deep learning-based T1 and T2 mapping methods.

Methods

The SIMPLE sequence for T1 and T2 mapping

Considering that the data acquisition was not the focus of this study, the simultaneous T1 and T2 mapping of the carotid plaque (SIMPLE) sequence (13) is briefly described here. Details of SIMPLE can be found in the previous paper. Adiabatic T2 preparation combined with inversion recovery (T2IR) pulses generates T1 and T2 contrast mechanisms. To modulate T2 weighting, the T2 preparation duration (TEprep) is systematically adjusted across fast gradient-echo excitations. Following a brief interval (Tgap), a radiofrequency pulse train employing low flip angles (θ) is executed as in a standard spoiled gradient-echo (SPGR) sequence. 3D radial encoding is employed for 3D isotropic data acquisition. This approach ensures uniform angular distribution of k-space spokes, enabling robust image reconstruction at specific inversion times (TI) and TEprep intervals. The longitudinal magnetization before the kth θ pulse, M_θ(k), is:

$M_{θ}(k) = \begin{cases} M_{0} - (M_{0} + α \cdot M_{beforeT2IR} \cdot E_{2}) E_{gap}, & \text{if } k = 1 \\ M_{θ}(1) (E_{1} \cos θ)^{k - 1} + M_{0} (1 - E_{1}) \dfrac{1 - (E_{1} \cos θ)^{k - 1}}{1 - E_{1} \cos θ}, & \text{if } k > 1 \end{cases}$ [1]

where $M_{0}$ is the fully relaxed longitudinal magnetization, $M_{beforeT2IR}$ is the longitudinal magnetization before T2IR, $E_{1} = \exp(-TR / T_{1})$, $E_{2} = \exp(-TE_{prep} / T_{2})$, $E_{gap} = \exp(-T_{gap} / T_{1})$, TR is the repetition time of the θ pulses and α is the inversion efficiency of the IR pulse. Finally, the longitudinal magnetization before the next T2IR can be defined as follows:

$M_{z}(T_{ex}) = M_{θ}(N + 1) E_{e} + M_{0} (1 - E_{e})$ [2]

where E_e is defined as $E_{e} = exp (- T_{e x} / T_{1})$ . Ignoring T2* decay, the longitudinal magnetization can be assumed to be the measured signal X. Then, based on Eq. [1], signals acquired by each spoke can be determined (13). Through iterative applications of varied TEprep durations, the system progressively converges to steady-state magnetization equilibrium.
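As an illustration only, Eqs. [1] and [2] can be evaluated numerically as in the sketch below. The function names are hypothetical, and the default values in the usage note (TR, flip angle, T1, T2, TEprep) are borrowed from elsewhere in this article, while `t_gap`, `alpha` and `m_before` are placeholder assumptions rather than the SIMPLE protocol values.

```python
import math


def m_theta(k, t1, t2, m0, m_before, theta, alpha, tr, te_prep, t_gap):
    """Longitudinal magnetization before the k-th theta pulse, Eq. [1].

    Times are in ms and theta is in radians. All argument names are
    illustrative; they are not taken from the authors' code.
    """
    e1 = math.exp(-tr / t1)
    e2 = math.exp(-te_prep / t2)
    e_gap = math.exp(-t_gap / t1)
    # First branch of Eq. [1]: state right after T2IR preparation and Tgap.
    m_first = m0 - (m0 + alpha * m_before * e2) * e_gap
    if k == 1:
        return m_first
    # Second branch of Eq. [1]: geometric approach to the SPGR steady state.
    q = e1 * math.cos(theta)
    return m_first * q ** (k - 1) + m0 * (1 - e1) * (1 - q ** (k - 1)) / (1 - q)


def m_z_after_recovery(m_theta_last, t1, m0, t_ex):
    """Longitudinal magnetization before the next T2IR, Eq. [2]."""
    e_e = math.exp(-t_ex / t1)
    return m_theta_last * e_e + m0 * (1 - e_e)
```

For example, `m_theta(k, t1=1400, t2=80, m0=1.0, m_before=1.0, theta=math.radians(8), alpha=0.95, tr=10, te_prep=25, t_gap=10)` traces the recovery curve for a white-matter-like voxel as k increases; for large k the value approaches the SPGR steady state $M_{0}(1 - E_{1})/(1 - E_{1}\cos θ)$, which follows directly from the second branch of Eq. [1].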

The proposed STEP method

The signal $S (r, TI)$ on the T2IR curve to be reconstructed is written in the following Casorati matrix form:

$S = \begin{bmatrix} S(r_{1}, TI_{1}) & \cdots & S(r_{1}, TI_{M}) \\ \vdots & \ddots & \vdots \\ S(r_{N_{x}}, TI_{1}) & \cdots & S(r_{N_{x}}, TI_{M}) \end{bmatrix}$ [3]

where r represents the spatial position, $N_{x}$ is the number of all pixels in an image frame, and M is the total number of reconstructed frames. The adjacent radial spokes within the time window (TW) (34) were combined to reconstruct weighted images X from $S(r, TI)$ at the corresponding TI to increase the signal-to-noise ratio (SNR), where $X \in ℂ^{N_{x} \times N_{y} \times M \times N_{TI}}$, $ℂ$ denotes the set of complex numbers, and $N_{x}$, $N_{y}$, $M$ and $N_{TI}$ denote the width, height, total number of slices and number of different TIs, respectively. Each TW combined all spokes with spoke numbers from n to n + TW − 1 in all shots with the same TEprep, where TW is the temporal window width and n runs from 1 to N − TW + 1 in steps of TW.
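The time-window grouping can be sketched as follows, under the reading that the windows are non-overlapping, start at n = 1, TW + 1, 2TW + 1, ... and are each TW spokes wide. The function name is hypothetical, and the choice N = 175 (the fast gradient-echo factor of the simulation) with TW = 25 (the Nlines value quoted for SIMPLE) is an assumption made for illustration.

```python
def time_windows(n_spokes, tw):
    """Return (first, last) spoke numbers, 1-based and inclusive, for each window.

    Window starts run from 1 to n_spokes - tw + 1 in steps of tw, so the
    windows do not overlap and trailing spokes that do not fill a whole
    window are dropped.
    """
    return [(n, n + tw - 1) for n in range(1, n_spokes - tw + 2, tw)]


if __name__ == "__main__":
    for first, last in time_windows(175, 25):
        print(f"spokes {first:3d}-{last:3d}")
```

Under these assumed values the excitation train splits into seven windows; combined with the three TEprep settings of the simulation this would give 21 weighted frames, which agrees with the last dimension of the simulated weighted images (152×152×152×21), although the article does not state this correspondence explicitly.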

The regularization form of the SIMPLE weighted images can be presented as:

$\underset{X}{\arg\min} \; \| E(X) - d \|_{2}^{2} + λ C(X)$ [4]

where d denotes the stacked (k, t)-space acquired data; E denotes the magnetic resonance imaging (MRI) encoding operator modeling the under-sampling mask, fast Fourier transform (FFT) and coil sensitivity, C(X) denotes the regularization constraint applied to X; λ denotes the regularization factor. The X is typically assumed to have a low-rank structure due to the strong temporal and spatial correlations among different TI weighted images. Therefore, Eq. [4] can be formulated as an optimization problem with low-rank constraints. A common regularization method (LS method) is to express Eq. [4] in an additive form (13):

$\underset{L, S}{\arg\min} \; \| E(X) - d \|_{2}^{2} + λ_{L} \| L \|_{*} + λ_{S} \| T S \|_{1} \quad \text{s.t.} \quad X = L + S$ [5]

where T is the Fourier transform operator, L and S are the low-rank and sparse parts, respectively, and λ_L and λ_S are the corresponding regularization factors. Another commonly used regularization method (CG method) is to express Eq. [4] in a multiplicative form:

$\underset{U, V}{\arg\min} \; \| E(X) - d \|_{2}^{2} + λ \| TV(UV) \|_{1} \quad \text{s.t.} \quad X = UV$ [6]

where TV is the total variation operator, U and V are the temporal basis and spatial basis, respectively, and λ is the regularization factor. Both the LS method and the CG method can utilize low-rank and sparse constraint priors for an iterative solution, so we combined the proposed STEP method with each of them for verification. Specifically, deep learning can be embedded into the E operator in Eqs. [5] and [6] to accelerate the weighted image reconstruction process and improve the accuracy of T1 and T2 map estimations. Eqs. [5] and [6] can then be expressed as [low-rank and sparsity (LRS) block in Figure 1]:

$\underset{L, S}{\arg\min} \; \| E\,G(R(X)) - d \|_{2}^{2} + λ_{L} \| L \|_{*} + λ_{S} \| T S \|_{1} \quad \text{s.t.} \quad G(R(X)) = L + S$ [7]

$\underset{U, V}{\arg\min} \; \| E\,G(R(X)) - d \|_{2}^{2} + λ \| TV(UV) \|_{1} \quad \text{s.t.} \quad G(R(X)) = UV$ [8]

Figure 1 Illustration of the simultaneous T1/T2 mapping with deep learning enhanced spatial-temporal and physical constraint framework. LRS, low-rank and sparsity.

where R denotes the designed networks that generate T1 maps and T2 maps from weighted images X (R block in Figure 1); G refers to using the physical model based on Eqs. [1] and [2], along with the T1 and T2 maps from R, to generate the corresponding mixed T1- and T2-weighted images (G block in Figure 1). Following Pruessmann et al. and Otazo et al. (24,25), Eqs. [7] and [8] can be solved by the shrinkage threshold method (26,27) and the CG method (24), respectively. The iterative updates in STEP for Eqs. [7] and [8] are consistent with those in previous studies (24,26,27). In addition, at each iteration, $M_{0i}$ is updated using the R block and G block. Specifically, the corresponding T1 and T2 values are obtained through R: $R(X_{i}) = (T1_{i}, T2_{i})$, and the proton density map $M_{0i}$ can be obtained using the following formula:

$M_{0i} = \frac{\sum_{i} \sin θ \left[ \frac{1}{\text{Nlines}} \sum_{j} e^{iϕ} S_{j}(T_{1}, T_{2}, M_{0}, θ, α) \right] SI_{i}}{\sum_{i} \left[ \sin θ \left[ \frac{1}{\text{Nlines}} \sum_{j} e^{iϕ} S_{j}(T_{1}, T_{2}, M_{0}, θ, α) \right] \right]}$ [9]

where $SI_{i}$ denotes the reconstructed ith frame image after phase correction (35) and $ϕ = \arctan\left(\frac{\text{real}(X_{i - 1})}{\text{imag}(X_{i - 1})}\right)$. The updated weighted image $X_{i + 1}$ is then generated according to Eqs. [1] and [2]: $X_{i + 1} = G(T1_{i}, T2_{i}, M_{0i})$. $S_{j}$ relates the signal curve of the reconstructed image series to the theoretically simulated signal defined in Eq. [1], from which T1 and T2 can be estimated through $S_{j}(T_{1}, T_{2}, M_{0}, θ, α)$:

$S_{j}(T_{1}, T_{2}, M_{0}, θ, α) = \sum_{i} \left[ SI_{i} - \frac{1}{\text{Nlines}} \sum_{j} S_{j}(T_{1}, T_{2}, M_{0}, θ, α) \right]$ [10]

where Nlines is the number of radial lines within each temporal window corresponding to individual inversion recovery (IR) repetition times (Nlines = 25 in the SIMPLE sequence), and $S_{j} \in ℂ$ denotes the set of reconstructed complex-valued T1- and T2-weighted images. In this way, $M_{0i}$ is constrained in each iteration by both deep learning and the physical model. The entire process can be found in Appendix 1.
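For readers unfamiliar with the shrinkage operators referred to above, the sketch below shows generic NumPy versions of singular value thresholding (for the nuclear-norm term) and complex soft-thresholding (for the L1 term). It is a textbook illustration under the assumption that NumPy is available; it is not the authors' implementation, and the function names and threshold values are hypothetical.

```python
import numpy as np


def svt(casorati, tau):
    """Singular value thresholding: shrink every singular value by tau.

    This is the proximal operator of tau * nuclear norm, applied to a
    (space x time) Casorati matrix.
    """
    u, s, vh = np.linalg.svd(casorati, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunk singular values, then recombine.
    return (u * s_shrunk) @ vh


def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau and keep the phase.

    This is the proximal operator of tau * L1 norm, applied elementwise in
    the sparsifying-transform domain.
    """
    magnitude = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(magnitude, 1e-12), 0.0)
    return x * scale
```

In an L+S style iteration these two operators would be applied alternately to the low-rank and sparse components before a data-consistency step with the encoding operator E; that outer loop, and the R and G blocks of STEP, are deliberately left out of this sketch.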

Deep neural network setting and training

The dense attention U-Net (36,37) was utilized to implement the attention mechanism and obtain more accurate quantitative T1 and T2 maps. The network inputs were the mixed T1- and T2-weighted images, and the network outputs were trained against the reference T1 and T2 maps. Two parallel networks with the loss functions $L_{T1total}$ and $L_{T2total}$ were used to quantify T1 and T2, which simultaneously helps restore details and denoise. $L_{T1total}$ and $L_{T2total}$ were defined as a combination of mean square error and mean absolute error (37,38):

$L_{T1total} = 0.5 \cdot L_{MSET1} + 0.5 \cdot L_{MAET1}$ [11]

$L_{T2total} = 0.5 \cdot L_{MSET2} + 0.5 \cdot L_{MAET2}$ [12]

where $L_{MAET1} = \frac{1}{m} \sum_{i = 1}^{m} \| R_{T1}(X_{i}) - Q_{T1i} \|_{1}$, $L_{MAET2} = \frac{1}{m} \sum_{i = 1}^{m} \| R_{T2}(X_{i}) - Q_{T2i} \|_{1}$, $L_{MSET1} = \frac{1}{m} \sum_{i = 1}^{m} \| R_{T1}(X_{i}) - Q_{T1i} \|_{2}$, $L_{MSET2} = \frac{1}{m} \sum_{i = 1}^{m} \| R_{T2}(X_{i}) - Q_{T2i} \|_{2}$. $R_{T1}$ and $R_{T2}$ denote the Dense Attention U-Net neural networks; $X_{i}$ denotes the SIMPLE weighted images; $Q_{T1i}$ and $Q_{T2i}$ denote the reference T1 and T2 maps; m denotes the training batch number. Simulated digital brain phantoms (39) were employed for network training; details of this process are given in Appendix 2. Weighted images, T1 maps, and T2 maps were generated based on the signal formula of the SIMPLE sequence. The training data encompassed the following components: the simulated weighted images (specifically, the final reconstructed SIMPLE-weighted images obtained using the LS and CG methods) and the intermediate reconstructed simulated weighted images. These intermediate results were generated through inverse non-uniform fast Fourier transform processing and included the intermediate reconstructed images generated during the first three iterations of both the LS and CG methods. Additionally, complex-valued Gaussian noise, with the noise level drawn from a uniform random distribution (20–70 dB), was added to the simulated k-space data. Images were divided into training, validation, and testing data sets containing 45,600 (12 simulated brain volumes), 15,200 (4 simulated brain volumes), and 1,216 (8 simulated brain volumes) image slices, respectively, after data augmentation using random flipping and rotation.
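A minimal sketch of the combined loss in Eqs. [11] and [12] is given below, following the norms as written above (an L1 norm and an L2 norm per sample, each averaged over the m samples). A squared L2 norm or per-pixel averaging would be an equally plausible reading of "mean square error", so the exact normalization here is an assumption, as is the function name. Plain Python lists stand in for flattened maps.

```python
import math


def combined_loss(pred_batch, ref_batch):
    """0.5 * L2 term + 0.5 * L1 term, averaged over the batch.

    pred_batch and ref_batch are sequences of equally sized flattened maps,
    e.g. network T1 estimates and reference T1 maps.
    """
    m = len(pred_batch)
    l1_term = 0.0
    l2_term = 0.0
    for pred, ref in zip(pred_batch, ref_batch):
        diffs = [p - q for p, q in zip(pred, ref)]
        l1_term += sum(abs(d) for d in diffs)
        l2_term += math.sqrt(sum(d * d for d in diffs))
    return 0.5 * (l2_term / m) + 0.5 * (l1_term / m)
```

The same function would be evaluated separately for the T1 network and the T2 network, since the two are trained in parallel with their own losses.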

Network implementation details

In this study, a U-Net is adopted as the backbone network, with additional skip connections introduced within each encoding module, decoding module, and the bridging module. The detailed configuration is as follows. Max pooling is used for downsampling, and rectified linear unit (ReLU) is employed as the activation function. Upsampling is performed using transposed convolutions with ReLU activation. The bridging module also uses ReLU as the activation function. The convolution kernel size in the encoding modules, decoding modules, transposed convolutions for upsampling, and the bridging module is set to 3×3. Both the encoding and decoding paths consist of four levels, and the feature maps from the corresponding encoding and decoding modules are concatenated. The numbers of output feature channels at each level are 32, 64, 128, and 256, respectively. After empirical hyperparameter tuning during training, stochastic gradient descent is used to update the network weights, with a weight decay of 10⁻⁸ and a momentum of 0.9. The learning rate is initially set to 0.5 and is reduced by 0.1 every 100 iterations. The total number of training iterations is 300.

Implementation details

The proposed STEP method was combined with the CG method and the LS method (Eqs. [7] and [8]) to give STEP CG and STEP LS. These were compared with CG and LS reconstruction followed by least-squares fitting (SIMPLE CG and SIMPLE LS), with conventional deep learning mapping applied after reconstruction [SIMPLE CG (deep learning) and SIMPLE LS (deep learning)], and with the network applied only to non-uniform fast Fourier transform (NUFFT) weighted images (SIMPLE AttUnet). The normalized root mean square error (nRMSE) (40) and structural similarity (SSIM) index (40) of the reconstructed T1 and T2 maps were used to quantitatively evaluate these methods. Additionally, random θ values ([0.85, 1.15]×8×π/180 rad) and α values ([0.90, 1]) were used to simulate the actual flip-angle variations of the θ pulses and the attenuation of longitudinal magnetization in real scans. Complex-valued Gaussian noise, with the noise level drawn from a uniform random distribution (20–70 dB), was added to the simulated k-space data.
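The randomization described here can be sketched as below. The SNR convention (ratio of mean signal power to mean noise power, expressed as 10·log10) is an assumption, since the article does not define it, and the function names are hypothetical.

```python
import math
import random


def sample_scan_parameters(rng):
    """Draw a flip angle (rad), inversion efficiency and SNR (dB) at random."""
    theta = rng.uniform(0.85, 1.15) * math.radians(8)  # nominal 8 degree pulse
    alpha = rng.uniform(0.90, 1.0)
    snr_db = rng.uniform(20.0, 70.0)
    return theta, alpha, snr_db


def add_complex_noise(samples, snr_db, rng):
    """Add complex Gaussian noise so that the mean power ratio matches snr_db."""
    signal_power = sum(abs(s) ** 2 for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    # Split the noise power equally between the real and imaginary parts.
    sigma = math.sqrt(noise_power / 2)
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in samples]


if __name__ == "__main__":
    rng = random.Random(0)
    theta, alpha, snr_db = sample_scan_parameters(rng)
    kspace = [complex(1.0, 0.0)] * 8  # placeholder k-space samples
    noisy = add_complex_noise(kspace, snr_db, rng)
    print(theta, alpha, snr_db, noisy[0])
```

A seeded `random.Random` instance is passed in explicitly so that a simulated data set can be regenerated exactly.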

θ and α were estimated from the SPGR radial data using only the central half of the data in the in vivo experiment, as described in the study by Qi et al. (13).

Data acquisitions were performed on a 3T MR scanner (Ingenia CX, Philips Healthcare, Best, the Netherlands). All experiments were performed on a server with a 6-core Intel Core i7-6850K CPU, an NVIDIA Titan Xp 12-GB GPU (Santa Clara, CA, USA), and 32 GB of RAM.

Simulation experiment

The size of the simulated weighted images is 152×152×152×21. The size of the T1 maps and T2 maps is 152×152×152. The parameters of the simulated SIMPLE: field of view (FOV) =120, TEprep =[25, 50, 0], nTR =240, fast gradient-echo factor =175. Various noise levels (SNRs of 30, 35, 40, and 45 dB) and under-sampling factors (spokes of 2,000, 1,500, and 1,000) were established in the experiments. Under this configuration requiring approximately 72,546 spokes to fill k-space with 3D radial sampling, the undersampling multiples (spokes of 2,000, 1,500, and 1,000) correspond to 36×, 54×, and 72×. The mean values of the T1 and T2 maps of randomly selected gray and white matter regions of interest were calculated.

Phantom experiment

A real phantom consisting of 6 tubes containing diluted gadopentetic acid was scanned using the SIMPLE sequence, the IR spin-echo (IR-SE) sequence with 14 TIs (100, 200, 300, 400, 500, 600, 700, 800, 900, 1,000, 1,500, 2,000, 2,500, and 3,000 ms), and the multi-echo spin-echo (ME-SE) sequence with 8 TEs (ranging from 8.5 to 68 ms with an interval of 8.5 ms). The parameters of IR-SE: FOV =150×100 mm², TR/TE =15,000/9.3 ms. The parameters of ME-SE: FOV =150×100 mm², TR/TE =15,000/19 ms. Pearson correlations of T1 values for IR-SE, LS, CG, and the proposed method were calculated using IR-SE as the reference, and those of T2 values for ME-SE, LS, CG, and the proposed method were calculated using ME-SE as the reference.
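The correlation against the reference sequence can be computed as in the following sketch. The function name is hypothetical and the tube values in the usage note are placeholders, not measurements from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

With six per-tube means from the reference sequence in `reference_t1` and the corresponding six means from a mapping method in `estimated_t1`, `pearson_r(reference_t1, estimated_t1) ** 2` gives an R² value of the kind reported for the phantom experiment.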

In vivo experiment

The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the institutional review board of Tsinghua University (No. 20200034) and informed consent was obtained from all individual participants. Three healthy volunteers were scanned using the SIMPLE sequence, the 2D modified Look-Locker inversion recovery (MOLLI) sequence (41) and the multi-echo turbo spin echo (ME-TSE) sequence (42). Various parts of the brain anatomy (genu of corpus callosum, splenium of corpus callosum, frontal white matter and caudate gray matter) were outlined on MOLLI and ME-TSE images and mapped to T1 and T2 maps, respectively. The parameters of MOLLI: FOV =240×240 mm², TR/TE =2.5/1.16 ms, pixel size =2.4×2.4×4 mm³, flip angle =20°. The parameters of ME-TSE: FOV =240×240 mm², TR/TE =3,000/20 ms, pixel size =2.4×2.4×4 mm³, flip angle =90°. The parameters of SIMPLE: FOV =240×240×240 mm³, TR/TE =10/3.6 ms, pixel size =1.6×1.6×1.6 mm³, flip angle =8°. From the images of three healthy volunteers, we extracted 50 corresponding T1 and T2 quantitative maps reconstructed using the STEP CG and SIMPLE CG methods, respectively. The mean values and mean difference values of the T1 and T2 maps for these regions of interest were calculated. Bland-Altman analysis was performed to evaluate the consistency of the STEP method with the MOLLI and ME-TSE methods.
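A Bland-Altman comparison of this kind reduces to the mean difference (bias) and its limits of agreement. The sketch below uses the conventional bias ± 1.96 standard deviations, which is an assumption because the article does not state the limits it used; the function name is hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    method_a and method_b hold region-of-interest means from two mapping
    methods, paired by region and subject.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * spread, bias + 1.96 * spread
```

For instance, `method_a` could hold STEP T1 values and `method_b` the MOLLI T1 values for the same regions of interest; a bias near zero with narrow limits would indicate good agreement.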

Results

Simulation experiment

Ablation comparisons were conducted first, as shown in Table 1. The quantitative results (Table 1) indicate that the STEP CG method achieves the lowest nRMSE and the highest SSIM compared to the SIMPLE CG (deep learning), SIMPLE LS (deep learning) and STEP LS methods. Furthermore, the STEP LS method also has lower nRMSE and higher SSIM than the SIMPLE LS (deep learning) method. From Table 1, it can be observed that, compared with the SIMPLE AttUnet method, the proposed method can explicitly introduce low-rank/sparse priors and physical-model priors into the quantitative iteration, enabling the quantitative maps to be more accurately constrained and thus producing more accurate T1 and T2 results. The proposed STEP CG method achieved the closest T1 and T2 relaxation times to the ground truth in simulated gray and white matter regions of the brain (Table 2). In SIMPLE LS (deep learning) and SIMPLE CG (deep learning), although this purely indirect pipeline can improve image quality, the iterative reconstruction of weighted images and the fitting of quantitative maps are completely separated. As a result, the temporal continuity and spatial structural information present in the weighted images cannot be utilized during the quantitative fitting, leading to inferior numerical performance compared with the proposed STEP method. This demonstrates that each component of the proposed framework (the deep-learning network and the physics-model-driven data-consistency module) contributes to performance improvement.

Table 1

SSIM/nRMSE comparison of direct deep learning mapping and STEP method

Method	Numerical indicators	T1	T2
STEP CG	nRMSE	0.083±0.009	0.080±0.016
STEP CG	SSIM	0.793±0.026	0.799±0.036
SIMPLE CG (deep learning)	nRMSE	0.095±0.008	0.092±0.018
SIMPLE CG (deep learning)	SSIM	0.764±0.027	0.769±0.038
STEP LS	nRMSE	0.086±0.009	0.084±0.016
STEP LS	SSIM	0.773±0.028	0.782±0.039
SIMPLE LS (deep learning)	nRMSE	0.153±0.016	0.134±0.022
SIMPLE LS (deep learning)	SSIM	0.551±0.046	0.573±0.051
AttUnet	nRMSE	0.108±0.011	0.095±0.013
AttUnet	SSIM	0.706±0.036	0.660±0.032

Data are presented as mean ± standard deviation. Spoke =4,000; SNR =45. CG, conjugate gradient; LS, low-rank plus sparse; nRMSE, normalized root mean square error; SNR, signal-to-noise ratio; SSIM, structural similarity.

Table 2

T1/T2 relaxation time comparison of direct deep learning mapping and STEP method (ms)

Method	WM T1	WM T2	GM T1	GM T2
Ground truth	1,400.00	80.00	1,932.00	133.00
STEP CG	1,431.29	78.42	1,984.12	127.13
SIMPLE CG (deep learning)	1,437.13	76.58	2,031.15	117.63
STEP LS	1,432.31	77.01	1,992.47	126.22
SIMPLE LS (deep learning)	1,514.80	75.76	2,013.89	116.21
AttUnet	1,504.67	74.82	2,024.06	113.81

Spoke =4,000; SNR =45. CG, conjugate gradient; GM, gray matter; LS, low-rank plus sparse; SNR, signal-to-noise ratio; WM, white matter.

Furthermore, we compared our approach with the widely used SIMPLE LS and SIMPLE CG methods. The numerical results obtained when further increasing the under-sampling factor and noise level are shown in Table 3. The proposed STEP CG consistently achieved the highest SSIM and near-lowest nRMSE in both T1 and T2 estimations compared to the other methods. The STEP LS method displayed SSIM (0.717±0.035) and nRMSE (0.103±0.009) values very close to those of the STEP CG method at a low noise level (SNR of 40 dB). However, as the noise level increases, the SSIM of T1 fitted by STEP LS shows a notable decrease. In contrast, STEP CG maintains robust numerical indicators. At a high noise level (SNR of 30 dB), the SSIM and nRMSE indicators for STEP LS are relatively poor but still outperform those of SIMPLE LS. The SSIM for T1 fitted by STEP CG decreases to below 0.7 but remains within an acceptable range at 0.694±0.037, while the T2 maps show acceptable SSIM and nRMSE indices. The proposed STEP LS method achieved a lower nRMSE than the other methods in T2 estimation when the under-sampling factor was low (spokes =2,000). Although the SSIM values for STEP LS are not as high as those for STEP CG, they remain relatively close. When the under-sampling factor is high (spokes =1,000), the STEP CG method demonstrates the highest SSIM and the lowest nRMSE. The proposed STEP method could effectively decrease the required iterations of constraint reconstruction. By integrating the neural network’s denoising capability into weighted image iterations, it accelerates algorithm convergence. As shown in Table 4, the proposed method for reconstruction and quantification takes only 7–14 minutes (matrix size: 152×152×152 with a 6-core Intel Core i7-6850K CPU, an NVIDIA Titan Xp 12 GB GPU and 32 GB of RAM), whereas the traditional method requires approximately 20 minutes for reconstruction and around 20 hours for fitting.
Although direct deep learning after traditional reconstruction saves time, it still demands substantial computational resources for weighted image reconstruction. Moreover, as illustrated in Figure 2, the conventional iterative module constrains the quantitative maps derived from the physics-informed deep learning inversion. Consequently, qualitatively comparable T1 and T2 maps are obtained after a single iteration, and the process converges within only three iterations, substantially accelerating the quantification workflow. It can be observed that the increase in noise level has a greater impact on T1 estimation than on T2 estimation (Figure 3; 40 and 35 dB with 2,000 spokes). Notably, STEP LS exhibits amplified errors in the gyrus and clivus regions under high noise conditions, as indicated by the white arrows. STEP CG demonstrates greater robustness in both T1 and T2 estimations compared to STEP LS. At high noise levels (35 dB), STEP CG shows lower nRMSE values (T1: 0.104 and 0.096 vs. 0.107 and 0.114; T2: 0.104 and 0.084 vs. 0.104 and 0.089) and higher SSIM values (T1: 0.708 and 0.729 vs. 0.654 and 0.655; T2: 0.696 and 0.724 vs. 0.656 and 0.662). Figure 4 presents example reconstructed T1 maps of two slices at different under-sampling factors (spokes of 2,000, 1,500, and 1,000) with 45 dB SNR. Visually, compared to the results from SIMPLE CG and SIMPLE LS, the STEP CG and STEP LS methods notably reduce under-sampling artifacts, particularly at higher under-sampling factors (white arrows). In the SIMPLE CG and SIMPLE LS methods, an increase in the under-sampling factor introduces more noise and has a greater impact on T2 estimation. In contrast, the proposed STEP method demonstrates greater robustness in estimating T2 across varying under-sampling factors (Figure 5). At the edge of the midbrain, clear mapping can still be achieved (Figure 5, white arrows) despite the decrease in the spokes.

Table 3

SSIM/nRMSE comparison of reconstruction methods at different noise levels and spokes

Comparison conditions	Quantitative value	STEP CG		SIMPLE CG		STEP LS		SIMPLE LS
Comparison conditions	Quantitative value	nRMSE	SSIM	nRMSE	SSIM	nRMSE	SSIM	nRMSE	SSIM
Spoke 2,000; SNR 40 dB	T1	0.092±0.008	0.751±0.029	0.177±0.009	0.554±0.025	0.103±0.009	0.717±0.035	0.238±0.012	0.497±0.023
Spoke 2,000; SNR 40 dB	T2	0.094±0.018	0.760±0.040	0.229±0.021	0.476±0.025	0.096±0.020	0.740±0.045	0.179±0.013	0.496±0.022
Spoke 2,000; SNR 35 dB	T1	0.101±0.008	0.729±0.033	0.194±0.009	0.554±0.029	0.112±0.010	0.681±0.037	0.229±0.011	0.507±0.022
Spoke 2,000; SNR 35 dB	T2	0.098±0.019	0.747±0.041	0.271±0.017	0.463±0.024	0.103±0.022	0.712±0.047	0.207±0.014	0.480±0.023
Spoke 2,000; SNR 30 dB	T1	0.108±0.008	0.694±0.037	0.224±0.014	0.551±0.034	0.160±0.012	0.597±0.039	0.227±0.011	0.523±0.025
Spoke 2,000; SNR 30 dB	T2	0.102±0.020	0.727±0.042	0.330±0.013	0.451±0.025	0.130±0.027	0.602±0.064	0.286±0.018	0.460±0.025
Spoke 2,000; SNR 45 dB	T1	0.095±0.008	0.760±0.027	0.168±0.010	0.578±0.024	0.099±0.009	0.736±0.033	0.245±0.013	0.486±0.024
Spoke 2,000; SNR 45 dB	T2	0.094±0.018	0.765±0.039	0.214±0.022	0.480±0.027	0.091±0.018	0.756±0.043	0.169±0.013	0.507±0.021
Spoke 1,500; SNR 45 dB	T1	0.100±0.007	0.728±0.031	0.180±0.010	0.556±0.022	0.110±0.010	0.690±0.035	0.245±0.012	0.486±0.023
Spoke 1,500; SNR 45 dB	T2	0.097±0.019	0.748±0.039	0.217±0.020	0.485±0.022	0.099±0.020	0.725±0.044	0.188±0.012	0.496±0.022
Spoke 1,000; SNR 45 dB	T1	0.107±0.007	0.705±0.028	0.182±0.010	0.572±0.023	0.122±0.011	0.655±0.034	0.252±0.013	0.473±0.023
Spoke 1,000; SNR 45 dB	T2	0.099±0.019	0.726±0.039	0.259±0.023	0.458±0.029	0.106±0.021	0.685±0.043	0.243±0.024	0.469±0.026

Data are presented as mean ± standard deviation. CG, conjugate gradient; LS, low-rank plus sparse; nRMSE, normalized root mean square error; SNR, signal-to-noise ratio; SSIM, structural similarity.
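Table 3 reports nRMSE without restating how the error is normalized. The following is a minimal sketch, assuming the common definition in which the error norm is divided by the norm of the reference map; this normalization, the function name `nrmse`, and the optional `mask` argument are illustrative assumptions rather than the authors' implementation, and the map values are made up.

```python
import numpy as np


def nrmse(estimate, reference, mask=None):
    """Normalized RMSE, assuming normalization by the reference norm.

    `mask` (hypothetical) restricts the comparison to selected pixels,
    for example brain tissue only.
    """
    est = np.asarray(estimate, dtype=float)
    ref = np.asarray(reference, dtype=float)
    if mask is not None:
        keep = np.asarray(mask, dtype=bool)
        est, ref = est[keep], ref[keep]
    return float(np.linalg.norm(est - ref) / np.linalg.norm(ref))


# Illustrative T1 map in ms (made-up values, not data from the study).
reference_t1 = np.array([[1000.0, 1200.0], [800.0, 1500.0]])
estimated_t1 = reference_t1 * 1.1  # uniform 10% overestimation
print(f"nRMSE = {nrmse(estimated_t1, reference_t1):.3f}")
```

With this definition, a uniform relative bias maps directly onto the metric, which makes values such as those in Table 3 easy to interpret as an average fractional error.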

Table 4

The mean times (min) of T1 and T2 quantifications in different methods

Method	Reconstruction (min)	Fitting (min)
STEP CG	14.13±0.45
STEP LS	7.75±0.29
SIMPLE CG	28.17±1.25	1,140.57±5.45
SIMPLE LS	13.01±2.15	1,216.20±4.25
SIMPLE CG (deep learning)	28.17±1.25	0.16±0.02
SIMPLE LS (deep learning)	13.01±2.15	0.15±0.03

Data are presented as mean ± standard deviation. Spoke =2,000; SNR =45 dB. CG, conjugate gradient; LS, low-rank plus sparse; SNR, signal-to-noise ratio.
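The relative time savings implied by Table 4 can be worked out directly from the mean values. The short sketch below assumes that the single value listed for each STEP variant is the combined reconstruction-and-fitting time, and that the SIMPLE totals are the sum of the two columns; the dictionary and the `speedup` helper are illustrative names.

```python
# Mean times in minutes, copied from Table 4 (spokes = 2,000; SNR = 45 dB).
# Assumption: the single STEP value covers reconstruction and fitting together.
times_min = {
    "STEP CG": 14.13,
    "STEP LS": 7.75,
    "SIMPLE CG": 28.17 + 1140.57,
    "SIMPLE LS": 13.01 + 1216.20,
    "SIMPLE CG (deep learning)": 28.17 + 0.16,
    "SIMPLE LS (deep learning)": 13.01 + 0.15,
}


def speedup(baseline: str, proposed: str) -> float:
    """Ratio of the baseline total time to the proposed total time."""
    return times_min[baseline] / times_min[proposed]


for solver in ("CG", "LS"):
    for suffix in ("", " (deep learning)"):
        baseline = f"SIMPLE {solver}{suffix}"
        ratio = speedup(baseline, f"STEP {solver}")
        print(f"{baseline} -> STEP {solver}: {ratio:.1f}x faster")
```

Under these assumptions, most of the gain over the conventional pipeline comes from removing pixel-wise fitting, while the gain over the deep learning fitting baselines comes from the shorter reconstruction.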

Figure 2 Example of convergence of STEP reconstruction and fitting. LRS, low-rank and sparsity; nRMSE, normalized root mean square error; SSIM, structural similarity.

Figure 3 Example of reconstructed T1 (A) and T2 (B) maps of two slices at different noise levels (40 and 35 dB) with an under-sampling factor of 2,000 spokes. STEP LS exhibits amplified errors in the gyrus and clivus regions under high noise conditions, as indicated by the white arrows. CG, conjugate gradient; LS, low-rank plus sparse; nRMSE, normalized root mean square error; SNR, signal-to-noise ratio; SSIM, structural similarity.

Figure 4 Example of reconstructed T1 maps of two slices at different under-sampling factors (spokes of 2,000, 1,500, and 1,000) with a noise level of 45 dB. Compared to the results from SIMPLE CG and SIMPLE LS, the STEP CG and STEP LS methods notably reduce under-sampling artifacts, particularly at higher under-sampling factors (white arrows). CG, conjugate gradient; LS, low-rank plus sparse; nRMSE, normalized root mean square error; SSIM, structural similarity.

Figure 5 Example of reconstructed T2 maps of two slices at different under-sampling factors (spokes of 2,000, 1,500, and 1,000) with a noise level of 45 dB. At the edge of the midbrain, clear mapping can still be achieved (white arrows) despite the decrease in the number of spokes. CG, conjugate gradient; LS, low-rank plus sparse; nRMSE, normalized root mean square error; SSIM, structural similarity.

Phantom experiment

As shown in Figure 6, STEP produces more uniform T1 and T2 maps within regions of interest compared to the SIMPLE CG and SIMPLE LS methods. In most regions of interest, T1 values were overestimated by SIMPLE CG and SIMPLE LS, especially at higher T1 values, a problem that the proposed method effectively mitigates. For T2 values, noticeable fitting errors were observed with SIMPLE CG and SIMPLE LS, which were effectively avoided by the proposed method. In Pearson correlation analysis (Figure 7), T1 and T2 values quantified by the proposed method showed correlation coefficients (T1: R²=0.99 for STEP CG, T2: R²=0.94 for STEP CG; T1: R²=0.99 for STEP LS, T2: R²=0.90 for STEP LS) that were higher than or equal to those obtained with the SIMPLE CG method (T1: R²=0.92, T2: R²=0.89) and the SIMPLE LS method (T1: R²=0.95, T2: R²=0.90).
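For readers who want to reproduce this kind of agreement statistic, the sketch below computes the Pearson correlation coefficient and its square between reference and estimated values per tube. The six T1 values are made-up placeholders, not the phantom measurements, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean T1 values (ms) for six tubes: reference vs. estimate.
reference_t1 = [300.0, 500.0, 700.0, 900.0, 1100.0, 1300.0]
estimated_t1 = [310.0, 495.0, 720.0, 915.0, 1080.0, 1330.0]

r = pearson_r(reference_t1, estimated_t1)
print(f"r = {r:.4f}, R^2 = {r * r:.4f}")
```

Note that a high R² indicates a strong linear relationship but does not by itself rule out a systematic offset, which is why bias is examined separately in the in vivo analysis.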

Figure 6 A real phantom consisting of 6 tubes containing diluted gadopentetic acid was scanned using the SIMPLE, the IR-SE, and the ME-SE sequences. CG, conjugate gradient; IR, inversion recovery; LS, low-rank plus sparse; ME, multi-echo; SE, spin-echo.

Figure 7 Pearson correlations of T1 values for IR-SE, LS, CG, and the proposed method were calculated using IR-SE as the reference, and T2 correlations for ME-SE, LS, CG, and the proposed method using ME-SE as the reference. CG, conjugate gradient; IR, inversion recovery; LS, low-rank plus sparse; ME, multi-echo; SE, spin-echo.

In vivo experiment

Figure 8 presents the in vivo brain results. The regional analysis in Table 5 shows that the standard deviation of T1 and T2 quantifications by the proposed method was lower than that of the SIMPLE CG and SIMPLE LS methods. Moreover, the results obtained by STEP are consistently closer to the literature-reported values (43,44) than those of the corresponding conventional methods, demonstrating that STEP has a superior capability to estimate true T1 and T2 values. However, estimating cerebrospinal fluid (CSF) proves particularly challenging due to its complex and variable anatomy, which is not fully captured by the relatively fixed CSF structures in the simulated training data. Furthermore, the model is constrained by low-rank sparse regularization; together, these factors lead to inaccuracies in CSF relaxation time estimates on real data. In contrast, the lateral ventricles exhibit more consistent and simplified anatomy, making them easier to learn; consequently, the method maintains relatively accurate estimates when applied to real data. Bland-Altman analysis (Figure 9) reveals that T1 values measured by STEP CG and STEP LS exceeded MOLLI measurements by 28.90 and 40.45 ms, respectively. Conversely, T2 values from STEP CG and STEP LS were 8.422 and 15.24 ms lower than ME-TSE values. As shown in Figure 9, the limits of agreement (LoA) represent the 95% distribution range of the differences between two measurement methods. Within this range, MOLLI yields lower T1 estimates compared to the proposed method, whereas ME-TSE produces higher T2 estimates relative to the proposed method. Compared to the MOLLI and ME-TSE techniques, the proposed method provides more accurate quantitative maps, overcoming MOLLI’s tendency to underestimate T1 (45) and ME-TSE’s tendency to overestimate T2 (46,47).
In addition, we introduced radio-frequency (RF) inhomogeneity and refocusing flip-angle attenuation into the training data, and acquired images with high spatiotemporal resolution where feasible, to mitigate potential confounding factors such as RF inhomogeneity, refocusing angle variations, and partial-volume effects.
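The Bland-Altman quantities discussed above (mean bias and 95% limits of agreement) follow from the paired differences between two methods. The following is a minimal sketch using the standard bias ± 1.96 SD convention; the paired values are hypothetical placeholders rather than the study's measurements, and `bland_altman` is an illustrative name.

```python
import math


def bland_altman(method_a, method_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Bias is the mean of (a - b); the 95% limits of agreement are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical regional T1 values (ms): proposed method vs. reference method.
proposed = [790.0, 600.0, 1260.0, 795.0]
reference = [740.0, 685.0, 1220.0, 755.0]

bias, lower, upper = bland_altman(proposed, reference)
print(f"bias = {bias:.2f} ms, LoA = [{lower:.2f}, {upper:.2f}] ms")
```

A positive bias here means the first method reads higher on average than the second, which is the sense in which the T1 and T2 offsets relative to MOLLI and ME-TSE are reported.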

Figure 8 The in vivo brain results scanned using the SIMPLE sequence, MOLLI sequence (T1 reference) and ME-TSE sequence (T2 reference). CG, conjugate gradient; LS, low-rank plus sparse; ME, multi-echo; TSE, turbo spin echo.

Table 5

The mean and standard deviation of T1 and T2 quantifications in various parts of brain anatomy

Anatomical location	Quantitative value	Reference (ms)	Literature value (ms)	SIMPLE LS (ms)	STEP LS (ms)	SIMPLE CG (ms)	STEP CG (ms)
Corpus callosum (genu)	T1	738.3±6.7	707±28	929.4±32.2	821.1±23.9	928.4±32.2	785.2±18.4
Corpus callosum (genu)	T2	73.0±0.9	46.8±2.3	68.0±1.6	58.0±1.0	66.1±1.8	65.9±0.4
Corpus callosum (splenium)	T1	685.2±15.5	N/A	628.8±28.7	625.3±11.4	627.3±28.5	592.4±9.4
Corpus callosum (splenium)	T2	72.3±2.1	55.3±3.0	55.70±5.6	55.11±4.8	55.37±5.5	54.22±2.9
Caudate nucleus	T1	1,219.3±5.0	1,197±47	1,497.9±104.8	1,270.6±81.4	1,492.2±108.9	1,255.7±38.7
Caudate nucleus	T2	76.5±0.9	57.1±2.8	68.42±4.3	64.72±3.5	68.9±4.6	68.3±3.4
Frontal white matter	T1	755.6±34.7	754±18	868.9±34.8	804.9±18.6	857.7±30.5	789.9±13.2
Frontal white matter	T2	74.9±2.6	53.0±1.5	63.9±4.1	58.6±3.3	66.4±4.9	65.6±1.9

Data are presented as mean ± standard deviation. CG, conjugate gradient; LS, low-rank plus sparse; N/A, not applicable.

Figure 9 The in vivo brain Bland-Altman analysis results. CG, conjugate gradient; CI, confidence interval; LoA, limits of agreement; LS, low-rank plus sparse.

Discussion

The main innovation of STEP is that, with the help of deep-learning priors, it explicitly integrates low-rank/sparse priors with the T1/T2 physical model, enabling accurate and fast simultaneous quantification of multiple quantitative parameters. The physical model in the STEP method enables both deep learning and traditional constrained reconstruction methods to contribute jointly to quantitative reconstruction, achieving superior performance compared to relying on either method alone. Compared to deep-learning-only approaches, the proposed STEP method generates more accurate T1 and T2 mapping. By incorporating both the physical model and constraint-based fidelity, the quantitative maps obtained through the proposed method leverage deep learning priors and benefit from the joint influence of low-rank and sparse constraints during the iterative process. This approach addresses the limitation of fully integrated deep neural networks, which rely entirely on training data and can be unstable. The proposed method integrates deep learning into various iterative algorithms for validation, among which the STEP CG approach achieves near-optimal performance. The STEP CG method substantially streamlines the conventional iterative reconstruction and fitting pipeline, thereby improving the efficiency of multi-parameter quantification. In addition, various iterative constraint algorithms (e.g., the LS and CG algorithms discussed in the Methods section) play a role similar to the backbone of deep learning networks, extracting feature information using low-rank sparse priors and optimizing backpropagation through physical models, which helps optimize weighted images, reduce iteration time, and produce more accurate quantitative maps. After each iteration, the steps involving the physical model are executed to generate weighted images from the quantitative maps using the signal equations of the adopted quantitative MRI sequence.
Owing to this explicit integration of the physical signal model, STEP is inherently generalizable to other quantitative MRI protocols. Specifically, STEP can be readily adapted by replacing the “signal evolution” module in the R block with the signal model corresponding to the desired protocol. Since the framework operates by fitting weighted images to the underlying signal equation to estimate the target quantitative parameters (e.g., T1 or T2) and enforces data consistency through the inverse operation, this mechanism contributes in the same principled manner regardless of the specific quantitative MRI sequence, thereby enabling straightforward extension beyond the protocol investigated in this study.
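The idea of swapping the signal model can be illustrated with a small sketch in which the forward model is passed in as a function. The two models below are textbook mono-exponential expressions (inversion recovery and spin-echo decay) chosen only for illustration; they are not the SIMPLE signal equations, and all function and variable names are hypothetical.

```python
import numpy as np


def inversion_recovery(maps, ti_ms):
    """Textbook T1 model (assumption): S = M0 * (1 - 2 * exp(-TI / T1))."""
    m0, t1 = maps["M0"], maps["T1"]
    return m0[None] * (1.0 - 2.0 * np.exp(-ti_ms[:, None, None] / t1[None]))


def spin_echo_decay(maps, te_ms):
    """Textbook T2 model (assumption): S = M0 * exp(-TE / T2)."""
    m0, t2 = maps["M0"], maps["T2"]
    return m0[None] * np.exp(-te_ms[:, None, None] / t2[None])


def synthesize_weighted_images(maps, timings_ms, signal_model):
    """Generate a (frames, H, W) image series from quantitative maps.

    Changing the protocol only requires passing a different `signal_model`.
    """
    return signal_model(maps, np.asarray(timings_ms, dtype=float))


# Tiny 2x2 example with made-up tissue values (ms).
maps = {
    "M0": np.ones((2, 2)),
    "T1": np.array([[800.0, 1200.0], [1000.0, 4000.0]]),
    "T2": np.array([[60.0, 70.0], [80.0, 2000.0]]),
}
t1_series = synthesize_weighted_images(maps, [0.0, 500.0, 3000.0], inversion_recovery)
t2_series = synthesize_weighted_images(maps, [0.0, 40.0, 80.0], spin_echo_decay)
print(t1_series.shape, t2_series.shape)
```

Keeping the forward model behind a single function signature is one way to realize the modularity described above: the rest of the pipeline only needs the synthesized image series, not the details of the sequence.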

The proposed STEP method shows clear distinctions and greater scalability compared to other advanced multiparameter quantification techniques, for instance, magnetic resonance fingerprinting (MRF) (1). In contrast to conventional MRF (11), our approach embeds a deep learning module into the reconstruction pipeline, which speeds up multiparameter fitting and quantification while streamlining the workflow. Although some low-rank subspace MRF variants have been proposed to optimize the process (48), STEP, unlike these methods, does not require estimating a low-rank dictionary subspace. Instead, it applies low-rank constraints holistically to the weighted images during reconstruction, thereby avoiding potential losses in fitting accuracy that may result from subspace dimensionality reduction. Furthermore, relative to deep learning-based regression methods for MRF parameter estimation (49,50), the proposed approach explicitly constrains the deep learning module through physical formulas. This not only provides stronger regularization but also allows for straightforward extension to any parameter-fitting model describable by physical equations, a feature not readily achievable with standard MRF.
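Applying a low-rank constraint directly to the weighted image series, rather than projecting onto a precomputed dictionary subspace, is commonly done by singular value thresholding of the Casorati (pixels by frames) matrix. The sketch below shows that generic operation under stated assumptions; it is not the authors' LRS module, and the function name, threshold, and synthetic data are illustrative.

```python
import numpy as np


def singular_value_threshold(images, threshold):
    """Soft-threshold the singular values of the Casorati matrix.

    images: array of shape (frames, H, W), real or complex.
    Returns the low-rank image series and the number of retained components.
    """
    n_frames = images.shape[0]
    casorati = images.reshape(n_frames, -1).T  # pixels x frames
    u, s, vh = np.linalg.svd(casorati, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    low_rank = (u * s_shrunk) @ vh
    return low_rank.T.reshape(images.shape), int(np.count_nonzero(s_shrunk))


# Synthetic example: one spatial pattern scaled over three frames, plus noise.
rng = np.random.default_rng(0)
pattern = rng.random((4, 4))
frame_weights = np.array([1.0, 2.0, 3.0])
clean = frame_weights[:, None, None] * pattern[None]
noisy = clean + 1e-3 * rng.standard_normal(clean.shape)

denoised, rank = singular_value_threshold(noisy, threshold=0.1)
print("retained rank:", rank)
```

Because the threshold acts on the whole image series, no subspace dimension has to be fixed in advance; the effective rank is determined by the data and the threshold.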

Additionally, the weighted images in SIMPLE combine the information of T1 and T2 simultaneously. During magnetization evolution, the T1 variation demonstrates higher complexity and a wider dynamic range, while T2 changes remain relatively minor. Applying the least squares method to estimate T1 and T2 maps therefore introduces not only noise interference but also accuracy degradation caused by the disparity in T1/T2 amplitude variations, resulting in significant estimation errors and pronounced noise artifacts (Figure 5). To address this, the proposed STEP method employs two parallel deep learning networks for T1 and T2 training, effectively correcting mapping deviations. First, reference T1 and T2 maps derived from simulated brain phantoms are used to train the networks, producing maps closer to ground truth. Subsequently, the deep learning-based mapping exploits the downsampling, upsampling, and skip connections of the dense attention U-Net to further capture the spatial correlation between pixels. Thus, the T1 and T2 quantitative maps are smoother, less noisy, and reflect the actual anatomical structure more accurately. The proposed STEP method can reduce the number of spokes required during scanning, and shows the potential to reduce the total scanning time of the sequence. By combining the proposed approach with LS reconstruction iterations, 152 T1 and T2 maps can be reconstructed and fitted in under 8 minutes (7.75±0.29 min; Table 4), where the majority of the computation time is spent on the LRS module iterations, while the G and R modules require only about 1 second. Notably, for clinical applications, the number of iterations can be further reduced to decrease the overall scanning and reconstruction time, thereby better meeting clinical requirements. The method uses only 2–4% (spokes of 1,000 to 2,000) of full k-space data to generate quantitative maps, yet with STEP CG still achieves an SSIM greater than 0.7 and an nRMSE of approximately 0.1 or lower (Table 3).
This performance may be due to the deep learning model’s ability to denoise the quantitative map at the beginning of the iteration, thereby improving the image quality of the initial under-sampled weighted images.

The proposed method still has room for improvement. In practical scans, due to variations in scanners/vendors/field strengths, deviations from the SIMPLE model, motion, and other artifacts, the performance of the proposed STEP method may be affected. To mitigate this problem, data correction and related preprocessing can be performed at the early iterations of the LRS reconstruction step to ensure that the weighted images fed into the subsequent neural network module are of high quality. In recent years, self-supervised methods incorporating pre-trained models have been proposed (51,52), which not only enhance training efficiency but also enable models to focus on task-specific features. The proposed framework can similarly integrate such self-supervised pre-training into the R-block, and through joint regularization with the LRS-block, further improve the efficiency of the multi-parameter fitting pipeline. This direction represents a promising avenue for future research. For the reconstruction of potential pathological features, lesions can be synthesized in the training data based on abnormal T1 or T2 values reported in the existing literature, thereby enhancing the generalization capability of the network within STEP. This study primarily focuses on brain imaging; however, the proposed method also has the potential to be applied to a wide range of clinical scenarios. Specifically, the evaluation of vasculitis or intracranial atherosclerotic disease (ICAD) may require 3D, isotropic, high-resolution whole-brain imaging and quantification for comprehensive assessment (53). The proposed method can efficiently provide images with high spatiotemporal resolution, thereby meeting the clinical requirements of these applications. 
In addition, potential clinical applications include the use of the proposed STEP method to quantitatively analyze T1 and T2 values without the need for image registration, enabling quantitative assessment of MRI signal characteristics associated with specific plaque components (54,55). This capability provides strong support for the use of sequences that simultaneously quantify multiple parameters, such as T1 and T2, in characterizing atherosclerotic plaques. Furthermore, the proposed method may apply to other mapping techniques, such as R2*, susceptibility weighted imaging (SWI), and quantitative susceptibility mapping (QSM). This presents an exciting avenue for future research.

Conclusions

STEP was proposed to reconstruct weighted data and fit T1/T2 maps. The proposed method extracts feature information using low-rank sparse priors and optimizes backpropagation through physical models, leveraging the synergistic advantages of deep learning and low-rank sparse iterative processing to achieve accurate T1 and T2 quantification results simultaneously. The experimental results from the simulated brain, the real phantom, and healthy volunteers show that this method can generate more accurate T1 and T2 mapping.

Acknowledgments

None.

Footnote

Data Sharing Statement: Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1563/dss

Funding: This study received funding from the Beijing Municipal Natural Science Foundation (No. Z190024), the Science and Technology Planning Program of Beijing Municipal Science & Technology Commission and Administrative Commission of Zhongguancun Science Park (No. Z231100004823012), the Key Program of the National Natural Science Foundation of China (No. 81930119), the Beijing Natural Science Foundation-Daxing Innovation Joint Fund (No. L246019), and the Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund (Nos. L242045 and L252052).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1563/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by institutional review board of Tsinghua University (No. 20200034) and informed consent was obtained from all individual participants.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

Ma D, Gulani V, Seiberlich N, Liu K, Sunshine JL, Duerk JL, Griswold MA. Magnetic resonance fingerprinting. Nature 2013;495:187-92. [Crossref] [PubMed]
Vymazal J, Righini A, Brooks RA, Canesi M, Mariani C, Leonardi M, Pezzoli G. T1 and T2 in the brain of healthy subjects, patients with Parkinson disease, and patients with multiple system atrophy: relation to iron content. Radiology 1999;211:489-95. [Crossref] [PubMed]
Vrenken H, Geurts JJ, Knol DL, van Dijk LN, Dattola V, Jasperse B, van Schijndel RA, Polman CH, Castelijns JA, Barkhof F, Pouwels PJ. Whole-brain T1 mapping in multiple sclerosis: global changes of normal-appearing gray and white matter. Radiology 2006;240:811-20. [Crossref] [PubMed]
Damadian R. Tumor detection by nuclear magnetic resonance. Science 1971;171:1151-3. [Crossref] [PubMed]
Gibbs P, Tozer DJ, Liney GP, Turnbull LW. Comparison of quantitative T2 mapping and diffusion-weighted imaging in the normal and pathologic prostate. Magn Reson Med 2001;46:1054-8. [Crossref] [PubMed]
Siemonsen S, Mouridsen K, Holst B, Ries T, Finsterbusch J, Thomalla G, Ostergaard L, Fiehler J. Quantitative t2 values predict time from symptom onset in acute stroke patients. Stroke 2009;40:1612-6. [Crossref] [PubMed]
Siemonsen S, Löbel U, Sedlacik J, Forkert ND, Mouridsen K, Østergaard L, Thomalla G, Fiehler J. Elevated T2-values in MRI of stroke patients shortly after symptom onset do not predict irreversible tissue infarction. Brain 2012;135:1981-9. [Crossref] [PubMed]
Thavendiranathan P, Walls M, Giri S, Verhaert D, Rajagopalan S, Moore S, Simonetti OP, Raman SV. Improved detection of myocardial involvement in acute inflammatory cardiomyopathies using T2 mapping. Circ Cardiovasc Imaging 2012;5:102-10. [Crossref] [PubMed]
Bauer CM, Jara H, Killiany R; Alzheimer's Disease Neuroimaging Initiative. Whole brain quantitative T2 MRI across multiple scanners with dual echo FSE: applications to AD, MCI, and normal aging. Neuroimage 2010;52:508-14. [Crossref] [PubMed]
Bernat JL. Magnetic resonance imaging of neurodegenerative. Neurology 1985;35:93.
Chen Y, Jiang Y, Pahwa S, Ma D, Lu L, Twieg MD, Wright KL, Seiberlich N, Griswold MA, Gulani V. MR Fingerprinting for Rapid Quantitative Abdominal Imaging. Radiology 2016;279:278-86. [Crossref] [PubMed]
Coolen BF, Poot DH, Liem MI, Smits LP, Gao S, Kotek G, Klein S, Nederveen AJ. Three-dimensional quantitative T1 and T2 mapping of the carotid artery: Sequence design and in vivo feasibility. Magn Reson Med 2016;75:1008-17. [Crossref] [PubMed]
Qi H, Sun J, Qiao H, Zhao X, Guo R, Balu N, Yuan C, Chen H. Simultaneous T(1) and T(2) mapping of the carotid plaque (SIMPLE) with T(2) and inversion recovery prepared 3D radial imaging. Magn Reson Med 2018;80:2598-608. [Crossref] [PubMed]
Nezafat R, Stuber M, Ouwerkerk R, Gharib AM, Desai MY, Pettigrew RI. B1-insensitive T2 preparation for improved coronary magnetic resonance angiography at 3 T. Magn Reson Med 2006;55:858-64. [Crossref] [PubMed]
Blume U, Lockie T, Stehning C, Sinclair S, Uribe S, Razavi R, Schaeffter T. Interleaved T(1) and T(2) relaxation time mapping for cardiac applications. J Magn Reson Imaging 2009;29:480-7. [Crossref] [PubMed]
Giri S, Chung YC, Merchant A, Mihai G, Rajagopalan S, Raman SV, Simonetti OP. T2 quantification for improved detection of myocardial edema. J Cardiovasc Magn Reson 2009;11:56. [Crossref] [PubMed]
Qi H, Sun J, Qiao H, Chen S, Zhou Z, Pan X, Wang Y, Zhao X, Li R, Yuan C, Chen H. Carotid Intraplaque Hemorrhage Imaging with Quantitative Vessel Wall T1 Mapping: Technical Development and Initial Experience. Radiology 2018;287:276-84. [Crossref] [PubMed]
Petzschner FH, Ponce IP, Blaimer M, Jakob PM, Breuer FA. Fast MR parameter mapping using k-t principal component analysis. Magn Reson Med 2011;66:706-16. [Crossref] [PubMed]
Zhao B, Lu W, Hitchens TK, Lam F, Ho C, Liang ZP. Accelerated MR parameter mapping with low-rank and sparsity constraints. Magn Reson Med 2015;74:489-98. [Crossref] [PubMed]
Doneva M, Börnert P, Eggers H, Stehning C, Sénégas J, Mertins A. Compressed sensing reconstruction for magnetic resonance parameter mapping. Magn Reson Med 2010;64:1114-20.
Velikina JV, Alexander AL, Samsonov A. Accelerating MR parameter mapping using sparsity-promoting regularization in parametric dimension. Magn Reson Med 2013;70:1263-73. [Crossref] [PubMed]
Zhao B, Haldar JP, Christodoulou AG, Liang ZP. Image reconstruction from highly undersampled (k, t)-space data with joint partial separability and sparsity constraints. IEEE Trans Med Imaging 2012;31:1809-20. [Crossref] [PubMed]
Zhao B, Haldar JP, Liang ZP. PSF model-based reconstruction with sparsity constraint: algorithm and application to real-time cardiac MRI. Annu Int Conf IEEE Eng Med Biol Soc 2010;2010:3390-3. [Crossref] [PubMed]
Pruessmann KP, Weiger M, Börnert P, Boesiger P. Advances in sensitivity encoding with arbitrary k-space trajectories. Magn Reson Med 2001;46:638-51. [Crossref] [PubMed]
Otazo R, Candès E, Sodickson DK. Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magn Reson Med 2015;73:1125-36. [Crossref] [PubMed]
Palomar DP, Eldar YC. Convex Optimization in Signal Processing and Communications. New York, NY: Cambridge University Press; 2010.
Eldar YC, Kutyniok G. Compressed Sensing: Theory and Applications. London, UK: Cambridge University Press; 2012.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436-44.
Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88. [Crossref] [PubMed]
Pei H, Xia D, Xu X, Yang Y, Wang Y, Liu F, Feng L. Rapid 3D T(1) mapping using deep learning-assisted Look-Locker inversion recovery MRI. Magn Reson Med 2023;90:569-82. [Crossref] [PubMed]
Shao J, Ghodrati V, Nguyen KL, Hu P. Fast and accurate calculation of myocardial T(1) and T(2) values using deep learning Bloch equation simulations (DeepBLESS). Magn Reson Med 2020;84:2831-45. [Crossref] [PubMed]
Jun Y, Shin H, Eo T, Kim T, Hwang D. Deep model-based magnetic resonance parameter mapping network (DOPAMINE) for fast T1 mapping using variable flip angle method. Med Image Anal 2021;70:102017. [Crossref] [PubMed]
Meng Z, Guo R, Li Y, Guan Y, Wang T, Zhao Y, Sutton B, Li Y, Liang ZP. Accelerating T(2) mapping of the brain by integrating deep learning priors with low-rank and sparse modeling. Magn Reson Med 2021;85:1455-67. [Crossref] [PubMed]
Song HK, Dougherty L. k-space weighted image contrast (KWIC) for contrast manipulation in projection reconstruction MRI. Magn Reson Med 2000;44:825-32. [Crossref] [PubMed]
Kellman P, Arai AE, McVeigh ER, Aletras AH. Phase-sensitive inversion recovery for detecting myocardial infarction using gadolinium-delayed hyperenhancement. Magn Reson Med 2002;47:372-83. [Crossref] [PubMed]
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, Glocker B, Rueckert D. Attention U-net: learning where to look for the pancreas. In: Proceedings of the 1st Conference on Medical Imaging with Deep Learning. Amsterdam, the Netherlands; 2018.
Li Y, Wang Y, Qi H, Hu Z, Chen Z, Yang R, Qiao H, Sun J, Wang T, Zhao X, Guo H, Chen H. Deep learning-enhanced T(1) mapping with spatial-temporal and physical constraint. Magn Reson Med 2021;86:1647-61. [Crossref] [PubMed]
Zhao H, Gallo O, Frosio I, Kautz J. Loss functions for image restoration with neural networks. IEEE Trans Comput Imaging 2017;3:47-57.
Kwan RK, Evans AC, Pike GB. MRI simulation-based evaluation of image-processing and classification methods. IEEE Trans Med Imaging 1999;18:1085-97. [Crossref] [PubMed]
Sun C, Robinson A, Wang Y, Bilchick KC, Kramer CM, Weller D, Salerno M, Epstein FH. A Slice-Low-Rank Plus Sparse (slice-L + S) Reconstruction Method for k-t Undersampled Multiband First-Pass Myocardial Perfusion MRI. Magn Reson Med 2022;88:1140-55. [Crossref] [PubMed]
Messroghli DR, Radjenovic A, Kozerke S, Higgins DM, Sivananthan MU, Ridgway JP. Modified Look-Locker inversion recovery (MOLLI) for high-resolution T1 mapping of the heart. Magn Reson Med 2004;52:141-6. [Crossref] [PubMed]
Ben-Eliezer N, Sodickson DK, Block KT. Rapid and accurate T2 mapping from multi-spin-echo data using Bloch-simulation-based reconstruction. Magn Reson Med 2015;73:809-17. [Crossref] [PubMed]
Jiang K, Zhu Y, Jia S, Wu Y, Liu X, Chung YC. Fast T1 mapping of the brain at high field using Look-Locker and fast imaging. Magn Reson Imaging 2017;36:49-55. [Crossref] [PubMed]
Shepherd TM, Kirov II, Charlson E, Bruno M, Babb J, Sodickson DK, Ben-Eliezer N. New rapid, accurate T(2) quantification detects pathology in normal-appearing brain regions of relapsing-remitting MS patients. Neuroimage Clin 2017;14:363-70.
Roujol S, Weingärtner S, Foppa M, Chow K, Kawaji K, Ngo LH, Kellman P, Manning WJ, Thompson RB, Nezafat R. Accuracy, precision, and reproducibility of four T1 mapping sequences: a head-to-head comparison of MOLLI, ShMOLLI, SASHA, and SAPPHIRE. Radiology 2014;272:683-9. [Crossref] [PubMed]
Emmerich J, Flassbeck S, Schmidt S, Bachert P, Ladd ME, Straub S. Rapid and accurate dictionary-based T(2) mapping from multi-echo turbo spin echo data at 7 Tesla. J Magn Reson Imaging 2019;49:1253-62. [Crossref] [PubMed]
Hossein J, Fariborz F, Mehrnaz R, Babak R. Evaluation of diagnostic value and T2-weighted three-dimensional isotropic turbo spin-echo (3D-SPACE) image quality in comparison with T2-weighted two-dimensional turbo spin-echo (2D-TSE) sequences in lumbar spine MR imaging. Eur J Radiol Open 2019;6:36-41. [Crossref] [PubMed]
Zhao B, Setsompop K, Adalsteinsson E, Gagoski B, Ye H, Ma D, Jiang Y, Ellen Grant P, Griswold MA, Wald LL. Improved magnetic resonance fingerprinting reconstruction with low-rank and subspace modeling. Magn Reson Med 2018;79:933-42. [Crossref] [PubMed]
Ding T, Gao Y, Xiong Z, Liu F, Cloos MA, Sun H. MRF-Mixer: A Simulation-Based Deep Learning Framework for Accelerated and Accurate Magnetic Resonance Fingerprinting Reconstruction. Information 2025;16:218.
Li P, Hu Y. Deep magnetic resonance fingerprinting based on Local and Global Vision Transformer. Med Image Anal 2024;95:103198. [Crossref] [PubMed]
Chen Z, Hu Z, Xie Y, Li D, Christodoulou AG. Repeatability-encouraging self-supervised learning reconstruction for quantitative MRI. Magn Reson Med 2025;94:797-809. [Crossref] [PubMed]
Shi S, Wang C, Xiao S, Li H, Zhao X, Guo F, Shi L, Zhou X. Magnetic resonance image denoising for Rician noise using a novel hybrid transformer-CNN network (HTC-net) and self-supervised pretraining. Med Phys 2025;52:1643-60. [Crossref] [PubMed]
Song JW, Moon BF, Burke MP, Kamesh Iyer S, Elliott MA, Shou H, Messé SR, Kasner SE, Loevner LA, Schnall MD, Kirsch JE, Witschey WR, Fan Z. MR Intracranial Vessel Wall Imaging: A Systematic Review. J Neuroimaging 2020;30:428-42. [Crossref] [PubMed]
Harteveld AA, Denswil NP, Siero JC, Zwanenburg JJ, Vink A, Pouran B, Spliet WG, Klomp DW, Luijten PR, Daemen MJ, Hendrikse J, van der Kolk AG. Quantitative Intracranial Atherosclerotic Plaque Characterization at 7T MRI: An Ex Vivo Study with Histologic Validation. AJNR Am J Neuroradiol 2016;37:802-10. [Crossref] [PubMed]
Fernández-Alvarez V, Linares-Sánchez M, Suárez C, López F, Guntinas-Lichius O, Mäkitie AA, Bradley PJ, Ferlito A. Novel Imaging-Based Biomarkers for Identifying Carotid Plaque Vulnerability. Biomolecules 2023;13:1236. [Crossref] [PubMed]

Cite this article as: Yang R, Sun H, Lin X, Li H, Chen H. Spatial-temporal and physical constrained deep learning model for simultaneous T1 and T2 reconstruction and mapping (STEP). Quant Imaging Med Surg 2026;16(7):553. doi: 10.21037/qims-2025-1563
