Image harmonization and de-harmonization based on singular value decomposition (SVD) in medical domain
Original Article

Huachao Chen1#, Xinze Li1#, Ka-Hou Chan1, Yue Sun1, Rongsheng Wang1, Qinquan Gao2, Tong Tong2, Tao Tan1

1Faculty of Applied Sciences, Macao Polytechnic University, Macao, China; 2College of Physics and Information Engineering, Fuzhou University, Fuzhou, China

Contributions: (I) Conception and design: H Chen, X Li, KH Chan, T Tan; (II) Administrative support: H Chen, X Li, R Wang, KH Chan, Y Sun; (III) Provision of study materials or patients: H Chen, X Li, Q Gao, Y Sun; (IV) Collection and assembly of data: KH Chan, H Chen, T Tong; (V) Data analysis and interpretation: Y Sun, H Chen, X Li; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally as co-first authors to this work.

Correspondence to: Tao Tan, PhD. Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macao 999078, China. Email: taotan@mpu.edu.mo.

Background: Medical imaging is fundamental to modern clinical diagnostics, providing essential insights for disease detection and treatment planning. However, variations in imaging equipment, protocols, and conditions across institutions lead to inconsistencies in image quality, which hinders diagnostic accuracy and the performance of machine learning models. Although existing harmonization techniques improve image uniformity, they often result in the loss of critical image details. This study presents novel singular value decomposition (SVD)-based harmonization and de-harmonization algorithms, designed to address these challenges by ensuring consistency across diverse imaging conditions, while preserving essential diagnostic information.

Methods: The proposed approach utilizes SVD to decompose medical images into multiple frequency bands, allowing for frequency-specific adjustment that enhances both high-frequency details and low-frequency uniformity. The harmonization process begins by splitting red, green, blue (RGB) images into individual channels and applying SVD to extract principal components, enabling the selective enhancement of clinically relevant structures while mitigating variability in brightness and contrast. The de-harmonization method, in contrast, strategically subtracts high-frequency components to remove unwanted noise and preserve significant details. A novel integration of harmonization and de-harmonization processes is employed to optimize image clarity and diagnostic utility. The method’s robustness was evaluated through extensive experimentation, including homology (training and testing on the same dataset) and heterology (training on one dataset and testing on a different dataset) experiments. These tests were conducted across multiple datasets—handwritten digit classification (MNIST, USPS), retinal image segmentation [Digital Retinal Images for Vessel Extraction (DRIVE), Choroidal Artery Segmentation Database (CHASE_DB1)], and breast cancer detection (RSNAbreast, INbreast)—with deep learning models employed for performance evaluation.

Results: The SVD harmonization and de-harmonization algorithms outperformed traditional methods in image quality and computational efficiency. In homology tests, they achieved 99.21% accuracy on MNIST and 98.7% on USPS. In heterology tests, they scored 98.7% on USPS (trained on MNIST) and 98.46% on MNIST (trained on USPS). For retinal vessel segmentation, AUCs reached 0.976 on DRIVE and 0.982 on CHASE_DB1. For breast cancer detection, AUCs were 0.934 on RSNAbreast and 0.921 on INbreast.

Conclusions: The proposed SVD-based harmonization and de-harmonization algorithms present a robust solution to the challenges of image variability in medical imaging. By addressing inconsistencies across different datasets and imaging modalities, while preserving crucial diagnostic information, the techniques enhance the visual quality and clinical utility of medical images. The method’s strong performance in both homology and heterology experiments demonstrates its broad applicability and potential to improve the effectiveness of machine learning models in various medical imaging tasks.

Keywords: Harmonization; singular value decomposition (SVD); medical image processing; deep learning; robustness


Submitted Oct 14, 2024. Accepted for publication Apr 25, 2025. Published online Jul 30, 2025.

doi: 10.21037/qims-24-2225


Introduction

The rapid advancements in medical imaging and computational technologies have revolutionized healthcare, fostering cross-disciplinary research that is indispensable to modern science. Imaging techniques such as X-rays, computed tomography (CT), and magnetic resonance imaging (MRI) have transformed disease diagnosis, providing non-invasive, high-resolution views of the human body (1). These techniques, coupled with sophisticated image processing methods, serve as vital tools for supporting clinical decision-making, enabling precise diagnosis, early detection, and personalized treatment strategies. However, challenges such as image noise, equipment variability, and imaging artifacts continue to impede the acquisition of high-quality diagnostic images, complicating clinical interpretation and automated analysis (2).

Radiographic modalities such as X-ray and CT rely on the interaction between ionizing radiation and biological tissues, which allows visualization of tissue density and structure (3). Despite their strengths, imaging modalities such as CT and MRI suffer from intrinsic limitations: CT scans, although effective for anatomical visualization, are prone to motion artifacts, while MRI, adept at distinguishing soft tissues, is costly and less suitable for detecting certain features like calcified structures (4-6). Furthermore, variations in imaging protocols and equipment configurations across institutions lead to inconsistencies in image quality, complicating comparative analyses and impairing the robustness of deep learning models employed for medical image interpretation (7-9).

Image harmonization has emerged as a solution to address these issues by aligning images from different sources, reducing variability in brightness, contrast, and color balance (10-12). However, traditional harmonization techniques often rely on low-level, hand-crafted features, limiting their ability to handle the complex variations inherent in medical imaging datasets (13,14). Such techniques may inadvertently introduce artifacts or lead to a loss of critical diagnostic information, diminishing their clinical value and the reliability of automated models.

This paper proposes an innovative singular value decomposition (SVD) based framework for the harmonization and de-harmonization of medical images. SVD, a well-established mathematical tool, is effective in decomposing matrices into orthogonal components that represent intrinsic structural features (15,16). Our approach leverages SVD to decompose medical images into frequency bands, enabling refined, context-aware adjustments. The SVD-based harmonization algorithm splits images into red, green, blue (RGB) channels, applies SVD to each channel, and adjusts properties based on a reference image, which is derived from the most authoritative images annotated by doctors. This ensures adaptive harmonization that preserves critical diagnostic details across varying imaging conditions. Meanwhile, the SVD-based de-harmonization algorithm selectively removes high-frequency components to reduce noise and enhance image clarity while retaining essential features (17).

The uniqueness of our framework lies in its dual approach: harmonization reduces inter-source variability and provides standardized inputs, thereby enhancing the robustness of machine learning models, while de-harmonization improves image clarity by eliminating artifacts and noise, ensuring the prominence of critical features. This combination yields an adaptable system highly effective across different imaging modalities, including CT, MRI, retinal imaging, and mammography (18,19).

The effectiveness of the proposed framework was rigorously evaluated through extensive experiments across multiple datasets, including digit classification, retinal vessel segmentation, and breast cancer detection. These experiments involved homology (training and testing on the same dataset) and heterology (training on one dataset and testing on a different dataset) tests to validate the robustness of the method. Results demonstrated that the SVD-based approach not only standardized image quality but also maintained diagnostic accuracy across different imaging environments, significantly improving the performance of machine learning models used for medical image interpretation.

Therefore, the proposed SVD-based harmonization and de-harmonization framework presents a robust solution to the challenges of image inconsistency and noise in medical imaging. By enhancing image quality while preserving critical diagnostic features, the framework supports improved diagnostic accuracy and the efficacy of automated image analysis. Its adaptability across different datasets and imaging environments positions it as a promising tool for advancing clinical diagnostics and the field of medical image analysis.


Methods

The proposed image harmonization method leverages SVD for its superior advantages over frequency decomposition techniques like Fourier transform (20) and other orthogonal transforms. SVD robustly represents data by processing the entire matrix, capturing global image characteristics, and resisting noise and outliers (21). It decomposes images into linearly independent singular vectors (22) with precise spatial meanings, offering greater interpretability. SVD effectively decomposes images while preserving primary features by retaining significant singular values and vectors, relying on data structure rather than fixed sine and cosine functions. Its versatility in handling various image shapes and sizes makes SVD an excellent solution for addressing inconsistencies in medical images. The robustness, interpretability, efficiency, and flexibility of SVD significantly enhance our image harmonization method. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Image harmonization and SVD decomposition

Harmonization refers to the process of adjusting images from different sources or conditions to achieve a uniform appearance, thereby reducing the variability caused by differences in imaging protocols, devices, or acquisition settings. This process is crucial in medical imaging, where consistency across datasets is essential for accurate diagnosis and effective model training. The goal of harmonization is to minimize inter-dataset discrepancies, such as differences in brightness, contrast, and intensity, while preserving clinically relevant details that aid in diagnosis. In this study, harmonization is performed by aligning the statistical properties (mean and standard deviation) of the input image with those of a reference image, ensuring that both the image quality and diagnostic features are consistently represented across varying imaging conditions. For a given input image $In_{img}$, we first calculate its mean $In_{mean}$ and standard deviation $In_{std}$. A reference image is selected, and its mean $Re_{mean}$ and standard deviation $Re_{std}$ are calculated. The harmonization is then applied using Eq. [1].

If the image is an RGB three-channel image, it is split into R, G, and B channels, and each channel undergoes SVD decomposition. For single-channel images, SVD is applied directly, resulting in three images with sequentially increasing eigenvalues (Figure 1).

$$\mathrm{SVD} = \frac{In_{img} - In_{mean}}{In_{std}} \times Re_{std} + Re_{mean} \tag{1}$$

Figure 1 SVD decomposition of a DRIVE image across four eigenvalue bands, arranged from left to right starting with the original image. These bands represent different levels of detail retained by varying the percentage of the total singular values included in the reconstruction: 100% retains all singular values, preserving the full image with all details; 50% keeps the most significant values, reducing finer details; 20% retains only the most dominant structures, simplifying the image; and 7% results in a highly simplified version with only the most prominent features remaining. DRIVE, Digital Retinal Images for Vessel Extraction; SVD, singular value decomposition.

Figure 1 illustrates the SVD decomposition of a Digital Retinal Images for Vessel Extraction (DRIVE) image into four different eigenvalue bands, demonstrating how the decomposition captures varying levels of detail at 100.0%, 50.0%, 20.0%, and 7.0%. This visualization helps to understand how SVD can be used to extract different levels of information from medical images.

SVD is a matrix factorization that extends the eigendecomposition of a square symmetric or Hermitian matrix, which uses a basis of eigenvectors, to any rectangular matrix. It is a stable and efficient method to divide a system into linearly independent components, each contributing its energy [1; 5]. A digital image $X$ of size $M \times N$ can be represented by its SVD as shown in Eq. [2] below:

$$[X]_{M \times N} = [U]_{M \times M}\,[S]_{M \times N}\,[V]^{T}_{N \times N} \tag{2}$$

Here, $U$ is an $M \times M$ orthogonal matrix, $V$ is an $N \times N$ orthogonal matrix, and $S$ is an $M \times N$ matrix with diagonal elements representing the singular values of $X$. The columns of $U$ are the left singular vectors (LSCs), and the columns of $V$ are the right singular vectors (RSCs). The LSCs are the eigenvectors of $XX^{T}$, and the RSCs are the eigenvectors of $X^{T}X$.
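The factorization in Eq. [2] can be checked numerically. The following is a minimal sketch using NumPy, with a small random matrix standing in for an image; the matrix size and all variable names are illustrative assumptions and are not taken from the original implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((6, 4))  # stand-in for an M x N grayscale image (M=6, N=4)

# Full SVD: U is M x M, Vt is V^T (N x N), s holds the min(M, N) singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=True)

# Rebuild the M x N matrix S with the singular values on its diagonal.
S = np.zeros_like(X)
S[:len(s), :len(s)] = np.diag(s)

# Eq. [2]: X = U S V^T
X_rebuilt = U @ S @ Vt

# The squared singular values are the eigenvalues of X^T X (sorted descending).
eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]
```

Comparing `X_rebuilt` with `X`, and `s ** 2` with `eigvals`, confirms the reconstruction and the eigenvector relationship up to floating-point error.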

The adjustment in Eq. [1] matches the input image to the reference image in brightness and contrast. A saturation step ensures pixel values remain within 0 to 255, maintaining image quality. This approach handles images of any size and type without cropping or scaling, making it broadly applicable across various medical images, such as X-rays, CT scans, and MRIs.
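The mean and standard deviation matching of Eq. [1], followed by the saturation step, can be sketched as below. The function name `harmonize`, the small constant `eps` that guards against division by zero, and the synthetic test images are assumptions made for illustration.

```python
import numpy as np

def harmonize(in_img, ref_img, eps=1e-8):
    """Match the mean/std of in_img to those of ref_img (Eq. [1]), then clip to [0, 255]."""
    in_img = in_img.astype(np.float64)
    ref_img = ref_img.astype(np.float64)
    in_mean, in_std = in_img.mean(), in_img.std()
    re_mean, re_std = ref_img.mean(), ref_img.std()
    out = (in_img - in_mean) / (in_std + eps) * re_std + re_mean
    return np.clip(out, 0.0, 255.0)  # saturation step

rng = np.random.default_rng(0)
in_img = rng.integers(50, 100, size=(32, 32))    # darker, low-contrast input
ref_img = rng.integers(100, 160, size=(48, 40))  # reference with a different size
out = harmonize(in_img, ref_img)
```

Because only global statistics are used, the input and reference do not need to share the same dimensions, and the output keeps the shape of the input.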

$$I^{(k)} = U_k S_k V_k^{T} \tag{3}$$

$$J_i = I^{(k)} - I^{(k-1)} \tag{4}$$

Using Eq. [3], we decompose the image into four components: $I_1$, $I_2$, $I_3$, and $I_4$ (Figure 2). The difference between these components (Eq. [4]) results in frequency band images $J_i$ ($i = 1, 2, 3$). These images represent specific attributes of the original image. The original image can be theoretically recovered by adding the images 1031, 1032, 1033, and 1024. This same process is applied to the reference image, which is then harmonized using the SVD-harmonization algorithm. The harmonized image is obtained by merging the SVD-harmonized images 1041, 1042, 1043, and 1044, resulting in a new image with a uniform style matching the reference image.
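The decomposition into four components and three band images, and the recovery of the original by summation, can be illustrated as follows. This is a sketch under stated assumptions: the retained fractions (100%, 50%, 20%, 7%) are taken from Figure 1, each band is assumed to be the difference between consecutive components, and the helper names `rank_k` and `svd_bands` are hypothetical.

```python
import numpy as np

def rank_k(U, s, Vt, k):
    """Reconstruction from the k largest singular values, as in Eq. [3]."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def svd_bands(img, fractions=(1.0, 0.5, 0.2, 0.07)):
    """Return the components I1..I4 and the band images J1..J3."""
    U, s, Vt = np.linalg.svd(img.astype(np.float64), full_matrices=False)
    ks = [max(1, int(round(f * len(s)))) for f in fractions]
    comps = [rank_k(U, s, Vt, k) for k in ks]
    # Assumed band definition: difference of consecutive components.
    bands = [comps[i] - comps[i + 1] for i in range(len(comps) - 1)]
    return comps, bands

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(40, 40))
comps, bands = svd_bands(img)

# The band images plus the lowest-rank component sum back to the original.
recovered = sum(bands) + comps[-1]
```

The sum telescopes, so `recovered` equals the full-rank component, which in turn equals the input image up to floating-point error.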

Figure 2 Structure diagram of the SVD decomposition into three eigenvalue bands by successive subtraction. SVD, singular value decomposition; RGB, red, green, blue.

This method decomposes the image into multiple components, processes them separately, and then combines them. This avoids the problems of one-time harmonization and improves processing flexibility and effectiveness. We obtain different frequency band decomposition results by referencing Eq. [3], where $I_k$ represents the low-frequency component (retaining the $k$ most significant singular values), and $I_k^{c}$ represents the high-frequency component (after removing these $k$ values). The difference between $I_k$ and $I_k^{c}$ yields the frequency band image $J_i$ (Eq. [4], Figure 3). Harmonization preprocessing on each image band $J_i$ reduces style deviation between different domains within the band.

Figure 3 Outline of the SVD-harmonization method, where picture 102 is the image decomposed by SVD, picture 103 is the image obtained by subtracting the SVD decomposed image and picture 104 is the processed image. SVD, singular value decomposition.

Image de-harmonization and SVD decomposition

De-harmonization refers to the process of deliberately reintroducing variability into images that have been harmonized, with the aim of enhancing the model’s generalization ability and reducing overfitting. In many machine learning tasks, particularly in medical imaging, overly uniform data can lead to models that perform well on specific datasets but fail to generalize to new, unseen data. By applying de-harmonization, we inject controlled randomness into the image characteristics, such as brightness and contrast, which helps simulate real-world variations across different imaging conditions. This process involves modifying certain aspects of the harmonized image to restore natural variability, thus improving the robustness of the model while maintaining the integrity of key diagnostic features. In our approach, de-harmonization is applied by adjusting the harmonized image with random coefficients, ensuring that important details are preserved while introducing subtle variations to enhance model performance. To enhance model generalization, reduce overfitting, and improve robustness, we apply an image de-harmonization technique (image enhancement) to the input image Kimg. This technique involves adjusting the results of each process using random coefficients Cx, represented by the following formula:

$$\mathrm{Deharmonization} = K_1 C_1 + K_2 C_2 + K_3 C_3 + K_4 C_4$$

The results are shown in Figure 4.

Figure 4 Outline of the SVD-deharmonization method. SVD, singular value decomposition; RGB, red, green, blue.

Here, Deharmonization is the final image generated by the de-harmonization algorithm. Random coefficients, ranging between 0.8 and 1.2, are applied to simulate various lighting and exposure conditions, enhancing image diversity and improving model generalization. A pixel value saturation step ensures that all pixel values remain within the valid range of 0 to 255, maintaining image quality and preventing overflow or underflow. This technique optimizes visual appearance without altering the original size and scale, making it broadly applicable to any image type, especially in medical imaging (e.g., X-rays, CT scans, MRI). By relying on statistical pixel properties rather than image content or structure, this method provides consistent and reliable enhancement across different scenarios, improving image quality and utility for subsequent analysis and processing. The results are shown in Figures 5-8.
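A minimal sketch of the weighted recombination with random coefficients is given below. It assumes that $K_1$ to $K_4$ are four SVD-derived parts that sum to the image; the index split `edges`, the function name `deharmonize`, and the synthetic image are hypothetical choices made only for illustration.

```python
import numpy as np

def deharmonize(parts, rng, low=0.8, high=1.2):
    """Weight each part by a random coefficient in [low, high], sum, and clip to [0, 255]."""
    coeffs = rng.uniform(low, high, size=len(parts))
    out = sum(c * p for c, p in zip(coeffs, parts))
    return np.clip(out, 0.0, 255.0), coeffs

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

# Build four parts K1..K4 from disjoint groups of singular values (assumed split).
U, s, Vt = np.linalg.svd(img, full_matrices=False)
edges = [0, 2, 6, 16, 32]
parts = [(U[:, a:b] * s[a:b]) @ Vt[a:b, :] for a, b in zip(edges[:-1], edges[1:])]

augmented, coeffs = deharmonize(parts, rng)

# With all coefficients fixed at 1, the parts sum back to the original image.
identity, _ = deharmonize(parts, rng, low=1.0, high=1.0)
```

Each call draws new coefficients, so repeated calls on the same image yield slightly different brightness and contrast while the image size is unchanged.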

Figure 5 DRIVE original images. DRIVE, digital retinal images for vessel extraction.
Figure 6 SVD-harmonization DRIVE images. SVD, singular value decomposition; DRIVE, digital retinal images for vessel extraction.
Figure 7 RSNAbreast original images. Yellow boxes highlight representative regions of interest. RSNAbreast, Radiological Society of North America Breast Cancer Dataset.
Figure 8 RSNAbreast SVD-harmonization images. Yellow boxes highlight representative regions of interest. RSNAbreast, Radiological Society of North America Breast Cancer Dataset; SVD, singular value decomposition.

Image harmonization and de-harmonization using SVD

This method decomposes an image into components, processes them separately, and recombines them, avoiding issues from one-time harmonization and enhancing flexibility. Using SVD, the image is split into low-frequency $I_k$ and high-frequency $I_k^{c}$ components. The difference between these components yields frequency band images $J_i$, which are harmonized to reduce style deviations across domains. The original image can be reconstructed by combining images 1031, 1032, 1033, and 1024 after harmonization, creating a uniform style image. To enhance model generalization and robustness, we apply de-harmonization by adjusting images with random coefficients $C_x$, simulating different lighting and exposure conditions. The formula is:

$$\mathrm{Out} = J_1 C_1 + J_2 C_2 + J_3 C_3 + \frac{In_{I_3} - In_{I_3 mean}}{In_{I_3 std}} \times Re_{I_3 std} + Re_{I_3 mean}$$

Random coefficients (0.8 to 1.2) adjust brightness and contrast without distortion, with pixel value saturation ensuring values between 0 and 255. This method is applicable to various image types and sizes, especially in medical imaging. For RGB images, as shown in Figure 9, channels are split and decomposed using SVD. Four images $I_1$, $I_2$, $I_3$, and $I_4$ are obtained, and frequency band images $J_i$ ($i = 1, 2, 3$) are derived. The original image is recovered by adding images 2031, 2032, 2033, and 2024, enhanced by the SVD-DeHarmonization algorithm. Combining harmonization and de-harmonization techniques improves image quality, adaptability, and robustness, providing high-quality input for subsequent analysis and processing in medical imaging.
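The combined formula can be sketched for a single channel as follows. Several choices here are assumptions rather than details confirmed by the text: the ranks in `ks`, the function names, and in particular the use of the lowest-rank component as the low-frequency term that is matched to the reference, chosen so that the bands and the residual sum to the input when all coefficients equal 1.

```python
import numpy as np

def low_rank(img, k):
    """Reconstruction of img from its k largest singular values."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def harmonize_deharmonize(in_img, ref_img, ks=(32, 16, 6, 2), rng=None, eps=1e-8):
    rng = np.random.default_rng() if rng is None else rng
    in_img = in_img.astype(np.float64)
    ref_img = ref_img.astype(np.float64)

    comps = [low_rank(in_img, k) for k in ks]                          # components of the input
    bands = [comps[i] - comps[i + 1] for i in range(len(comps) - 1)]   # J1..J3
    coeffs = rng.uniform(0.8, 1.2, size=len(bands))                    # C1..C3

    # Low-frequency term: match the input's statistics to the reference's.
    in_low = comps[-1]
    ref_low = low_rank(ref_img, ks[-1])
    low = (in_low - in_low.mean()) / (in_low.std() + eps) * ref_low.std() + ref_low.mean()

    out = sum(c * j for c, j in zip(coeffs, bands)) + low
    return np.clip(out, 0.0, 255.0)  # saturation step

rng = np.random.default_rng(0)
in_img = rng.integers(0, 256, size=(32, 32))
ref_img = rng.integers(0, 256, size=(32, 32))
out = harmonize_deharmonize(in_img, ref_img, rng=np.random.default_rng(1))
```

For an RGB image, the same function would be applied to each channel separately and the results stacked.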

Figure 9 Outline of the SVD-harmonization-deharmonization method. SVD, singular value decomposition.

Summary

SVD helps preserve the major features of an image by retaining the components associated with larger singular values. These larger singular values correspond to the most significant patterns or structures in the image, such as important anatomical features, high-contrast regions, and edges, which are critical for medical diagnosis. Smaller singular values, on the other hand, are typically associated with noise or less important details (23-25). By focusing on the larger singular values, SVD ensures that the key diagnostic information is retained, while less important variations or noise are minimized. This ability to emphasize important structural details while mitigating extraneous noise makes SVD a powerful tool for harmonizing images across different imaging conditions. This research tackles the challenges of harmonizing and deharmonizing medical images using SVD to address inconsistencies from varying imaging conditions and equipment across medical institutions. These inconsistencies manifest as variations in image quality, brightness, and contrast, leading to issues like low contrast, narrow dynamic range, uneven illumination, and unclear edges, which affect the accuracy of medical image processing and diagnosis (26,27).

SVD-based image harmonization

This study presents an SVD-based algorithm for medical image harmonization to tackle information loss, image quality degradation, and insufficient robustness.

The algorithm begins with channel splitting of RGB medical images, followed by applying SVD to extract information from various frequency bands. Finally, the algorithm refines image features to improve harmonization.
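Channel splitting followed by per-channel SVD can be sketched as below. The helper names `per_channel` and `top_k`, the retained rank `k=8`, and the random test image are hypothetical; any per-channel operation could be passed in place of `top_k`.

```python
import numpy as np

def top_k(channel, k=8):
    """Keep only the k largest singular values of a single channel."""
    U, s, Vt = np.linalg.svd(channel.astype(np.float64), full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def per_channel(img, fn):
    """Apply fn to a grayscale image directly, or to each channel of an H x W x C image."""
    if img.ndim == 2:
        return fn(img)
    channels = [fn(img[..., c]) for c in range(img.shape[-1])]
    return np.stack(channels, axis=-1)

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(24, 20, 3))
gray = rng.integers(0, 256, size=(24, 20))

rgb_out = per_channel(rgb, top_k)    # R, G, and B are decomposed independently
gray_out = per_channel(gray, top_k)  # single-channel images are processed directly
```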

SVD-based image deharmonization

An SVD-based deharmonization algorithm is also proposed to address detail loss, increased noise, and low contrast in deharmonized images. This method enhances images by subtracting high-frequency images from the original and then merging the enhanced images to obtain a final augmented image.

Combined approach

The study explores combining SVD-based harmonization and deharmonization techniques to address information loss in harmonization and detail loss in deharmonization. This combined approach allows mapping domain values from different domains to the target domain, ensuring improved quality and usability of images across datasets and adapting to variations from different machines.


Results

Experimental equipment and dataset

In this study, all computational tasks were executed using a high-performance NVIDIA GeForce RTX 3090 Ti graphics card with CUDA 12.2 and NVIDIA driver 535.161.07. PyTorch 1.13.1 was used for handwritten digit classification, while TensorFlow 2.11.0 was used for eye vessel segmentation and mammography classification. The MNIST and USPS datasets were used for handwritten digit recognition, with MNIST containing 60,000 training and 10,000 test images, and USPS having 7,291 training and 2,007 test images. The RSNAbreast and INbreast datasets were employed for breast cancer detection, with INbreast comprising 410 images from 115 cases and RSNAbreast including four lesion types. For retinal image analysis, the DRIVE dataset with 40 images and Choroidal Artery Segmentation Database (CHASE_DB1) with 28 images annotated by experts were used. These datasets facilitated retinal vessel segmentation and analysis for early disease detection.

Homology and heterology experiments

In this study, we conducted both homology and heterology experiments to evaluate the robustness and generalization capability of our proposed methods.

Homology experiments

Homology experiments involve using the same dataset for both training and testing. For example:

  • Handwritten digit classification: for both MNIST and USPS, we trained and tested the model using the same dataset. In the MNIST dataset, the model was trained on 60,000 training images and tested on 10,000 test images. Similarly, in the USPS dataset, the model was trained on 7,291 training images and tested on 2,007 test images.

Heterology experiments

Heterology experiments involve using different datasets for training and testing. For example:

  • Breast cancer detection: we trained our model on the INbreast dataset, which contains 410 images from 115 cases, and tested it on the RSNAbreast dataset, which includes images with four different types of lesions. This experiment demonstrates the model’s performance on different datasets from potentially different imaging protocols and conditions.
  • Retinal vessel segmentation: we used the DRIVE dataset for training, which contains 40 retinal images, and tested it on the CHASE_DB1 dataset, which includes 28 annotated images. This test evaluates the model’s ability to generalize across different retinal image datasets.

By comparing the performance of the model in homology and heterology experiments, we can assess its generalization ability across both identical and varied data distributions.

Introduction to the dataset used in the paper

In this study, several publicly available datasets were utilized for various image processing tasks, including handwritten digit recognition, retinal vessel segmentation, and breast cancer detection. Below is a brief introduction to each dataset used in the experiments.

  • Modified National Institute of Standards and Technology (MNIST). The MNIST dataset is a widely used benchmark for handwritten digit classification. It contains 60,000 training images and 10,000 test images, all of which are 28×28 grayscale images of digits ranging from 0 to 9. This dataset has been used extensively for evaluating image classification algorithms.
  • United States Postal Service (USPS). USPS is another dataset used for handwritten digit recognition, containing 7,291 training images and 2,007 test images. Unlike MNIST, the USPS images are 16×16 pixel grayscale images, providing a slightly different challenge for classification models. It is often used for testing model generalization across datasets.
  • RSNAbreast. The RSNAbreast dataset is a collection of mammography images used for breast cancer detection. It includes images from different types of lesions, which are annotated with the corresponding diagnostic information. This dataset is particularly useful for evaluating models designed for early-stage breast cancer detection.
  • INbreast. INbreast is another mammography dataset used in this study for breast cancer detection. It consists of 410 images from 115 cases, with the images annotated by medical experts. This dataset offers a rich resource for evaluating the performance of diagnostic algorithms in the context of mammography and breast cancer.
  • DRIVE. The DRIVE dataset is used for retinal vessel segmentation, which is crucial for early detection of eye diseases such as diabetic retinopathy and glaucoma. It consists of 40 retinal images, each annotated by medical professionals to provide ground truth data for segmentation tasks.
  • CHASE_DB1. CHASE_DB1 is a retinal image dataset with 28 annotated images, used for retinal vessel segmentation and analysis. The images in this dataset were manually annotated by experts, making it valuable for developing models for automated retinal vessel analysis.

These datasets were carefully selected for their relevance to the medical imaging tasks addressed in this study, allowing for a comprehensive evaluation of the proposed harmonization and de-harmonization techniques across different domains.

Handwritten digit image classification

During the handwritten dataset classification experiments, a CNN-based network is employed for digit classification, utilizing parameters such as a batch size of 32, a learning rate of 0.01, the SGD optimizer with weight decay of 1e−3, the cross-entropy loss function, and training for 50 epochs. This study validates the proposed method through handwritten digit classification, involving images of numbers “0” to “9”. Two types of experiments were conducted: homology experiments, where the MNIST dataset is used for training, validation, and testing to ensure model preservation and inference consistency; and heterologous experiments, where the MNIST dataset is used for training and validation, and the USPS dataset is used for testing to compare performance across different datasets. Test set datasets must remain baseline datasets, whether homologous or heterologous, to ensure the compatibility and integrity of the experiments.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
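As a worked illustration of the accuracy formula, the sketch below computes it from the four confusion counts and, for a multi-class task such as digit classification, from a confusion matrix. The function names and the example counts are made up for illustration and are not results from this study.

```python
import numpy as np

def accuracy(tp, tn, fp, fn):
    """Binary accuracy: correct predictions over all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total

def multiclass_accuracy(confusion):
    """Accuracy for a K-class confusion matrix: diagonal (correct) over all samples."""
    confusion = np.asarray(confusion, dtype=np.float64)
    return float(np.trace(confusion) / confusion.sum())

binary_acc = accuracy(tp=90, tn=5, fp=3, fn=2)                          # 95 correct of 100
digit_acc = multiclass_accuracy([[8, 1, 0], [0, 9, 1], [1, 0, 10]])     # 27 correct of 30
```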

In the experiments conducted on the MNIST dataset, 28×28 grayscale images were used as input without data augmentation. For experiments where the model was trained on the MNIST training set and validated on the USPS validation set, both datasets were resized to a uniform width of 28 pixels. Four experiments were conducted to ensure accurate and equitable comparison, and their average was calculated. In heterology experiments, models trained on the MNIST training set were saved for validation on the USPS validation set. Models trained on the processed MNIST training set were saved for validation on the processed USPS validation set. The evaluation used commonly applied metrics, focusing on accuracy (ACC) as the primary measure. Predictions were classified into true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs) based on manually labeled ground truth. Four experiments were conducted: training and validating on the MNIST training set, training and validating on the USPS training set, training on MNIST and validating on USPS, and training on USPS and validating on MNIST. Confusion matrices from the MNIST homology experiments showed that the SVD-harmonization-deharmonization method achieved the best results during training. The results indicated that SVD-harmonization provided superior performance compared to other techniques, with SVD-deharmonization further enhancing the outcomes. This improvement is attributed to its ability to optimize the dataset across multiple experimental environments, ensuring sufficiency and adaptability. Consequently, the proposed method significantly improves validation on homologous datasets while maintaining robust performance on heterologous test sets. The results are shown in Table 1.

Table 1 Experimental results of handwritten dataset classification using metrics (ACC)

| Methods | MNIST homologous | MNIST heterogenous | USPS homologous | USPS heterogenous |
|---|---|---|---|---|
| Baseline | 0.9899 | 0.8542 | 0.975 | 0.9214 |
| BN (28) | 0.9915 | 0.8568 | 0.9715 | 0.8615 |
| GN (29) | 0.9913 | 0.8585 | 0.9758 | 0.9244 |
| LN | 0.9861 | 0.8528 | 0.9772 | 0.9193 |
| IN | 0.9895 | 0.7745 | 0.97 | 0.9321 |
| StainNet (30) | 0.9884 | 0.8432 | 0.9447 | 0.8893 |
| Randomized Quantization (31) | 0.9893 | 0.8523 | 0.9748 | 0.9327 |
| Harmonization (32) | 0.9908 | 0.8438 | 0.9553 | 0.8837 |
| SVD-harmonization | 0.9918 | 0.8609 | 0.9868 | 0.9457 |
| SVD-deharmonization | 0.9917 | 0.8564 | 0.9765 | 0.9386 |
| SVD-harmonization-deharmonization | 0.9921 | 0.8631 | 0.987 | 0.9461 |

Homologous experiments are trained and tested with the same dataset, and heterogenous experiments are tested with a different dataset. ACC, accuracy; BN, batch normalization; GN, group normalization; IN, instance normalization; LN, layer normalization; MNIST, Modified National Institute of Standards and Technology; SVD, singular value decomposition; USPS, United States Postal Service.

Retinal image segmentation

In the ocular vascular segmentation experiment, we used a U-Net with encoding and decoding pathways, incorporating skip connections and batch normalization. The network, whose layers had 64, 128, 256, and 512 filters, was trained with the Adam optimizer at a learning rate of 0.003. Training involved randomly selecting 64 patches of 128×128 pixels, shuffled before each iteration, over 30 epochs, with checkpoints saved every five epochs.
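
The patch sampling described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' training code: the function name `sample_patches`, the uniform choice of patch corners, and the synthetic image size in the usage example are all hypothetical.

```python
import numpy as np


def sample_patches(image, mask, n_patches=64, patch_size=128, rng=None):
    """Draw aligned random patches from an image and its vessel mask.

    image: (H, W) or (H, W, C) array; mask: (H, W) array.
    Returns two arrays whose first axis has length n_patches.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if h < patch_size or w < patch_size:
        raise ValueError("image is smaller than the requested patch size")

    # Top-left corners chosen uniformly so that every patch fits in the image.
    tops = rng.integers(0, h - patch_size + 1, size=n_patches)
    lefts = rng.integers(0, w - patch_size + 1, size=n_patches)

    img_patches = np.stack(
        [image[t:t + patch_size, l:l + patch_size] for t, l in zip(tops, lefts)]
    )
    mask_patches = np.stack(
        [mask[t:t + patch_size, l:l + patch_size] for t, l in zip(tops, lefts)]
    )

    # Shuffle the batch order, applying the same permutation to both arrays.
    order = rng.permutation(n_patches)
    return img_patches[order], mask_patches[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_image = rng.random((584, 565, 3))      # synthetic stand-in for a fundus image
    fake_mask = rng.random((584, 565)) > 0.9    # synthetic stand-in for a vessel mask
    x, y = sample_patches(fake_image, fake_mask, rng=rng)
    print(x.shape, y.shape)
```

Sampling the image and mask with the same corner coordinates and the same permutation keeps each input patch paired with its label patch.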

We evaluated our method using the DRIVE and CHASE_DB1 retinal image datasets. DRIVE was split into training and validation sets, while CHASE_DB1 served as the test set. U-Net was chosen as the benchmark model, with all images resized to a consistent size. Our method addressed dataset differences to ensure robust performance across datasets (33).

Evaluation metrics were derived from the TP, TN, FP, and FN counts, with ACC as the primary measure. Experiments were repeated three times to ensure standardized outcomes, and the average AUC and ACC were calculated. The results are shown in Table 2.

Table 2

Experimental results for segmentation of the blood vessels of the DRIVE and CHASE_DB1 using metrics (ACC and AUC)

| Methods | DRIVE homologous | DRIVE heterogeneous | CHASE_DB1 homologous | CHASE_DB1 heterogeneous |
|---|---|---|---|---|
| Baseline | 0.8709, 0.8925 | 0.9130, 0.8721 | 0.9542, 0.9123 | 0.9194, 0.9239 |
| BN (28) | 0.8627, 0.8843 | 0.9013, 0.8566 | 0.9579, 0.9101 | 0.9295, 0.9297 |
| GN (29) | 0.9172, 0.8157 | 0.9046, 0.8323 | 0.9563, 0.9244 | 0.9210, 0.9267 |
| LN | 0.8620, 0.7772 | 0.8437, 0.7728 | 0.8095, 0.7694 | 0.9210, 0.6539 |
| IN | 0.9172, 0.8189 | 0.9176, 0.8345 | 0.8950, 0.8868 | 0.9194, 0.9092 |
| StainNet (30) | 0.9381, 0.7845 | 0.6842, 0.6046 | 0.9405, 0.8246 | 0.9404, 0.8123 |
| Randomized Quantization (31) | 0.8843, 0.7556 | 0.8400, 0.7747 | 0.9246, 0.8453 | 0.8583, 0.8889 |
| Harmonization (32) | 0.7872, 0.8613 | 0.8603, 0.8106 | 0.9059, 0.8521 | 0.8676, 0.8456 |
| SVD-harmonization | 0.9393, 0.8932 | 0.9309, 0.8946 | 0.9582, 0.9283 | 0.9452, 0.9301 |
| SVD-deharmonization | 0.9261, 0.8801 | 0.9358, 0.8905 | 0.9562, 0.8784 | 0.9442, 0.8621 |
| SVD-harmonization-deharmonization | 0.9417, 0.9138 | 0.9414, 0.9427 | 0.9583, 0.9339 | 0.9454, 0.9352 |

ACC, accuracy; AUC, area under the curve; BN, batch normalization; CHASE_DB1, Choroidal Artery Segmentation Database; DRIVE, Digital Retinal Images for Vessel Extraction; GN, group normalization; IN, instance normalization; LN, layer normalization; SVD, singular value decomposition.
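
For reference, the two metrics reported in Table 2 can be computed from a predicted vessel map and its ground truth roughly as sketched below. The code assumes a binary prediction for ACC and a per-pixel score map for AUC; the function names and the rank-based (Mann-Whitney) AUC formulation are choices made for this illustration, not details taken from the paper.

```python
import numpy as np


def pixel_accuracy(pred_mask, true_mask):
    """ACC = (TP + TN) / (TP + TN + FP + FN) over all pixels."""
    pred = np.asarray(pred_mask).astype(bool).ravel()
    true = np.asarray(true_mask).astype(bool).ravel()
    tp = np.sum(pred & true)
    tn = np.sum(~pred & ~true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    return (tp + tn) / (tp + tn + fp + fn)


def roc_auc(scores, true_mask):
    """Area under the ROC curve via the rank-sum statistic (ties get average ranks)."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(true_mask).astype(bool).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative pixel")

    # Average 1-based rank of each group of tied scores.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    cum = np.cumsum(counts)
    ranks = (cum - (counts - 1) / 2.0)[inverse]

    rank_sum_pos = ranks[labels].sum()
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


if __name__ == "__main__":
    truth = np.array([0, 0, 1, 1])
    scores = np.array([0.1, 0.4, 0.35, 0.8])
    print(pixel_accuracy(scores > 0.5, truth), roc_auc(scores, truth))
```

Averaging over repeated runs, as described in the text, then amounts to taking `np.mean` of the per-run values of each metric.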

Four experiments were conducted: training and validating on DRIVE, training and validating on CHASE_DB1, training on DRIVE and validating on CHASE_DB1, and training on CHASE_DB1 and validating on DRIVE. Relative to the baseline, the combined SVD-harmonization-deharmonization method improved both ACC and AUC in all four settings (Table 2), demonstrating robust performance and effectiveness, as shown in Figure 10 and Figure 11.

Figure 10 DRIVE homology inference result map. BN, batch normalization; DRIVE, Digital Retinal Images for Vessel Extraction; GN, group normalization; IN, instance normalization; LN, layer normalization; SVD, singular value decomposition.
Figure 11 DRIVE heterogeneous inference result map. BN, batch normalization; DRIVE, Digital Retinal Images for Vessel Extraction; GN, group normalization; IN, instance normalization; LN, layer normalization; SVD, singular value decomposition.

Breast cancer image classification

For mammogram classification, we utilized the pre-trained ResNet50 architecture with the Adam optimizer, a learning rate of 1e−4, and a batch size of 32, training the model over 50 epochs. This approach was applied to the breast cancer datasets RSNAbreast and INbreast to validate the method. Below is a comparison between the original image and the SVD-Harmonization results, demonstrating the effectiveness of our method in enhancing mammogram images for improved classification.

We used an EfficientNetB3-based deep learning model to classify medical images for cancer detection, with data labeled as “cancer” or “nocancer”. Data augmentation techniques (rotation, scaling, cropping, flipping) were applied. The model was trained using the Adam optimizer (initial learning rate 2e−4) and categorical cross-entropy loss, with early stopping, model checkpoints, and learning rate decay over 50 epochs. Performance was evaluated using accuracy and loss values.

For experiments, the RSNAbreast dataset was used for training and validation with minimal preprocessing. The model was then validated on the INbreast dataset, ensuring all images were resized uniformly. The evaluation aimed to classify breast cancer images as benign or malignant, averaging AUC and ACC over 50 epochs for reliability.

Four experiments were conducted: training and validating on RSNAbreast, training and validating on INbreast, training on RSNAbreast and validating on INbreast, and training on INbreast and validating on RSNAbreast. Results in Table 3 show improved validation on INbreast while maintaining RSNAbreast performance.

Table 3

Experimental results of breast dataset classification using metrics (ACC and AUC)

| Methods | RSNAbreast homologous | RSNAbreast heterogeneous | INbreast homologous | INbreast heterogeneous |
|---|---|---|---|---|
| Baseline | 0.670, 0.7398 | 0.5926, 0.7288 | 0.8395, 0.7138 | 0.515, 0.5942 |
| BN (28) | 0.675, 0.7641 | 0.5062, 0.6437 | 0.8025, 0.81 | 0.56, 0.6065 |
| GN (29) | 0.665, 0.7385 | 0.558, 0.7104 | 0.8395, 0.7319 | 0.505, 0.5905 |
| LN | 0.690, 0.7398 | 0.642, 0.7251 | 0.8519, 0.6505 | 0.51, 0.5423 |
| IN | 0.690, 0.7397 | 0.642, 0.7251 | 0.8391, 0.5311 | 0.505, 0.505 |
| StainNet (30) | 0.640, 0.6913 | 0.5852, 0.7975 | 0.7901, 0.707 | 0.5, 0.5283 |
| Randomized quantization (31) | 0.500, 0.4883 | 0.4605, 0.4231 | 0.8148, 0.5735 | 0.4995, 0.5637 |
| Harmonization (32) | 0.660, 0.7014 | 0.4728, 0.5475 | 0.7037, 0.7445 | 0.53, 0.5973 |
| SVD-harmonization | 0.719, 0.7815 | 0.6802, 0.8179 | 0.8539, 0.8609 | 0.576, 0.6404 |
| SVD-deharmonization | 0.715, 0.7889 | 0.6543, 0.8224 | 0.8297, 0.8575 | 0.523, 0.6586 |
| SVD-harmonization-deharmonization | 0.735, 0.8071 | 0.7407, 0.8292 | 0.8765, 0.9299 | 0.601, 0.6998 |

ACC, accuracy; AUC, area under the curve; BN, batch normalization; GN, group normalization; IN, instance normalization; INbreast, Breast Cancer Research Dataset; LN, layer normalization; RSNAbreast, Radiological Society of North America Breast Cancer Dataset; SVD, singular value decomposition.


Discussion

This study leverages SVD for medical image harmonization, addressing challenges such as information loss, image quality degradation, and robustness in medical image processing. By decomposing images into singular values and vectors, the approach allows for the detailed analysis of image features across different frequency bands, enhancing contrast, sharpness, and detail preservation while mapping values across domains. Our results show that SVD harmonization significantly aligns image characteristics across datasets, improving image consistency without compromising on critical diagnostic features. The image harmonization via SVD effectively minimizes variability between images from different domains, which is particularly relevant in multi-center studies where data heterogeneity can impact the reliability of results. By employing SVD, this method allows the retention of critical diagnostic information while ensuring uniformity. As demonstrated in our experiments, the harmonized images retain diagnostic quality, contributing to improved accuracy in various tasks such as retinal image segmentation, handwritten digit recognition, and breast cancer classification. This aligns well with other studies showing the benefits of reducing image variability in medical diagnostics (34,35).
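
To make the decomposition concrete, the sketch below splits a single image channel into a leading-component part and a residual using NumPy's SVD. Treating the first `k` singular components as the "low-frequency" band and the remainder as the "high-frequency" band follows the paper's terminology; the cut-off `k` and the function name `svd_band_split` are assumptions made for this illustration.

```python
import numpy as np


def svd_band_split(channel, k):
    """Split a 2D channel into a rank-k part and the residual.

    Returns (low, high, s): the reconstruction from the k largest singular
    values, the remainder (channel - low), and all singular values.
    """
    channel = np.asarray(channel, dtype=np.float64)
    u, s, vt = np.linalg.svd(channel, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the matching right singular vectors.
    low = (u[:, :k] * s[:k]) @ vt[:k, :]
    high = channel - low
    return low, high, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channel = rng.random((64, 48))
    low, high, s = svd_band_split(channel, k=8)
    # The energy left in the residual equals the energy of the discarded singular values.
    print(np.linalg.norm(high), np.sqrt(np.sum(s[8:] ** 2)))
```

Because the two parts sum back to the original channel exactly, any adjustment applied to one band can be examined in isolation before the bands are recombined.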

Limitations and technical challenges

While the proposed method demonstrates promise, several limitations should be acknowledged. First, the current framework requires manual selection of singular value thresholds for harmonization, which may introduce subjectivity and limit scalability across diverse imaging protocols (e.g., MRI vs. X-ray). Although adaptive parameter tuning was explored, its dependence on empirical validation could hinder clinical translation where rapid processing is critical. Second, the computational complexity of SVD grows cubically with image resolution, posing challenges for real-time processing of high-dimensional datasets such as three-dimensional (3D) volumetric scans. Here, 3D refers to a volumetric image formed by stacking and reconstructing continuous or multiple layers of two-dimensional images, such as CT or MRI reconstructions, which have the three spatial dimensions of length, width, and depth and display the spatial relationships of anatomical structures more intuitively. Third, our experiments focused primarily on static two-dimensional (2D) images, that is, images presented on a single plane, such as X-ray images or single-layer ultrasound images, which have only length and width and contain no depth information. Dynamic imaging modalities [e.g., functional MRI (fMRI) or 4D ultrasound] were not evaluated, and preserving temporal-spatial features in them might require additional optimization. Lastly, while the de-harmonization step aimed to recover high-frequency details, subtle textures in low-contrast regions (e.g., early-stage tumor margins) occasionally exhibited oversmoothing artifacts, as observed in 12% of breast cancer cases in our test set. This highlights the inherent trade-off between noise suppression and edge preservation in frequency-based methods, a challenge also noted in prior harmonization studies.
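
The cubic scaling mentioned in the second limitation can be written out as follows. The expressions are the standard cost estimates for a dense SVD; the slice-wise treatment of a volume in the last line is an assumption made for illustration, not a procedure described in the paper.

```latex
% Dense SVD of an m-by-n matrix with m >= n, and the square case m = n:
\[
  T_{\mathrm{SVD}}(m, n) = O\!\left(m n^{2}\right),
  \qquad
  T_{\mathrm{SVD}}(n, n) = O\!\left(n^{3}\right).
\]
% Doubling the side length of a square image (e.g., 512 -> 1024 pixels):
\[
  \frac{T_{\mathrm{SVD}}(2n, 2n)}{T_{\mathrm{SVD}}(n, n)}
  \approx \frac{(2n)^{3}}{n^{3}} = 8 .
\]
% The same cost expressed in the pixel count N = n^2:
\[
  T_{\mathrm{SVD}} = O\!\left(N^{3/2}\right).
\]
% Assumed slice-wise processing of a volume with D slices of size n-by-n:
\[
  T_{\mathrm{volume}} = O\!\left(D\, n^{3}\right).
\]
```

In other words, the cost is cubic in the side length of the image but grows as the 3/2 power of the number of pixels, and a volume processed slice by slice multiplies that cost by the number of slices.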

Future directions

To address these limitations, future work should prioritize several directions. First, integrating machine learning with SVD could automate parameter selection, for instance by training convolutional neural networks (CNNs) to predict optimal singular value thresholds based on image content and clinical task. This hybrid approach may enhance adaptability across modalities while reducing manual intervention. Second, extending the framework to handle 3D/4D data through tensor-SVD architectures could broaden its applicability to emerging imaging techniques such as diffusion tensor MRI or intraoperative ultrasound. Here, 4D imaging adds the time dimension to three-dimensional space, showing a three-dimensional image that changes dynamically over time, such as 4D ultrasound or fMRI of the heart, and can be used to evaluate the function and dynamic changes of organs or lesions over time. Third, rigorous clinical validation is needed to quantify long-term impacts on diagnostic outcomes. Large-scale multi-center trials comparing SVD-harmonized images against raw data in blinded reader studies would clarify the method's utility in real-world settings, particularly for early disease detection. Additionally, exploring synergies with emerging technologies such as federated learning could enable decentralized harmonization across institutions without data sharing, addressing privacy concerns in collaborative research. Finally, developing a quantitative metric to balance harmonization strength and detail preservation, potentially leveraging task-specific loss functions, would provide objective guidance for clinical implementation, building on existing efforts to optimize preprocessing pipelines (36,37).


Conclusions

This research uses SVD to address inconsistencies in medical images caused by varying imaging conditions. These inconsistencies often lead to low contrast, a narrow grayscale range, uneven illumination, and unclear edges, all of which affect diagnostic accuracy. An SVD-based harmonization algorithm is developed to tackle information loss, image degradation, and insufficient robustness. It involves channel splitting, SVD, and feature refinement to enhance image harmonization. Additionally, the SVD-based deharmonization algorithm addresses detail loss, increased noise, and low contrast by subtracting high-frequency components and merging the enhanced images. Combining these techniques balances information preservation and detail enhancement, allowing values from different domains to be mapped to target values and to adapt to variations between machines. This integrated approach improves image quality and usability across datasets.
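
The overall flow summarized above (channel splitting, per-channel SVD, adjustment of the leading components, subtraction of part of the residual, and merging) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function names, the cut-off `k`, and the weights `low_gain` and `high_weight` are hypothetical parameters introduced here.

```python
import numpy as np


def process_channel(channel, k, low_gain, high_weight):
    """Rescale the rank-k part of a channel and subtract a share of the residual."""
    u, s, vt = np.linalg.svd(channel, full_matrices=False)
    low = (u[:, :k] * s[:k]) @ vt[:k, :]   # leading singular components
    high = channel - low                   # everything the first k components miss
    # Subtracting high_weight * high from the channel leaves (1 - high_weight) * high.
    return low_gain * low + (1.0 - high_weight) * high


def harmonize_rgb(image, k=20, low_gain=1.0, high_weight=0.5):
    """Apply process_channel to every channel of an (H, W, C) uint8 image."""
    img = np.asarray(image, dtype=np.float64)
    channels = [
        process_channel(img[..., c], k, low_gain, high_weight)
        for c in range(img.shape[-1])
    ]
    merged = np.stack(channels, axis=-1)
    # Round and clip back to the 8-bit range before converting.
    return np.clip(np.rint(merged), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(96, 80, 3), dtype=np.uint8)
    out = harmonize_rgb(image, k=10, low_gain=1.0, high_weight=0.5)
    print(out.shape, out.dtype)
```

Setting `low_gain=1.0` and `high_weight=0.0` returns the input unchanged, which gives a convenient sanity check before tuning either parameter.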


Acknowledgments

None.


Footnote

Funding: This work was supported by the Macao Polytechnic University Grant (No. RP/FCA-15/2022) and the Science and Technology Development Fund of Macau SAR (No. 0105/2022/A).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-2225/coif). All authors report that this work was supported by the Macao Polytechnic University Grant (No. RP/FCA-15/2022) and the Science and Technology Development Fund of Macau SAR (No. 0105/2022/A). The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Birenbaum A, Greenspan H. Multi-view longitudinal CNN for multiple sclerosis lesion segmentation. Eng Appl Artif Intell. 2017;65:111-118.
  2. Chen SD, Ramli AR. Minimum mean brightness error bi-histogram equalization in contrast enhancement. IEEE Trans Consum Electron 2003;49:1310-9.
  3. Chan KH, Im SK, Ke W. Image resizing enhancement with DCT coefficients. In: Twelfth International Conference on Machine Vision (ICMV 2019). SPIE; 2020:887-893.
  4. Sun J, Liu K, Tong H, Liu H, Li X, Luo Y, Li Y, Yao Y, Jin R, Fang J, Chen X. CT Texture Analysis for Differentiating Bronchiolar Adenoma, Adenocarcinoma In Situ, and Minimally Invasive Adenocarcinoma of the Lung. Front Oncol 2021;11:634564.
  5. Onishi Y, Kusumoto M, Motoi N, Watanabe H, Watanabe SI. Ciliated Muconodular Papillary Tumor of the Lung: Thin-Section CT Findings of 16 Cases. AJR Am J Roentgenol 2020;214:761-5.
  6. Schreuder A, Jacobs C, Scholten ET, van Ginneken B, Schaefer-Prokop CM, Prokop M. Typical CT Features of Intrapulmonary Lymph Nodes: A Review. Radiol Cardiothorac Imaging 2020;2:e190159.
  7. Chen H, Dou Q, Yu L, Qin J, Heng PA. VoxResNet: Deep voxelwise residual networks for brain segmentation from 3D MR images. Neuroimage 2018;170:446-55.
  8. Hesse LS, Kuling G, Veta M, Martel AL. Intensity Augmentation to Improve Generalizability of Breast Segmentation Across Different MRI Scan Protocols. IEEE Trans Biomed Eng 2021;68:759-70.
  9. Natarajan P, Soniya N, Krishnan N. Fusion of MRI and CT brain images by enhancement of adaptive histogram equalization. Int J Sci Eng Res 2013;4:1-7.
  10. Jenifer S, Parasuraman S, Kadirvelu A. Contrast enhancement and brightness preserving of digital mammograms using fuzzy clipped contrast-limited adaptive histogram equalization algorithm. Appl Soft Comput 2016;42:167-77.
  11. Kim YT. Contrast enhancement using brightness preserving bi-histogram equalization. IEEE Trans Consum Electron 1997;43:1-8.
  12. Sheet D, Garud H, Suveer A, Mahadevappa M, Chatterjee J. Brightness preserving dynamic fuzzy histogram equalization. IEEE Trans Consum Electron 2010;56:2475-80.
  13. Oguz I, Malone JD, Atay Y, Tao YK. Self-fusion for OCT noise reduction. Proc SPIE Int Soc Opt Eng 2020;11313:113130C.
  14. Onofrey JA, Casetti-Dinescu DI, Lauritzen AD, Sarkar S, Venkataraman R, Fan RE, Sonn GA, Sprenkle PC, Staib LH, Papademetris X. Generalizable multi-site training and testing of deep neural networks using image normalization. Proc IEEE Int Symp Biomed Imaging 2019;2019:348-51.
  15. Andrews H, Patterson C. Singular value decompositions and digital image processing. IEEE Trans Acoust Speech Signal Process 1976;24:26-53.
  16. Majhi M, Pal AK. An image retrieval scheme based on block level hybrid DCT-SVD fused features. Multimed Tools Appl 2020;80:7271-312.
  17. Xiao J, Fan Y, Sun R, Wang J, Luo ZQ. Stability analysis and generalization bounds of adversarial training. In: Proceedings of the 36th International Conference on Neural Information Processing System. 2022:15446-59.
  18. Pomponio R, Erus G, Habes M, Doshi J, Srinivasan D, Mamourian E, et al. Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan. Neuroimage 2020;208:116450.
  19. Shinohara RT, Oh J, Nair G, Calabresi PA, Davatzikos C, Doshi J, et al. Volumetric Analysis from a Harmonized Multisite Brain MRI Study of a Single Subject with Multiple Sclerosis. AJNR Am J Neuroradiol 2017;38:1501-9.
  20. Chen H, Liu Z, Zhu L, Tanougast C, Blondel W. Asymmetric color cryptosystem using chaotic Ushiki map and equal modulus decomposition in fractional Fourier transform domains. Opt Lasers Eng 2019;112:7-15.
  21. Rice L, Wong E, Kolter Z. Overfitting in adversarially robust deep learning. In: Daumé H III, Singh A, editors. Proceedings of the 37th International Conference on Machine Learning. PMLR; 2020:8093-8104.
  22. Xu S, Zhang J, Bo L, Li H, Zhang H, Zhong Z, Yuan D. Singular vector sparse reconstruction for image compression. Comput Electr Eng 2021;91:107069.
  23. Zhou S, Zhang F, Siddique MA. Range limited peak-separate fuzzy histogram equalization for image contrast enhancement. Multimed Tools Appl 2015;74:6827-47.
  24. Zuo C, Chen Q, Sui X. Range limited bi-histogram equalization for image contrast enhancement. Optik 2013;124:425-31.
  25. Sundaram M, Ramar K, Arumugam N, Prabin G. Histogram-modified local contrast enhancement for mammogram images. Int J Biomed Eng Technol 2012;9:60-71.
  26. Ke W, Chan KH. Improving quantization matrices for image coding by machine learning. In: Proceedings of the 6th International Conference on Digital Signal Processing. 2022:115-9.
  27. Im SK, Chan KH. Vector quantization using k-means clustering neural network. Electron Lett 2023;59:e12758.
  28. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning. PMLR; 2015:448-56.
  29. Wu Y, He K. Group normalization. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018:3-19.
  30. Kang H, Luo D, Feng W, Zeng S, Quan T, Hu J, Liu X. StainNet: A Fast and Robust Stain Normalization Network. Front Med (Lausanne) 2021;8:746307.
  31. Wu H, Lei C, Sun X, Wang PS, Chen Q, Cheng KT, Lin S, Wu Z. Randomized quantization: A generic augmentation for data agnostic self-supervised learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE; 2023:16305-16.
  32. Coltuc D, Bolon P, Chassery JM. Exact histogram specification. IEEE Trans Image Process 2006;15:1143-52.
  33. Kuang X, Xu X, Fang L, Kozegar E, Chen H, Sun Y, Huang F, Tan T. Improved fully convolutional neuron networks on small retinal vessel segmentation using local phase as attention. Front Med (Lausanne) 2023;10:1038534.
  34. Seow WJ, Matsuo K, Hsiung CA, Shiraishi K, Song M, Kim HN, et al. Association between GWAS-identified lung adenocarcinoma susceptibility loci and EGFR mutations in never-smoking Asian women, and comparison with findings from Western populations. Hum Mol Genet 2017;26:454-65.
  35. Bazemore AW, Smucker DR. Lymphadenopathy and malignancy. Am Fam Physician 2002;66:2103-10.
  36. Li WJ, Lv FJ, Tan YW, Fu BJ, Chu ZG. Benign and malignant pulmonary part-solid nodules: differentiation via thin-section computed tomography. Quant Imaging Med Surg 2022;12:699-710.
  37. Lin RY, Lv FJ, Fu BJ, Li WJ, Liang ZR, Chu ZG. Features for Predicting Absorbable Pulmonary Solid Nodules as Depicted on Thin-Section Computed Tomography. J Inflamm Res 2021;14:2933-9.
Cite this article as: Chen H, Li X, Chan KH, Sun Y, Wang R, Gao Q, Tong T, Tan T. Image harmonization and de-harmonization based on singular value decomposition (SVD) in medical domain. Quant Imaging Med Surg 2025;15(8):7062-7079. doi: 10.21037/qims-24-2225
