Exploring correlation information for image compression of four-dimensional computed tomography
Introduction
Internal organ motion caused by the respiratory, digestive, and cardiac systems can be significant during radiation treatment. This intra-fraction motion limits the accuracy with which dose can be delivered to the planning target volume. To address this problem, four-dimensional computed tomography (4DCT) was developed alongside the emerging use of real-time surrogate motion tracking systems, such as RPM (Varian Medical Systems, Palo Alto, USA). In radiotherapy, 4DCT is mostly used for motion management and treatment planning (1,2). Although it is acquired less frequently than cone-beam computed tomography (CBCT), it is more important for the planning and verification of radiotherapy. For a lung cancer patient scanned with a typical 4DCT protocol, in which an external surrogate is used to record the respiratory signal and to sort the raw data into 10 breathing phases, the 4DCT dataset is approximately 0.7 GB, equal to the total size of 28 CBCT datasets.
With the increasing number of lung cancer patients and the growing use of respiration-gated radiotherapy, the demand for 4DCT keeps growing and its storage has become a pressing problem. One solution is image compression, which converts the original images into a smaller compressed file while maintaining the relevant content. It takes advantage of redundant information that occurs spatially, temporally, and spectrally within the data. Compression techniques can be categorized as lossless or lossy (3). Lossless techniques are reversible, and the resulting compression ratios are low. Lossy techniques are irreversible, and the resulting compression ratios are higher. Because of the regulatory policies of many agencies, lossy compression is used less often in the clinic for medical images (4). Lossless JPEG (Joint Photographic Experts Group) and lossless wavelet compression have been widely used clinically and were adopted by the Digital Imaging and Communications in Medicine (DICOM) group in 2001 (5-8).
Since a 4DCT dataset consists of multiple phase CT subsets that are co-registered at scanning time, the correlation between them is high and can be exploited to reduce redundancy in image compression (9). In this study, we propose two ordering methods and test three video encoders for 4DCT image compression. The performance of these methods is investigated on a publicly available database. In Section 2, the principles of video compression are introduced briefly, and the two ordering methods, which incorporate the temporal and spatial correlation information of multiple CT subsets, are explained. In Section 3, the effects of the two ordering methods and three video encoders are assessed on a publicly available 4DCT database. Finally, the advantages and disadvantages of the new method are discussed.
Methods
Principle of video compression
The goal of image compression is to reduce image size for efficient storage and transmission without losing relevant image information. For static images, such as photographs and pictures, there are hundreds of compression algorithms, categorized into lossless and lossy types. The most popular compression algorithms in the medical community are the discrete cosine transform (DCT) based and wavelet transform (WT) based algorithms proposed by JPEG, both of which were adopted by DICOM in 2001 (10,11). For moving images, such as video and animation, numerous video compression algorithms have been developed; they can be categorized into several types, including inter-frame prediction coding, three-dimensional transformation coding, and model-based coding. Inter-frame prediction coding exploits the strong correlation between successive frames of a video and stores only the difference between each frame and its prediction from a reference frame. It is the most successful video coding approach and has been adopted by many international standards, such as MPEG-4 and H.264, developed by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) (12). To coexist with MPEG, JPEG also provides corresponding file formats for motion sequences, such as Motion JPEG. Motion JPEG uses an intra-frame coding algorithm, in which each frame is coded independently.
Video compression exploits redundancy of information using both intra-frame and inter-frame coding algorithms. The principle of an intra-frame coding algorithm, such as Motion JPEG, is illustrated in Figure 1A. It begins by calculating the DCT or WT over image blocks. This produces many 2D blocks of coefficients, which are quantized by discarding some of the trivial coefficients. The quantized coefficients are then sorted and encoded into a compressed file. Decompression is the inverse of this process. The compressed file is first decoded and de-quantized into 2D blocks of coefficients, and the image is then recovered from these blocks by the inverse transform.
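The transform, quantize, and inverse-transform steps of intra-frame coding can be illustrated with a minimal sketch. It is not the Motion JPEG implementation: the function names, the 4×4 block, and the simple threshold used to discard small coefficients are assumptions made for illustration, and the sorting and entropy-coding stages are omitted.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def encode_block(block, threshold):
    """Forward 2D DCT of a square block, then zero every coefficient
    whose magnitude is below `threshold` (a stand-in for quantization)."""
    c = dct_matrix(len(block))
    coeffs = matmul(matmul(c, block), transpose(c))
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in coeffs]

def decode_block(coeffs):
    """Inverse 2D DCT; the basis is orthonormal, so its inverse is its transpose."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

# Hypothetical 4x4 block of pixel values.
block = [[52, 55, 61, 66],
         [63, 59, 55, 90],
         [62, 59, 68, 113],
         [63, 58, 71, 122]]
kept = encode_block(block, threshold=5.0)   # small coefficients discarded
approx = decode_block(kept)                 # lossy reconstruction
```

With a threshold of zero nothing is discarded and the block is recovered up to floating-point error; raising the threshold zeroes more coefficients and increases the reconstruction error.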
Inter-frame prediction coding exploits temporal redundancy by predicting a future frame from a reference frame, as illustrated in Figure 1B. For each image block of the original image, the reference image is searched for a similar block. This search yields motion vectors, which indicate the relative offsets between the two images. Based on the motion vectors, the predicted image is generated, and the difference image between the original and predicted images is calculated. The difference image is sparse and is transformed into 2D blocks of coefficients consisting of many trivial values. After quantization and coding, the difference image is compressed into the video file. Decompression is the inverse of compression. The video file is decoded and de-quantized into 2D blocks of coefficients, and the difference image is recovered from them by the inverse transform. The predicted image is reconstructed from the reference image and the motion vectors. Adding the difference image to the predicted image finally recovers the target image. Since only the difference image and the motion vectors need to be stored in the video file, inter-frame prediction coding generally achieves a greater reduction in image size.
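The block search, motion vector, and difference image described above can be sketched as follows. This is an illustrative exhaustive block-matching search, not the H.264 motion estimation used by the MP4 encoder; the function names, the sum-of-absolute-differences cost, and the search window are assumptions.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between a reference block at (ry, rx)
    and a current-frame block at (cy, cx)."""
    return sum(abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
               for i in range(size) for j in range(size))

def find_motion_vector(ref, cur, top, left, size, search):
    """Exhaustively search offsets within +/- `search` pixels and return
    the (dy, dx) whose reference block best matches the current block."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(ref, cur, ry, rx, top, left, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def residual_block(ref, cur, top, left, size, mv):
    """Difference between the current block and its motion-compensated prediction."""
    dy, dx = mv
    return [[cur[top + i][left + j] - ref[top + dy + i][left + dx + j]
             for j in range(size)] for i in range(size)]

def reconstruct_block(ref, residual, top, left, mv):
    """Decoder side: prediction from the reference frame plus the residual."""
    dy, dx = mv
    size = len(residual)
    return [[ref[top + dy + i][left + dx + j] + residual[i][j]
             for j in range(size)] for i in range(size)]
```

When the current block is an exact shifted copy of part of the reference frame, the residual is all zeros, so only the motion vector carries information. This is the situation that an ordering with highly similar adjacent frames tries to approach.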
Ordering methods
Usually, 4DCT images are sorted into 10 phase CT subsets, and an ordering method is needed to combine them into a single sequence as input for a video encoder. The order of the images affects the similarity between adjacent images in the sequence. As a result, the compression performance of a video encoder employing an inter-frame prediction coding algorithm can vary between image sequences. In this study, two ordering methods, the location-prioritized (LP) method and the phase-prioritized (PP) method, were employed and assessed, as illustrated in Figure 2.
The LP ordering method arranges 4DCT images according to the slice location of the CT image in the patient, as shown in Figure 2A, so that images at the same slice location from the different phases are adjacent in the sequence. Since the CT subsets are co-registered at scanning time, there is no matching error between the phase CT subsets, and the similarity between CT slices at the same location is expected to be high. The PP ordering method arranges 4DCT images according to their respiration phases, as shown in Figure 2B, and is the conventional way to arrange images in a sequence. Since anatomical structures are continuous in three dimensions, the similarity between adjacent CT slices within a phase is also expected to be high.
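The two orderings can be expressed as two traversal orders over (phase, slice) index pairs. The sketch below reflects one reading of the description above (Figure 2 is not reproduced here); the function name and the index convention are assumptions.

```python
def frame_order(n_phases, n_slices, method):
    """Return the sequence of (phase, slice) indices fed to the video encoder.

    "LP": slice location is the outer loop, so consecutive frames show the
          same location at successive respiration phases.
    "PP": respiration phase is the outer loop, so consecutive frames are
          neighbouring slices within one phase.
    """
    if method == "LP":
        return [(p, s) for s in range(n_slices) for p in range(n_phases)]
    if method == "PP":
        return [(p, s) for p in range(n_phases) for s in range(n_slices)]
    raise ValueError("method must be 'LP' or 'PP'")

# Example with 2 phases and 3 slice locations.
lp_sequence = frame_order(2, 3, "LP")
pp_sequence = frame_order(2, 3, "PP")
```

Both orderings contain exactly the same frames; only the adjacency changes, which is why the ordering matters for inter-frame prediction coding and not for intra-frame coding.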
Experiments
In this study, three video encoders were investigated for 4DCT image compression: Motion JPEG 2000 (MJ2), a lossless WT-based JPEG encoder that uses an intra-frame coding algorithm; Motion JPEG Audio Video Interleaved (AVI), a lossy DCT-based JPEG encoder that uses an intra-frame coding algorithm; and MPEG-4 (MP4), a lossy MPEG encoder that uses an inter-frame prediction coding algorithm. All three encoders are provided by the MATLAB VideoWriter function (MathWorks, Inc., Natick, MA). For MJ2, the compression ratio was not specified in advance, so that the video could be compressed as much as possible. For AVI and MP4, the video quality was set to its maximum value of 100 to obtain the best video quality. The data processing programs were written in MATLAB and run on a personal computer equipped with a 2.4-GHz Intel i7 CPU and 72 GB of RAM.
In this study, the publicly available 4D-Lung dataset collected at the VCU Massey Cancer Center was used for testing (13). It consists of 82 4DCT datasets from 20 patients with locally advanced non-small cell lung cancer undergoing radiation therapy. The 4DCT images were acquired on a 16-slice helical CT simulator and sorted into 10 breathing phases, 0–90%, using a phase sorting approach. The 0% phase corresponds to the end of inhalation. The reconstructed slice thickness is 3 mm for all images, and the in-plane spacing is ~1 mm. The total size of the 4DCT data is approximately 59 GB.
The compression performance of a video encoder is quantified by the compression ratio (CR), defined as the ratio of the size of the 4DCT image set to the size of the video file. The inter-frame similarity of a sequence is quantified by the mean inter-frame difference (MIFD) and the mean inter-frame correlation coefficient (MIFCC). The MIFD is the average of the inter-frame differences between adjacent images in a sequence, while the MIFCC is the average of the inter-frame correlation coefficients between adjacent images in a sequence.
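As a minimal sketch of these three quantities, the code below treats each frame as a flat list of pixel values. The text does not spell out how a single inter-frame difference is computed, so the mean absolute pixel difference and the Pearson correlation coefficient used here are assumptions, as are the function names.

```python
import math

def compression_ratio(original_bytes, compressed_bytes):
    """CR: size of the original image set divided by the size of the video file."""
    return original_bytes / compressed_bytes

def mean_abs_difference(a, b):
    """Mean absolute pixel difference between two frames (assumed definition)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def correlation_coefficient(a, b):
    """Pearson correlation between two frames; both must be non-constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def mifd(frames):
    """Mean inter-frame difference over consecutive frame pairs."""
    pairs = list(zip(frames, frames[1:]))
    return sum(mean_abs_difference(a, b) for a, b in pairs) / len(pairs)

def mifcc(frames):
    """Mean inter-frame correlation coefficient over consecutive frame pairs."""
    pairs = list(zip(frames, frames[1:]))
    return sum(correlation_coefficient(a, b) for a, b in pairs) / len(pairs)
```

Because both similarity metrics are computed over consecutive pairs, reordering the same frames changes MIFD and MIFCC even though the image content is identical.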
The loss of image quality due to compression is quantified by two popular metrics, the mean square error (MSE) and the peak signal-to-noise ratio (PSNR) (14). The MSE is calculated by comparing the original and decompressed 4DCT images pixel by pixel:
$$\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[I(i,j)-K(i,j)\right]^{2}$$
where $I$ and $K$ are the original and decompressed images, each of $M \times N$ pixels.
The PSNR is the ratio between the maximum power of a signal and the power of the corrupting noise that affects the fidelity of its representation:
$$\mathrm{PSNR} = 10\log_{10}\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right)$$
Here, MAX is the maximum possible pixel value of the image, which is 65,535 in this study. Typical values for the PSNR of lossy image and video compression are between 60 and 80 dB, provided the bit depth is 16 bits.
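A minimal sketch of the two image-quality metrics, using their standard definitions, is given here. Images are represented as flat lists of pixel values; the function names and the default `max_value` (taken from the MAX stated in the text) are illustrative choices.

```python
import math

def mse(original, decompressed):
    """Mean square error between two images given as flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(original, decompressed)) / len(original)

def psnr(mse_value, max_value=65535):
    """Peak signal-to-noise ratio in dB; infinite for a lossless result."""
    if mse_value == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse_value)

# Hypothetical pixel values: one pixel differs by 2.
error = mse([0, 10, 20], [0, 12, 20])
quality_db = psnr(error)
```

The zero-error branch corresponds to lossless coding, for which the MSE is zero and the PSNR is infinite.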
Results
The MIFD and MIFCC are 24.61±18.98 and 0.98±0.05 for the PP ordering method, and 14.04±5.61 and 0.99±0.01 for the LP ordering method. The MIFD of the LP ordering method is about 40% lower than that of the PP ordering method, while its MIFCC is about 1% higher. The compression ratios of MJ2 and AVI are 7.16±0.79 and 16.39±7.13, respectively. Since both MJ2 and AVI use intra-frame coding algorithms, their compression ratios are independent of the ordering method. The compression ratio of AVI is higher than that of MJ2 but far lower than that of MP4.
The compression ratio of MP4 is the highest among the three video encoders. For MP4, the CR is 260.89±134.88 for the PP ordering method and 310.10±71.14 for the LP ordering method. The mean CR of the LP ordering method is about 20% higher than that of the PP ordering method, and its standard deviation is smaller. The values of CR, MIFD, and MIFCC for all 82 4DCT datasets are shown in Figure 3. For all tested 4DCT datasets, the CR and MIFCC of the LP ordering method are consistently higher than those of the PP ordering method, while the MIFD of the LP ordering method is consistently lower. For MP4, the plots of MIFD vs. CR and MIFCC vs. CR are shown in Figure 4A and B, where the LP and PP ordering methods for each 4DCT dataset are represented by red plus signs and blue minus signs, respectively. For AVI, the corresponding plots are shown in Figure 4C and D. As demonstrated in Figure 4A and C, CR is inversely and linearly correlated with MIFD for both the PP and LP ordering methods. However, CR is linearly correlated with MIFCC only for the LP ordering method, not for the PP ordering method.
The loss of image quality due to compression was evaluated for all tested 4DCT datasets and is summarized in Table 1. Since MJ2 uses a lossless coding algorithm, its MSE is zero and its PSNR is infinite; these values are not listed in Table 1. For AVI, the mean MSE and PSNR are 1.66e-2 and 66.71 dB, and the values are identical for the PP and LP ordering methods. For MP4, the mean MSE and PSNR are 3.80e-5 and 92.75 dB for the PP ordering method, and 3.64e-5 and 92.51 dB for the LP ordering method; the image loss is similar for the two ordering methods. Overall, MP4 causes less image loss than AVI.
Discussion
The inter-frame prediction coding algorithm (employed by MP4) demonstrated superior compression capability over the intra-frame coding algorithms (employed by AVI and MJ2). This is attributed to its use of the inter-frame similarity information of a sequence. The compression ratio of MP4 was higher with the LP ordering method than with the PP ordering method, and the MIFD of the LP ordering method was smaller than that of the PP ordering method. Both facts imply that the LP ordering method reduces the average inter-frame variation of a sequence and consequently improves the compression ratio of a video encoder that employs inter-frame prediction coding, such as MP4. The linear correlation between CR and MIFD/MIFCC also implies that improving the inter-frame similarity of a sequence can increase the compression ratio of the MP4 video encoder. For all 82 4DCT datasets, the compression ratios of the LP ordering method were consistently higher than those of the PP ordering method for the MP4 video encoder. In a few cases, the compression ratios of the two ordering methods were similar. This might be caused by poor similarity between the phase CT subsets of the 4DCT dataset, which makes the gap in compression performance between the PP and LP ordering methods less evident.
The loss of image quality due to compression is real, but its effect may be minor in clinical applications. According to the MSE and PSNR metrics, the image loss caused by MP4 was less than that caused by AVI. Although the MSE of the AVI video encoder was 10^3 times larger than that of the MP4 video encoder, the PSNR of AVI was 66 dB, which is typical for video compression. In contrast, the PSNR of MP4 was 92 dB, which indicates better image quality of the decompressed images. For all tested cases, the MSE was below 0.0001 and the PSNR was within the range of [60–90], which indicates that the major information of the 4DCT images was well preserved. It should be noted that the MP4 video encoder provided by the MATLAB function is based on the H.264 standard, which is mainly designed for video transmission at lower image quality. Its successor, H.265, provides substantially improved video quality at the same bit rate. If a video encoder based on H.265 were used, the image loss of the MP4 video encoder could be further reduced. In addition to the higher compression ratio of the video encoder, the video file can be displayed easily without decompression. This is an advantage over most image compression approaches and is potentially beneficial to many clinical applications.
So far, there has been no report on 4DCT image compression with video encoders. This study proposes an efficient way to compress 4DCT images with existing video coding algorithms. Despite the benefits of video encoders, certain limitations should be considered in clinical applications. First, video encoders were mainly developed for multimedia and allow a certain degradation of image quality during compression. If it is critical to preserve all image content, a lossless video encoder is recommended, but its compression performance is very low. Second, for certain video encoders, the dynamic range of video frames is limited to 0–255, which requires the original image to be rescaled or down-sampled before it is processed by the video encoder. To retain higher image resolution after compression, the values of the original images should be decomposed into multiple color channels and processed separately during compression. It should be noted that, besides 4DCT, this method could be applied to other 4D image modalities, such as 4D-CBCT, 4D-MRI, and 4D ultrasound. Since these image modalities differ in principle, we will investigate effective pre-processing and ordering methods for them in a future study.
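The decomposition of 16-bit pixel values into 8-bit channels mentioned above could take several forms, and the text does not specify one. The sketch below shows one hypothetical scheme, a split into a high byte and a low byte that could be stored in two channels of a video frame; the function names are assumptions.

```python
def split_16bit(value):
    """Split a 16-bit pixel value into (high byte, low byte), each in 0-255."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    return value >> 8, value & 0xFF

def merge_16bit(high, low):
    """Recombine the two 8-bit channels into the original 16-bit value."""
    return (high << 8) | low

# Example: a pixel value of 1000 is carried as two 8-bit channel values.
high, low = split_16bit(1000)
restored = merge_16bit(high, low)
```

The split itself is exactly reversible. With a lossy encoder, however, any error introduced in the high-byte channel is multiplied by 256 when the channels are merged, so such a scheme would need careful evaluation before use.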
Conclusions
The ordering method based on slice locations is more effective than that based on respiration phases for 4DCT image compression because it better incorporates the inter-frame similarity information of a sequence. A video encoder employing an inter-frame prediction coding algorithm is a more effective tool for 4DCT image compression than those employing intra-frame coding algorithms. The MPEG-4 video encoder with the LP ordering method is highly suitable for 4DCT image compression and is potentially applicable to other 4D image modalities.
Acknowledgments
Funding: This work was partially supported by National Key R&D Program of China (2016YFC0904600) and China National Science Foundation (CNSF11875320).
Footnote
Conflicts of Interest: The authors have no conflicts of interest to declare.
References
1. Purdie TG, Bissonnette JP, Franks K, Bezjak A, Payne D, Sie F, Sharpe MB, Jaffray DA. Cone-beam computed tomography for on-line image guidance of lung stereotactic radiotherapy: localization, verification, and intrafraction tumor position. Int J Radiat Oncol Biol Phys 2007;68:243-52.
2. Balik S, Weiss E, Jan N, Roman N, Sleeman WC, Fatyga M, Christensen GE, Zhang C, Murphy MJ, Lu J, Keall P, Williamson JF, Hugo GD. Evaluation of 4-dimensional computed tomography to 4-dimensional cone-beam computed tomography deformable image registration for lung cancer adaptive radiation therapy. Int J Radiat Oncol Biol Phys 2013;86:372-9.
3. Ivetic D, Dragan D. Medical image on the go. J Med Syst 2011;35:499-516.
4. Koff DA, Shulman H. An overview of digital compression of medical images: can we use lossy image compression in radiology? Can Assoc Radiol J 2006;57:211-7.
5. Miaou SG, Ke FS, Chen SC. A lossless compression method for medical image sequences using JPEG-LS and interframe coding. IEEE Trans Inf Technol Biomed 2009;13:818-21.
6. Zukoski MJ, Boult T, Iyriboz T. A novel approach to medical image compression. Int J Bioinform Res Appl 2006;2:89-103.
7. Baker WA, Hearne SE, Spero LA, Morris KG, Harrington RA, Sketch MH Jr, Behar VS, Kong Y, Peter RH, Bashore TM, Harrison JK, Cusma JT. Lossy (15:1) JPEG compression of digital coronary angiograms does not limit detection of subtle morphological features. Circulation 1997;96:1157-64.
8. Brennecke R, Bürgel U, Rippin G, Post F, Rupprecht HJ, Meyer J. Comparison of image compression viability for lossy and lossless JPEG and Wavelet data reduction in coronary angiography. Int J Cardiovasc Imaging 2001;17:1-12.
9. Fidler A, Skaleric U, Likar B. The impact of image information on compressibility and degradation in medical image compression. Med Phys 2006;33:2832-8.
10. Kajiwara K. JPEG compression for PACS. Comput Methods Programs Biomed 1992;37:343-51.
11. Xu R, Pattanaik SN, Hughes CE. High-dynamic-range still-image encoding in JPEG 2000. IEEE Comput Graph Appl 2005;25:57-64.
12. Lan C, Shi G, Wu F. Compress compound images in H.264/MPGE-4 AVC by exploiting spatial correlation. IEEE Trans Image Process 2010;19:946-57.
13. Hugo GD, Weiss E, Sleeman WC, Balik S, Keall PJ, Lu J, Williamson JF. A longitudinal four-dimensional computed tomography and cone beam computed tomography dataset for image-guided radiation therapy research in lung cancer. Med Phys 2017;44:762-71.
14. Frankewitsch T, Söhnlein S, Müller M, Prokosch HU. Computed Quality Assessment of MPEG4-compressed DICOM Video Data. Stud Health Technol Inform 2005;116:447-52.