Comparison of brain volumetry in patients with non-lesional epilepsy on 3 and 7 T MPRAGE
Introduction
Magnetic resonance imaging (MRI) is integral to the diagnosis and management of epilepsy, with a primary focus on identifying potential epileptogenic lesions. While reductions in brain volume play a comparatively minor role in diagnosis, they can nonetheless serve as indicators of structural abnormalities that may act as both a cause and a consequence of epilepsy, which is considered a network disorder (1,2).
Previous studies have shown that higher magnetic field strengths enhance lesion detection (3,4). Ultra-high field MRI at 7 T offers superior signal-to-noise ratio and spatial resolution (5), which is expected to significantly enhance volumetric analysis compared to 3 T MRI. However, higher field strength also brings increased B0- and B1-inhomogeneities, altered relaxation behavior for both T1-weighted and T2-weighted images, and increased specific absorption rates (6,7), resulting in artifacts and image inhomogeneities at 7 T. In the latest generation of ultra-high field MRI, efforts to mitigate such artifacts include the use of parallel transmission technology (8), which was not available at the beginning of our study.
FreeSurfer (9) is a standard tool for volumetric brain assessment in imaging research, especially in dementia diagnostics (10). FastSurfer (11) is a deep-learning-based alternative to FreeSurfer, capable of replacing major components of the FreeSurfer pipeline while significantly reducing runtime.
The use of toolboxes and frameworks, e.g., unified segmentation (12), from external programs like SPM12 is often preferred for pre-processing and even recommended in the 2015 FreeSurfer documentation (https://surfer.nmr.mgh.harvard.edu/fswiki/HighFieldRecon). These settings can influence volumetric results, making an inbuilt standardized process preferable. Since both FastSurfer and FreeSurfer have built-in pre-processing pipelines for skull stripping and N4 bias correction that are continuously optimized, and considering that the recommendations in the FreeSurfer documentation are nearly 10 years old, we consciously decided to forgo external pre-processing.
Our objective was to evaluate these approaches (FreeSurfer vs. FastSurfer across 3 and 7 T) in a routine clinical setting, including scans of medium quality. Since brain atrophy can often be detected in epilepsy, volumetry may provide direct and indirect clues to the origin (13) when compared to healthy controls, which are being examined in our ongoing studies. For methodological reasons, we first focused on assessing the feasibility of FreeSurfer and FastSurfer at both 3 and 7 T MRI. Since the hippocampus and amygdala also play an important role in research, we additionally tested the FreeSurfer hippocampus scripts on the FreeSurfer as well as FastSurfer segmentations.
Figure 1 demonstrates an example of coronal T1-weighted and T2-weighted images of 3 and 7 T MRI.
Methods
Study design
This study is a retrospective analysis of prospectively collected data. The study was ethically approved by the institutional review board of the University of Magdeburg (Leipziger Str. 44, D-39104 Magdeburg, Germany, No. 100/24) and was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The institutional review board waived the requirement for informed consent because of the retrospective nature of the study. Patients had previously consented to a secondary evaluation of their MRI data. All methods were performed in accordance with relevant guidelines and regulations.
Participant population
Seventeen patients with non-lesional epilepsy diagnosed via 3 T MRI were included in the study. The patients are a subgroup of a prospective study by Kukhlenko et al. (14).
The primary inclusion criterion was the availability of 3D T1-weighted MPRAGE scans at both 3 and 7 T. The exclusion criterion was the failure of segmentation using FreeSurfer or FastSurfer on the 3 T or 7 T MPRAGE data.
MRI protocol and technical details
We performed protocols optimized by MRI physicists on the PRISMA 3 T and MAGNETOM Terra 7 T (Siemens Healthineers AG, Werner-von-Siemens-Str. 1, 80333 Munich, Germany).
The 3D T1-weighted-MPRAGE with an isotropic voxel-size of 0.7 mm × 0.7 mm × 0.7 mm was performed on 3 T (TR 2,000 ms, TE 2.3 ms) and 7 T (TR 2,500 ms, TE 2 ms). The mean interval between the 3 and 7 T scans was 11.4±23.2 days (mean ± standard deviation). The acquisition times for T1-weighted-MPRAGE were 277 seconds at 3 T and 599 seconds at 7 T. GRAPPA acceleration was used for 3 T data only and B1 shimming as well as TR-FOCI for 7 T data. Additionally, a transition from Tx1/Rx32 coils to Tx8/Rx32 coils was implemented during the course of the study. Further details regarding the scan parameters can be found in Table S1.
We additionally evaluated coronal T2-weighted STIR sequences (TR 5,390 ms, TE 25 ms, 0.4 mm × 0.4 mm resolution, 1.75 mm slice thickness) on 3 T; and coronal T2-weighted TSE sequences (TR 7,500 ms, TE 88 ms, 0.4 mm × 0.4 mm resolution, 1.75 mm slice thickness) on 7 T.
Standard pre-processing
We also did not use skull stripping because our focus was on the hippocampal regions. Initially, we did not perform any external bias correction and used only the standard N4 bias correction (“ANTS N4 bias correction”) of FreeSurfer or the slightly modified standard correction of FastSurfer (15).
Additional pre-processing pipeline
Finally, we aimed to assess the extent to which an additional N4 bias correction influences the results. To assess the isolated effect of our own N4 bias correction, we applied it during pre-processing without disabling the standard bias correction routines in FreeSurfer and FastSurfer. Our method was implemented as follows: Bias field estimation was performed using the N4 algorithm (16), which derives the bias field directly from the image. The correction was applied with a spline order of 3, a control point grid size of 4×4×4, a full width at half maximum (FWHM) of 0.15, and a convergence threshold of 0.00001. Following bias correction, skull-stripping was carried out using SynthStrip (17).
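The reported N4 settings can be illustrated with a short sketch. It assumes SimpleITK as the N4 implementation, which the text does not name; the Otsu foreground mask, the function name, and the file paths are likewise hypothetical choices made only for illustration. SynthStrip skull stripping is not shown.

```python
# N4 settings as reported in the text; everything else below is an assumption.
N4_PARAMS = {
    "spline_order": 3,
    "control_points": [4, 4, 4],
    "fwhm": 0.15,
    "convergence_threshold": 1e-5,
}


def n4_correct(in_path, out_path, params=N4_PARAMS):
    """Apply N4 bias field correction to a 3D image (sketch using SimpleITK)."""
    # Imported lazily so the parameter set can be inspected without SimpleITK.
    import SimpleITK as sitk

    image = sitk.ReadImage(in_path, sitk.sitkFloat32)
    # Assumed foreground mask (Otsu threshold); the text does not specify a mask.
    mask = sitk.OtsuThreshold(image, 0, 1, 200)

    n4 = sitk.N4BiasFieldCorrectionImageFilter()
    n4.SetSplineOrder(params["spline_order"])
    n4.SetNumberOfControlPoints(params["control_points"])
    n4.SetBiasFieldFullWidthAtHalfMaximum(params["fwhm"])
    n4.SetConvergenceThreshold(params["convergence_threshold"])

    corrected = n4.Execute(image, mask)
    sitk.WriteImage(corrected, out_path)
    return out_path
```

A call such as `n4_correct("t1_7t.nii.gz", "t1_7t_n4.nii.gz")` (hypothetical file names) would write the corrected volume, which could then be passed to the segmentation pipeline.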
Volumetric assessment
We ran FreeSurfer (Version 7.4.1) as well as FastSurfer (Version 2.3.0) on T1-weighted-MPRAGE data of 3 T and 7 T MRI, and calculated Automated Segmentation (ASEG) (18) and Desikan-Killiany-Tourville (DKT) (19) atlas volumes. Additionally, we measured subvolumes of hippocampus and amygdala using a dedicated FreeSurfer script (20-22) with and without additional T2-weighted data. We used the FreeSurfer Version 7.4.1 with additional head, body, tail (FS60) parcellation (23). FS60 mimics the FreeSurfer 6.0 hippocampal module and therefore allows for improved comparability. Additionally, we volumetrically assessed the thalamic nuclei and the brain stem with the standard FreeSurfer script (24,25).
Quality assessment
We initially checked the FreeSurfer results for failed segmentations.
For the comparison of the quality of the eight hippocampal segmentations, we conducted a rating of the anatomic regions. We used the FreeSurfer segmentation of the 3 T T1-weighted-MPRAGE without T2-weighted data as the standard for visual comparison and set the quality rating to a fixed value of three. Visual assessment was performed by a neuroradiologist with 6 years of training and a research assistant using the schema in Figure 2 including criteria for manual segmentation from Iglesias et al. (20). Raters scored on a Likert scale from 1 to 5. Differences in a primary criterion were rated either 1 or 5, and in a secondary criterion either 2 or 4, depending on whether the comparative scan was worse (<3) or better (>3) than 3 T T1-weighted MPRAGE.
We compared the hippocampal standard segmentation with the three remaining FreeSurfer and four FastSurfer segmentations for all patients.
Statistical analysis
We used Python3 (Version 3.9.18, Matplotlib, nibabel) for part of the statistical programming, for extracting and converting data from FastSurfer and FreeSurfer output files into Excel formats, and for the voxel-wise calculation of the Dice coefficient. Basic statistical analyses were performed using R (Version 4.4.2; packages: lme4, lmerTest, broom.mixed, performance, emmeans). We used a linear mixed model with patient and region as random effects, setting FreeSurfer as the reference. Analysis at the level of individual regions was performed using separate linear mixed models. Additionally, we applied Bonferroni’s correction (26) for multiple testing. The significance level was set to 5%, the adjusted significance level to 0.1%.
As a baseline analysis, we compared 216 regions, including ASEG, DKT, Hypothalamic, Amygdalar, and Thalamic nuclei segmentations, for both 3 and 7 T acquisition datasets.
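The Bonferroni adjustment can be sketched as follows. This is a minimal illustration, assuming that each adjusted P value is the raw P value multiplied by the number of tests and capped at 1; the function name and the example values are hypothetical and are not study data.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted P values (raw P times number of tests, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P values for four regions (illustrative only).
raw = [0.0001, 0.004, 0.03, 0.5]
adjusted = bonferroni_adjust(raw)

# A region counts as significant if its adjusted P value is below 5%.
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

Comparing adjusted P values against 5% is equivalent to comparing raw P values against 5% divided by the number of tests.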
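Dice coefficients between the ASEG+DKT segmentations of FastSurfer and FreeSurfer were calculated for each region at both 3 T and 7 T using: Dice = 2|A ∩ B| / (|A| + |B|), where A and B denote the sets of voxels assigned to the region by FastSurfer and FreeSurfer, respectively.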
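A voxel-wise Dice calculation per region can be sketched in a few lines. The example below is a minimal illustration in plain Python that works on flattened label maps; in practice the label volumes would be loaded from the segmentation files (for example with nibabel) and flattened first. The function name and the toy label IDs are assumptions for illustration.

```python
def dice_per_label(seg_a, seg_b, labels):
    """Return {label: Dice} for two equally sized, flattened label maps."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")
    scores = {}
    for label in labels:
        size_a = sum(1 for v in seg_a if v == label)
        size_b = sum(1 for v in seg_b if v == label)
        overlap = sum(1 for a, b in zip(seg_a, seg_b) if a == label and b == label)
        denom = size_a + size_b
        # Dice is undefined if the label is absent from both segmentations.
        scores[label] = 2.0 * overlap / denom if denom else float("nan")
    return scores


# Toy example: 17 and 18 are illustrative label IDs, 0 is background.
fast = [17, 17, 17, 18, 18, 0, 0, 0]
free = [17, 17, 0, 18, 18, 18, 0, 0]
print(dice_per_label(fast, free, labels=[17, 18]))
```

Both segmentations must share the same voxel grid for this comparison to be meaningful.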
For the analysis of variance (ANOVA), the R programming language (R Core Team, version 4.2.2) was used together with relevant packages (“stats”) and the commands “aov” and “TukeyHSD” for Tukey’s Honestly Significant Difference (HSD) post hoc test. We compared the four segmentations (3 and 7 T T1-weighted only, 3 and 7 T T1-weighted plus T2-weighted) of the hippocampus with a repeated-measures ANOVA using Tukey’s test (27). We calculated the intraclass correlation coefficient (ICC) and estimated the 95% confidence intervals with R as well (packages irr, readxl, lpSolve and psych).
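For readers unfamiliar with the ICC, the following sketch shows how an average-measures, two-way random-effects ICC, commonly written ICC(2,k), can be derived from the mean squares of a targets × raters table. It is an illustrative Python re-implementation of the standard Shrout and Fleiss formula, not the R code used in the study; the function name and ratings are hypothetical, and confidence intervals are not computed.

```python
def icc_2k(ratings):
    """ICC(2,k) for a list of rows (targets), each holding one score per rater."""
    n = len(ratings)       # number of rated targets
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical Likert ratings (1 to 5) from two raters for five segmentations.
example = [[3, 3], [4, 5], [2, 2], [5, 4], [1, 2]]
print(round(icc_2k(example), 3))
```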
Additionally, we used the coefficient of variation (CV) (28,29) and Cohen’s d (30) to assess inter-scanner reliability.
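Both measures can be written compactly. The sketch below assumes the common textbook definitions, namely CV as the sample standard deviation divided by the mean (in percent) and Cohen’s d as the mean difference divided by the pooled standard deviation; the exact variants applied in the study follow the cited references and may differ. Function names and volumes are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample SD divided by mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def cohens_d(group_a, group_b):
    """Cohen's d for two groups using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical volumes in mm3 for one region (illustrative values only).
vol_3t = [4000, 4200, 3900, 4100]
vol_7t = [3800, 4000, 3700, 3900]
print(coefficient_of_variation(vol_3t), cohens_d(vol_3t, vol_7t))
```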
Results
Participants
From the original cohort of 17 patients, we had to exclude one patient due to an incomplete MR protocol and an additional three patients due to severe artifacts. FreeSurfer or FastSurfer segmentation failed in 2 out of the remaining 13 patients (one for each). In two additional patients, both segmentation methods failed. These four patients were therefore excluded due to substantial segmentation errors. Thus, 9 out of the 13 remaining patients were included in the final evaluation. The inclusion flow chart is shown in Figure 3.
Figure 4 demonstrates examples of both successful and failed FreeSurfer segmentations at 7 T.
The mean age of the nine patients with non-lesional epilepsy who were finally included was 31.3±8.3 years. Electroencephalography (EEG) revealed a left temporal focus in three patients, a mixed left/right frontal focus with right predominance in one patient, a right frontal focus in two patients and a right temporal focus in three patients.
Calculation time
The mean calculation time on a standard server (Ubuntu 22, Intel Core i7-11700, 8 Cores/16 Threads, 2.5GHz, 64 GB RAM, NVIDIA GeForce RTX3090, 10496 CUDA cores) of FreeSurfer/FastSurferVINN was 167/31 min (196/42 min) on 3 T (7 T). The time for the additional scripts ranged from 6 to 20 minutes (3 T: hippocampal subfields T1-weighted: 18 min; T1-weighted + T2-weighted: 6 min; brain stem: 6 min; thalamic nuclei: 11 min; 7 T: hippocampal subfields T1-weighted: 18 min; T1-weighted + T2-weighted: 6 min; brain stem: 6 min; thalamic nuclei: 11 min). It should be noted that FastSurfer leverages GPU acceleration when available, while FreeSurfer relies entirely on CPU-based processing.
Both pipelines were executed using their default command-line settings for 3 T and with the hires flag (FreeSurfer) or the vox_size=0.7 flag (FastSurfer) for 7 T.
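The two configurations can be sketched as command lines assembled in Python. This is an assumption-laden illustration, not the authors’ scripts: the subject ID, paths, and helper names are hypothetical, and site-specific options (license file, threads, GPU selection) are omitted. Only the `-hires` option of `recon-all` and the `--vox_size` option of `run_fastsurfer.sh` correspond to the flags named in the text.

```python
def freesurfer_cmd(subject_id, t1_path, hires=False):
    """Build a recon-all call; -hires is added only for the high-resolution run."""
    cmd = ["recon-all", "-s", subject_id, "-i", t1_path, "-all"]
    if hires:
        cmd.append("-hires")
    return cmd


def fastsurfer_cmd(subject_id, t1_path, subjects_dir, vox_size=None):
    """Build a run_fastsurfer.sh call; --vox_size is added only when requested."""
    cmd = ["run_fastsurfer.sh", "--sid", subject_id, "--sd", subjects_dir, "--t1", t1_path]
    if vox_size is not None:
        cmd += ["--vox_size", str(vox_size)]
    return cmd


# Hypothetical subject and paths; the commands are printed, not executed.
print(" ".join(freesurfer_cmd("sub01_7T", "/data/sub01_7T_T1.nii.gz", hires=True)))
print(" ".join(fastsurfer_cmd("sub01_7T", "/data/sub01_7T_T1.nii.gz", "/data/out", vox_size=0.7)))
```

The resulting lists could be passed to `subprocess.run` on a machine where both tools are installed.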
Volumetric analysis
With pre-processing
With additional pre-processing, FastSurfer was able to segment 15/17 patients at 3 T and 17/17 at 7 T. FreeSurfer was able to segment 14/17 at 3 T and 17/17 at 7 T. All failed segmentations were due to failed 3D-normalization in recon-all. For consistency, we only used the results of the nine included patients for the following results and the Dice calculation.
Comparison of ASEG and DKT atlas volumes with and without pre-processing
For FastSurfer, the mean difference was 676.75±1,722.25 mm3 between volumes calculated with and without pre-processing on 7 T. Out of 93 regions, 39 were found to be significantly different. For FreeSurfer, the mean difference was −222.93±783.51 mm3 with 9 regions showing significant differences. A detailed overview of significant regions can be found in Table 1.
Table 1
| Region | FastSurfer P (adjusted) | FreeSurfer P (adjusted) |
|---|---|---|
| ASEG | ||
| Left-lateral-ventricle | 1 | 0.042723376 |
| Left-inf-lat-vent | 0.228810448 | 1 |
| Left-cerebellum-white-matter | 0.026641647 | 1 |
| Left-cerebellum-cortex | 1 | 0.021661475 |
| Left-thalamus | 0.077492443 | 1 |
| Left-caudate | 1 | 1 |
| Left-putamen | 0.617490161 | 1 |
| Left-pallidum | 1 | 1 |
| Left-hippocampus | 0.770121223 | 0.223063389 |
| Left-amygdala | 0.012941901 | 0.469493166 |
| Left-Accumbens-area | 1 | 0.283840813 |
| Left-VentralDC | 0.111056251 | 1 |
| Left-choroid-plexus | 1 | 1 |
| Right-lateral-ventricle | 0.44642086 | 1 |
| Right-inf-lat-vent | 1 | 1 |
| Right-cerebellum-white-matter | 1 | 1 |
| Right-cerebellum-cortex | 1 | 9.76682E-06 |
| Right-thalamus | 0.000228375 | 1 |
| Right-caudate | 0.076544017 | 1 |
| Right-putamen | 1 | 1 |
| Right-pallidum | 0.003122505 | 1 |
| Right-hippocampus | 0.068136152 | 1 |
| Right-amygdala | 0.483481814 | 1 |
| Right-accumbens-area | 0.609640851 | 1 |
| Right-VentralDC | 0.001898902 | 0.652936975 |
| Right-choroid-plexus | 0.208402172 | 1 |
| WM-hypointensities | 1 | 1 |
| 3rd-Ventricle | 1 | 0.170957905 |
| 4th-Ventricle | 1 | 1 |
| Brain-stem | 1 | 1 |
| CSF | 1 | 1 |
| DKT | ||
| Frontal lobe | ||
| ctx-lh-caudalmiddlefrontal | 0.131083749 | 0.997226993 |
| ctx-rh-caudalmiddlefrontal | 0.272948157 | 0.382651565 |
| ctx-lh-rostralmiddlefrontal | 0.026907862 | 0.004389518 |
| ctx-rh-rostralmiddlefrontal | 0.994050463 | 0.195501664 |
| ctx-lh-superiorfrontal | 0.000246885 | 1 |
| ctx-rh-superiorfrontal | 0.003260669 | 1 |
| ctx-lh-lateralorbitofrontal | 0.001027397 | 1 |
| ctx-rh-lateralorbitofrontal | 0.012331138 | 1 |
| ctx-lh-medialorbitofrontal | 9.12487E-06 | 0.786746571 |
| ctx-rh-medialorbitofrontal | 0.000270742 | 1 |
| ctx-lh-parsopercularis | 0.275416382 | 1 |
| ctx-rh-parsopercularis | 0.002833774 | 1 |
| ctx-lh-parsorbitalis | 0.002388439 | 0.142462416 |
| ctx-rh-parsorbitalis | 2.74876E-06 | 0.117912611 |
| ctx-lh-parstriangularis | 1 | 1 |
| ctx-rh-parstriangularis | 1 | 1 |
| Cingulate cortex | ||
| ctx-lh-caudalanteriorcingulate | 6.25931E-05 | 1 |
| ctx-rh-caudalanteriorcingulate | 1.7791E-05 | 1 |
| ctx-lh-rostralanteriorcingulate | 0.394506963 | 1 |
| ctx-rh-rostralanteriorcingulate | 0.002370194 | 1 |
| ctx-lh-posteriorcingulate | 3.68633E-05 | 1 |
| ctx-rh-posteriorcingulate | 0.003548288 | 1 |
| ctx-lh-isthmuscingulate | 0.000250058 | 1 |
| ctx-rh-isthmuscingulate | 1.14177E-05 | 1 |
| Sensorimotor cortex | ||
| ctx-lh-precentral | 1 | 0.046097486 |
| ctx-rh-precentral | 1 | 0.418035485 |
| ctx-lh-postcentral | 1 | 0.100332481 |
| ctx-rh-postcentral | 1 | 1 |
| ctx-lh-paracentral | 1 | 1 |
| ctx-rh-paracentral | 0.131191359 | 1 |
| Parietal lobe | ||
| ctx-lh-inferiorparietal | 0.199230307 | 0.118280481 |
| ctx-rh-inferiorparietal | 1 | 0.181373008 |
| ctx-lh-superiorparietal | 1 | 1 |
| ctx-rh-superiorparietal | 1 | 1 |
| ctx-lh-supramarginal | 1 | 1 |
| ctx-rh-supramarginal | 1 | 1 |
| ctx-lh-precuneus | 1 | 0.256899304 |
| ctx-rh-precuneus | 1 | 1 |
| Temporal lobe | ||
| ctx-lh-middletemporal | 1.32068E-05 | 0.013471687 |
| ctx-rh-middletemporal | 0.000133857 | 0.359273796 |
| ctx-lh-superiortemporal | 2.39937E-06 | 0.160534755 |
| ctx-rh-superiortemporal | 1.13032E-05 | 1 |
| ctx-lh-inferiortemporal | 0.000311848 | 0.086296418 |
| ctx-rh-inferiortemporal | 1.34519E-05 | 0.011535485 |
| ctx-lh-fusiform | 0.017176305 | 1 |
| ctx-rh-fusiform | 0.00305907 | 1 |
| ctx-lh-parahippocampal | 0.090232003 | 1 |
| ctx-rh-parahippocampal | 1 | 1 |
| ctx-lh-entorhinal | 0.017614183 | 1 |
| ctx-rh-entorhinal | 5.35788E-05 | 1 |
| ctx-lh-transversetemporal | 0.544052131 | 1 |
| ctx-rh-transversetemporal | 0.014705947 | 1 |
| Occipital lobe | ||
| ctx-lh-cuneus | 0.449622641 | 1 |
| ctx-rh-cuneus | 0.070867839 | 0.045459272 |
| ctx-lh-lateraloccipital | 1 | 1 |
| ctx-rh-lateraloccipital | 0.119059522 | 0.002234902 |
| ctx-lh-lingual | 0.002226316 | 1 |
| ctx-rh-lingual | 0.010924993 | 1 |
| ctx-lh-pericalcarine | 0.005136744 | 1 |
| ctx-rh-pericalcarine | 0.002747831 | 1 |
| Insula | ||
| ctx-lh-insula | 0.00513421 | 1 |
| ctx-rh-insula | 0.000167276 | 1 |
ASEG, Automatic Segmentation; CSF, cerebrospinal fluid; DC, Diencephalon; DKT, Desikan-Killiany-Tourville Atlas; lh, left hemisphere; rh, right hemisphere.
The five most affected regions are displayed in Figure 5.
Comparison of ASEG and DKT atlas volumes between 3 and 7 T
Comparing 3 and 7 T for FreeSurfer, we found 13 volumes to be significantly different with a mean difference of 2,186.77±3,506.65 mm3. For FastSurfer, we found 25 significantly differing volumes with a mean difference of 960.57±1,706.18 mm3.
Comparison of FastSurfer vs. FreeSurfer with pre-processing on 3 T
In the mixed-effects analysis comparing FastSurfer and FreeSurfer (regions n=216 as random effects), we observed a mean volume difference of 940.4±238.6 mm3 (P<0.001), indicating that FastSurfer consistently yields significantly higher volumes than FreeSurfer.
A detailed list of significant structures is provided in Table S2, and a boxplot of the most significantly differing volumes is shown in Figure 6.
FastSurfer vs. FreeSurfer with pre-processing on 7 T
Volumetric comparison between FastSurfer and FreeSurfer yielded an average estimate of 407.4±235.7 mm3, with a P value of 0.084, indicating no systematic difference but a trend toward higher volumes with FastSurfer.
A detailed table can be found in Table S2. A boxplot of the most significant structure can be found in Figure 7.
Inter-scanner reliability
Table 2 reveals the inter-scanner deviations and reliability of 3 and 7 T using CV and Cohen’s d. Due to the inhomogeneities, we observed higher standard deviations on 7 T.
Table 2
| Brain region | Ratio 3 T:7 T (mean ± SD), FreeSurfer | Ratio 3 T:7 T (mean ± SD), FastSurfer | Coefficient of variation (%), FreeSurfer | Coefficient of variation (%), FastSurfer | P (Free vs. Fast) | Cohen’s d (Free vs. Fast) | |
|---|---|---|---|---|---|---|---|
| ASEG | |||||||
| Hippocampus L | 1.07±0.07 | 1.16±0.07 | 6.97 | 10.92 | 0.53 | 0.20 | |
| Hippocampus R | 1.16±0.06 | 1.11±0.04 | 10.92 | 7.89 | 0.59 | −0.17 | |
| Amygdala L | 1.24±0.11 | 1.74±0.25 | 16.64 | 38.72 | 0.66 | 0.14 | |
| Amygdala R | 1.37±0.18 | 1.31±0.15 | 23.46 | 20.11 | 0.58 | −0.18 | |
| Lateral-ventricle L | 0.95±0.03 | 0.94±0.05 | 4.07 | 5.81 | 0.92 | 0.03 | |
| Lateral-ventricle R | 0.99±0.03 | 0.97±0.03 | 1.77 | 2.59 | 0.89 | 0.05 | |
| VentralDC L | 1.05±0.10 | 1.09±0.04 | 7.7 | 6.74 | 0.69 | −0.13 | |
| VentralDC R | 1.00±0.10 | 1.08±0.02 | 7.39 | 5.85 | 0.67 | 0.14 | |
| Thalamus L | 1.18±0.08 | 1.13±0.03 | 11.82 | 8.96 | 0.60 | −0.17 | |
| Thalamus R | 1.23±0.09 | 1.14±0.02 | 15.08 | 9.52 | 0.89 | −0.05 | |
| Caudate L | 1.07±0.05 | 1.05±0.09 | 4.92 | 3.4 | 0.40 | −0.27 | |
| Caudate R | 1.06±0.08 | 1.00±0.01 | 6.28 | 0.99 | 0.55 | −0.19 | |
| Putamen L | 1.05±0.11 | 1.06±0.05 | 7.34 | 5.04 | 0.37 | −0.29 | |
| Putamen R | 1.18±0.12 | 1.08±0.08 | 11.93 | 7.21 | 0.19 | −0.42 | |
| Pallidum L | 0.98±0.21 | 1.03±0.04 | 13.78 | 3.55 | 0.52 | 0.21 | |
| Pallidum R | 0.97±0.17 | 1.03±0.05 | 13.89 | 4.05 | 0.97 | 0.01 | |
| Accumbens-area L | 0.79±0.13 | 0.99±0.04 | 16.85 | 3.05 | 0.59 | 0.17 | |
| Accumbens-area R | 1.22±0.25 | 1.09±0.02 | 18.47 | 7.42 | 0.43 | −0.25 | |
| Average | 1.09±0.11 | 1.11±0.06 | 11.07 | 8.44 | 0.61 | −0.07 | |
ASEG, Automatic Segmentation; DC, Diencephalon; L, left; R, right; SD, standard deviation.
Hippocampal Subfields
Comparison of hippocampal subfield volumes calculated with T1-weighted import data and T1- and T2-weighted import data
We measured all volumes as described in the methods. Table S3 reveals the segmented mean volumes and standard deviations.
Using Tukey’s test, we could not determine any significant differences in the measured volumes between 3 T T1-weighted and 3 T T1-weighted + T2-weighted data or between 7 T T1-weighted and 7 T T1-weighted + T2-weighted data for the left hippocampal region. However, there were significant deviations in the volumetry of the right hippocampus at 7 T. In contrast, we had to exclude patients based on incorrect segmentation of the left hippocampus at 7 T. Table S4 shows the results of the ANOVA with Tukey’s test for the segmented hippocampal regions (FreeSurfer).
Comparison of the hippocampal subfield segmentation with visual ratings
We could not detect any significant advantages between the eight different segmentations via visual ratings. Some regions, like the parasubiculum, were subjectively better visualized on 7 T, but other regions, like the molecular layer, performed worse on 7 T compared with 3 T.
In particular, the segmentations in the 7 T with additional T2-weighted data failed noticeably. It seems that additional inhomogeneities or artifacts interfered with the script.
Figure 8 reveals the mean results of the visual ratings.
The interrater reliability was moderate, with an ICC(2,k) of 0.67 (95% confidence interval: 0.42 to 0.92).
Artifacts
Three of our 17 patients had to be excluded due to severe artifacts, primarily due to motion. Another three patients showed minor motion artifacts, and one patient had severe white-matter inhomogeneities, resulting in failed FastSurfer or FreeSurfer segmentation. Other artifacts, like pulsation artifacts of the anterior cerebral artery, were observed to varying extents in both 3 and 7 T scans across all patients. However, these artifacts, affecting primarily the right hemisphere, did not affect the segmentation. Especially in 7 T MRI, inhomogeneities were observed in every patient, ranging from mild to severe, and could only be partially corrected through the N4 bias correction in the pre-processing of FastSurfer and FreeSurfer.
Comparison with Dice-coefficient analysis
Out of 99 volumes (ASEG + DKT), the mean Dice score exceeded 0.9 in 38 volumes at 3 T and in 11 volumes at 7 T, indicating excellent overlap between FreeSurfer and FastSurfer. In 53 volumes (3 T) and 68 volumes (7 T), the mean Dice ranged between 0.8 and 0.9, suggesting good agreement. A mean Dice below 0.8 was observed in 8 volumes at 3 T and 20 volumes at 7 T. Volumes consistently showing low overlap (<0.8) across both field strengths included the left and right choroid plexus, right inferior lateral ventricle, parts of the corpus callosum, and the left and right entorhinal cortices. The distribution of Dice scores across 3 and 7 T is illustrated in Figure 9.
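The grouping of mean Dice scores into the three agreement categories can be expressed as a small helper. The sketch below uses the thresholds stated above (above 0.9, 0.8 to 0.9, below 0.8); the function name and the toy Dice values are hypothetical, while the reported counts are taken from the text and checked only for internal consistency.

```python
def bin_mean_dice(mean_dice_by_region):
    """Count regions with excellent (>0.9), good (0.8 to 0.9), and low (<0.8) overlap."""
    counts = {"excellent": 0, "good": 0, "low": 0}
    for dice in mean_dice_by_region.values():
        if dice > 0.9:
            counts["excellent"] += 1
        elif dice >= 0.8:
            counts["good"] += 1
        else:
            counts["low"] += 1
    return counts


# Counts reported in the text (excellent, good, low); each row should sum to 99.
REPORTED_COUNTS = {"3 T": (38, 53, 8), "7 T": (11, 68, 20)}
for field_strength, counts in REPORTED_COUNTS.items():
    assert sum(counts) == 99, field_strength

# Toy example with hypothetical mean Dice values.
toy = {"region_a": 0.93, "region_b": 0.85, "region_c": 0.74, "region_d": 0.80}
print(bin_mean_dice(toy))
```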
Discussion
In our study of 17 patients with non-lesional epilepsy, we detected more significantly different volumes using FastSurfer (n=25) than with FreeSurfer (n=13) when comparing 3 to 7 T data. Additional pre-processing had a more pronounced impact when using FastSurfer, resulting in 39 significantly different regions, compared to only 9 with FreeSurfer. We also found tendencies of FastSurfer measuring greater volumes than FreeSurfer in both 3 and 7 T. Overall, FreeSurfer still appears to be the more valid choice at 7 T, presumably because FastSurfer was trained on 1.5 and 3 T data (11).
Results with and without external pre-processing
Volumetric analyses at 3 and 7 T including external pre-processing were straightforward and reliably performed across both software solutions. Using the inbuilt pre-processing pipelines of FreeSurfer and FastSurfer exclusively, we observed a reduced number of patients in whom the segmentation produced satisfactory results, particularly in the right hemisphere. In patients in whom it worked in both 3 and 7 T, we could not detect significant differences between 3 and 7 T in the right hemisphere. When comparing volumes with and without pre-processing in 7 T MRI, we found that FastSurfer is particularly affected in regions with complex geometry and low intensity contrast, such as the entorhinal, temporal, and orbitofrontal cortices, as well as subcortical structures. Additionally, skull stripping can also impact segmentation quality (31).
We observed various artifacts, including those caused by motion, pulsation of the anterior cerebral artery, and interference at the base of the skull due to air. These artifacts were sometimes so severe that FastSurfer and FreeSurfer crashed or produced segmentation errors without external pre-processing.
It is well-established that 7 T MRI is associated with an increased artifact load due to several physical limitations compared to 3 T MRI (6,7). Prolonged acquisition times can exacerbate motion artifacts (32). Consequently, various strategies have been developed to mitigate artifacts in 7 T imaging, including the use of fast pulse sequences, k-space filling techniques, and parallel transmission technology combined with radiofrequency (RF) shimming (33). Additional efforts to reduce artifacts and inhomogeneities involve post-processing and co-registration techniques. Since our data evaluation was retrospective, we were unable to correct artifacts during data acquisition, which would have enhanced the comparison between 3 and 7 T MRI.
Volumetric results
FreeSurfer is widely established and, especially in dementia, is used in everyday clinical practice more frequently than FastSurfer (10). FastSurfer has some of its own pre-processing procedures and a FastSurferVINN module for higher resolution at 7 T (34). While a larger number of additional scripts (subscripts) currently exist for FreeSurfer, they are generally compatible with FastSurfer outputs, as both tools rely on the same underlying data structure and labeling conventions (35).
Overall, we found significant evidence in 3 T and a tendency in 7 T for FastSurfer to report greater volumes than FreeSurfer.
Asymmetries between hemispheres are widespread and known to influence segmentation (36). Since artifacts were predominantly found in the right hemisphere, their influence might explain asymmetries, especially in low-contrast subcortical areas already known to be affected by programmatic limitations (24). Furthermore, inhomogeneities at 7 T, especially in basal regions with air-brain interfaces (37,38), can interfere with the segmentation quality.
The differences between FastSurfer and FreeSurfer can be attributed to FastSurfer being a deep learning approach, which strongly depends on the quality and diversity of the training data (11), whereas FreeSurfer is an atlas-based segmentation tool without an integrated deep learning algorithm.
Ultra-high field MRI has also been successfully applied in psychiatric disorders (39,40). In patients with schizophrenia, 7 T MRI revealed reduced cerebellar volumes (41) as well as aberrant association fibers (42).
Visual assessment
Visual assessment of hippocampal and amygdalar subfields in 3 and 7 T MRI using FastSurfer and FreeSurfer revealed largely similar segmentation outcomes. Notably, the parasubiculum was slightly better segmented at 7 T compared to 3 T, while the molecular layer demonstrated poorer segmentation performance at 7 T. Co-registration of T2-weighted images did not enhance outcomes beyond T1-weighted images alone.
To the best of our knowledge, there are no standardized visual assessment criteria for evaluating segmentation results. As it is difficult to compare the segmentation of different patients, we decided to use a reference scan for each patient. Since 7 T volumetry is relatively new compared to 3 T volumetry and 3 T MPRAGE is known to have the best test-retest reliability (43), we chose the 3 T T1-weighted sequence as the standard. We introduced primary and secondary criteria to weight different segmentation errors, as they have varying impacts on volume.
Dice analysis
Comparing Dice scores across ASEG+DKT segmentations revealed good overlap for most regions. Eight regions at 3 T and twenty regions at 7 T fell below a Dice coefficient of 0.8, suggesting increasing divergence between FastSurfer and FreeSurfer with higher field strength. In volumetric analysis, the mean ICC was lower at 7 T compared to 3 T, indicating less consistent measurements across patients. In addition to increased artifacts with higher field strength, differences in the algorithms may contribute to these discrepancies. FastSurfer, which relies on deep learning and has been trained predominantly on 1.5 and 3 T data, is also a plausible source of variation.
Limitations
A major limitation of our study is the small sample size. This is partly due to the rarity of patients with non-lesional focal epilepsy at our center, and partly because the manual assessment of segmentation quality for a larger number of patients would be logistically challenging. Additionally, strong motion artifacts, particularly prevalent in 7 T imaging due to longer acquisition times, further reduced the number of patients included in the study. The visual detection of epileptogenic lesions (e.g., focal cortical dysplasia), which may be better differentiated due to the higher resolution in 7 T, was not part of this study. Nor was it intended to detect a reduction in volume due to the disease. Thus, a control group was not part of this study. Without a known ground truth, the interpretation of the results becomes more challenging.
One limitation is the relatively old design of the 7 T sequences without parallel transmission, as these techniques were not available at the time the study began. Another factor worth mentioning is that our scanner parameters are not identical between field strengths. Since 7 T data in our facility are designed to be comparable with one another rather than necessarily with clinical 3 T scans, there are differences in several parameters that may affect the comparison between 3 and 7 T scans. Given that the data were utilized secondarily for evaluation and that the number of patients with both 3 and 7 T scans was limited, we had to accept these differences. We intentionally focused on MPRAGE to enable an objective comparison between 3 and 7 T and between FreeSurfer and FastSurfer. For the subfield analysis of hippocampus and amygdala, we performed no external pre-processing of the kind recommended for MP2RAGE. Such external, parameter-dependent pre-processing would most likely influence the segmentation by shifting the white-gray contrast and thereby affect the comparison of 3 and 7 T.
An additional limitation lies in the fact that the deep learning model used by FastSurfer was not trained on 7 T MRI data (11). As a result, it remains unclear whether the model can accurately segment images acquired at this field strength. Therefore, we conducted manual quality control to verify the results.
Outlook
The image quality at 7 T is improving slowly but steadily, while the image quality at 3 T already seems to have reached its peak. If the T1-weighted sequences at 7 T contain fewer artifacts and become more homogeneous, the usage of segmentation algorithms will be more promising.
The usage of MP2RAGE instead of MPRAGE has been recommended at 7 T (44). MP2RAGE could improve the segmentation by providing a higher gray-white matter contrast and a higher contrast-to-noise ratio (45). Despite the self-bias correcting properties, we detected even more severe inhomogeneities of the frontal and temporal skull base in our 7 T MP2RAGE images (compared with MPRAGE), which caused FreeSurfer as well as FastSurfer to crash.
In the future, time-resolved frequency offset corrected inversion (TR-FOCI) pulses or advances in parallel transmission systems might reduce these artifacts (45). While MP2RAGE estimates the T1-mapping from two scans, other T1-mapping methods may be more accurate (46,47).
Pre-processing procedures can temporarily solve the problem, as shown for myelin segmentations (48,49). However, intensive pre-/post-processing always carries the risk of losing information or artificially generating false information.
Conclusions
Our study reveals systematic differences between two widely used automatic segmentation tools. FastSurfer consistently estimates higher volumes compared to FreeSurfer, particularly within cortical regions of the temporal and frontal lobes, across both 3 and 7 T field strengths.
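A systematic offset of this kind can be illustrated with a short paired comparison. The following is a minimal sketch, not the analysis used in this study: the function names and the volumes are hypothetical, and the symmetric percent difference (difference divided by the mean of both estimates) is only one of several reasonable choices.

```python
from statistics import mean, stdev


def percent_differences(vol_a, vol_b):
    """Per-subject difference of tool A relative to tool B,
    expressed in percent of the mean of both estimates."""
    if len(vol_a) != len(vol_b):
        raise ValueError("paired inputs must have equal length")
    return [200.0 * (a - b) / (a + b) for a, b in zip(vol_a, vol_b)]


def summarize_bias(vol_a, vol_b):
    """Summarize how consistently tool A exceeds tool B."""
    diffs = percent_differences(vol_a, vol_b)
    return {
        "mean_percent_diff": mean(diffs),
        "sd_percent_diff": stdev(diffs) if len(diffs) > 1 else 0.0,
        "fraction_a_larger": sum(d > 0 for d in diffs) / len(diffs),
    }


# Hypothetical volumes in mm^3 for one cortical region (illustrative only,
# not data from the study); each position is the same subject in both lists.
fastsurfer = [10500.0, 9800.0, 11200.0, 10100.0]
freesurfer = [10000.0, 9500.0, 10800.0, 10150.0]

summary = summarize_bias(fastsurfer, freesurfer)
print(summary)
```

A positive mean percent difference together with a high fraction of subjects in which tool A is larger would point to a systematic rather than a random deviation between the two tools.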
The built-in pre-processing pipelines of FastSurfer and FreeSurfer appear sufficient for 7 T MRI scans of moderate quality that contain artifacts; however, for heavily artifact-laden scans on which the built-in pipeline fails, optimization through external programs becomes necessary. External pre-processing of the 7 T data led to a slight improvement in the FreeSurfer results, whereas the FastSurfer results showed a considerably greater deviation from the 3 T results.
Acknowledgments
We acknowledge the support of the Open Access Publication Fund of the Medical Faculty of the Otto-von-Guericke-University Magdeburg.
Footnote
Data Sharing Statement: Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1263/dss
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1263/coif). D.B. receives consulting fees from Phenox, Acandis, Balt, and Stryker and receives honoraria for lectures from Acandis. L.B. receives honoraria for lectures and teaching from LIAM Magdeburg (laboratory for innovation, application and medical education in image guided neurosurgery). S.J.M. receives honoraria for lectures from BrainLab and Eli Lilly. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study is a retrospective analysis of prospectively collected data. The study was ethically approved by the institutional review board of the University of Magdeburg (Leipziger Str. 44, D-39104 Magdeburg, Germany, No. 100/24) and was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The institutional review board waived the requirement for informed consent because of the retrospective nature of the study.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Burman RJ, Parrish RR. The Widespread Network Effects of Focal Epilepsy. J Neurosci 2018;38:8107-9. [Crossref] [PubMed]
- Slinger G, Otte WM, Braun KPJ, van Diessen E. An updated systematic review and meta-analysis of brain network organization in focal epilepsy: Looking back and forth. Neurosci Biobehav Rev 2022;132:211-23. [Crossref] [PubMed]
- Feldman RE, Delman BN, Pawha PS, Dyvorne H, Rutland JW, Yoo J, Fields MC, Marcuse LV, Balchandani P. 7T MRI in epilepsy patients with previously normal clinical MRI exams compared against healthy controls. PLoS One 2019;14:e0213642. [Crossref] [PubMed]
- De Ciantis A, Barba C, Tassi L, Cosottini M, Tosetti M, Costagli M, Bramerio M, Bartolini E, Biagi L, Cossu M, Pelliccia V, Symms MR, Guerrini R. 7T MRI in focal epilepsy with unrevealing conventional field strength imaging. Epilepsia 2016;57:445-54. [Crossref] [PubMed]
- Burkett BJ, Fagan AJ, Felmlee JP, Black DF, Lane JI, Port JD, Rydberg CH, Welker KM. Clinical 7-T MRI for neuroradiology: strengths, weaknesses, and ongoing challenges. Neuroradiology 2021;63:167-77. [Crossref] [PubMed]
- Balchandani P, Naidich TP. Ultra-High-Field MR Neuroimaging. AJNR Am J Neuroradiol 2015;36:1204-15. [Crossref] [PubMed]
- Okada T, Fujimoto K, Fushimi Y, Akasaka T, Thuy DHD, Shima A, Sawamoto N, Oishi N, Zhang Z, Funaki T, Nakamoto Y, Murai T, Miyamoto S, Takahashi R, Isa T. Neuroimaging at 7 Tesla: a pictorial narrative review. Quant Imaging Med Surg 2022;12:3406-35. [Crossref] [PubMed]
- Uğurbil K. Imaging at ultrahigh magnetic fields: History, challenges, and solutions. Neuroimage 2018;168:7-32. [Crossref] [PubMed]
- Fischl B. FreeSurfer. Neuroimage 2012;62:774-81. [Crossref] [PubMed]
- Khadhraoui E, Nickl-Jockschat T, Henkes H, Behme D, Müller SJ. Automated brain segmentation and volumetry in dementia diagnostics: a narrative review with emphasis on FreeSurfer. Front Aging Neurosci 2024;16:1459652. [Crossref] [PubMed]
- Henschel L, Conjeti S, Estrada S, Diers K, Fischl B, Reuter M. FastSurfer - A fast and accurate deep learning based neuroimaging pipeline. Neuroimage 2020;219:117012. [Crossref] [PubMed]
- Ashburner J, Friston KJ. Unified segmentation. Neuroimage 2005;26:839-51. [Crossref] [PubMed]
- Wan X, Zeng Y, Wang J, Tian M, Yin X, Zhang J. Structural and functional abnormalities and cognitive profiles in older adults with early-onset and late-onset focal epilepsy. Cereb Cortex 2024;34:bhae300. [Crossref] [PubMed]
- Kukhlenko O, Kukhlenko R, Tempelmann C, Speck O, Hinrichs H, Heinze HJ, et al. Study protocol: value of 7-T MRI with prospective motion correction and postprocessing for patients with nonlesional epilepsy. Clin Epileptol 2023;36:320-6.
- Sled JG, Zijdenbos AP, Evans AC. A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Trans Med Imaging 1998;17:87-97. [Crossref] [PubMed]
- Tustison NJ, Avants BB, Cook PA, Zheng Y, Egan A, Yushkevich PA, Gee JC. N4ITK: improved N3 bias correction. IEEE Trans Med Imaging 2010;29:1310-20. [Crossref] [PubMed]
- Hoopes A, Mora JS, Dalca AV, Fischl B, Hoffmann M. SynthStrip: skull-stripping for any brain image. Neuroimage 2022;260:119474. [Crossref] [PubMed]
- Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale AM. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 2002;33:341-55. [Crossref] [PubMed]
- Klein A, Tourville J. 101 labeled brain images and a consistent human cortical labeling protocol. Front Neurosci 2012;6:171. [Crossref] [PubMed]
- Iglesias JE, Augustinack JC, Nguyen K, Player CM, Player A, Wright M, Roy N, Frosch MP, McKee AC, Wald LL, Fischl B, Van Leemput K; Alzheimer's Disease Neuroimaging Initiative. A computational atlas of the hippocampal formation using ex vivo, ultra-high resolution MRI: Application to adaptive segmentation of in vivo MRI. Neuroimage 2015;115:117-37. [Crossref] [PubMed]
- Saygin ZM, Kliemann D, Iglesias JE, van der Kouwe AJW, Boyd E, Reuter M, Stevens A, Van Leemput K, McKee A, Frosch MP, Fischl B, Augustinack JC; Alzheimer's Disease Neuroimaging Initiative. High-resolution magnetic resonance imaging reveals nuclei of the human amygdala: manual segmentation to automatic atlas. Neuroimage 2017;155:370-82. [Crossref] [PubMed]
- Iglesias JE, Van Leemput K, Augustinack J, Insausti R, Fischl B, Reuter M; Alzheimer's Disease Neuroimaging Initiative. Bayesian longitudinal segmentation of hippocampal substructures in brain MRI using subject-specific atlases. Neuroimage 2016;141:542-55. [Crossref] [PubMed]
- Yushkevich PA, Amaral RS, Augustinack JC, Bender AR, Bernstein JD, Boccardi M, et al. Quantitative comparison of 21 protocols for labeling hippocampal subfields and parahippocampal subregions in in vivo MRI: towards a harmonized segmentation protocol. Neuroimage 2015;111:526-41. [Crossref] [PubMed]
- Iglesias JE, Insausti R, Lerma-Usabiaga G, Bocchetta M, Van Leemput K, Greve DN, van der Kouwe A, Fischl B, Caballero-Gaudes C, Paz-Alonso PM; Alzheimer's Disease Neuroimaging Initiative. A probabilistic atlas of the human thalamic nuclei combining ex vivo MRI and histology. Neuroimage 2018;183:314-26. [Crossref] [PubMed]
- Iglesias JE, Van Leemput K, Bhatt P, Casillas C, Dutt S, Schuff N, Truran-Sacrey D, Boxer A, Fischl B; Alzheimer's Disease Neuroimaging Initiative. Bayesian segmentation of brainstem structures in MRI. Neuroimage 2015;113:184-95. [Crossref] [PubMed]
- Shaffer JP. Multiple Hypothesis Testing. Annu Rev Psychol 1995;46:561-84.
- Tukey JW. Comparing individual means in the analysis of variance. Biometrics 1949;5:99-114.
- Kesteven GL. The coefficient of variation. Nature 1946;158:520. [Crossref] [PubMed]
- Hyslop NP, White WH. Estimating precision using duplicate measurements. J Air Waste Manag Assoc 2009;59:1032-9. [Crossref] [PubMed]
- Sullivan GM, Feinn R. Using Effect Size—or Why the P Value Is Not Enough. J Grad Med Educ 2012;4:279-82. [Crossref] [PubMed]
- Eggert LD, Sommer J, Jansen A, Kircher T, Konrad C. Accuracy and reliability of automated gray matter segmentation pathways on real and simulated structural magnetic resonance images of the human brain. PLoS One 2012;7:e45081. [Crossref] [PubMed]
- Hedley M, Yan H. Motion artifact suppression: a review of post-processing techniques. Magn Reson Imaging 1992;10:627-35. [Crossref] [PubMed]
- Ladd ME, Bachert P, Meyerspeer M, Moser E, Nagel AM, Norris DG, Schmitter S, Speck O, Straub S, Zaiss M. Pros and cons of ultra-high-field MRI/MRS for human application. Prog Nucl Magn Reson Spectrosc 2018;109:1-50. [Crossref] [PubMed]
- Henschel L, Kügler D, Reuter M. FastSurferVINN: Building resolution-independence into deep learning segmentation methods—A solution for HighRes brain MRI. NeuroImage 2022;251:118933. [Crossref] [PubMed]
- Müller SJ, Khadhraoui E, Hansen N, Jamous A, Langer P, Wiltfang J, et al. Brainstem atrophy in dementia with Lewy bodies compared with progressive supranuclear palsy and Parkinson’s disease on MRI. BMC Neurol 2023;23:114. [Crossref] [PubMed]
- Kong XZ, Mathias SR, Guadalupe T; ENIGMA Laterality Working Group. Mapping cortical brain asymmetry in 17,141 healthy individuals worldwide via the ENIGMA Consortium. Proc Natl Acad Sci 2018;115:E5154-63. [Crossref] [PubMed]
- Chu C, Santini T, Liou JJ, Cohen AD, Maki PM, Marsland AL, Thurston RC, Gianaros PJ, Ibrahim TS. Brain Morphometrics Correlations With Age Among 350 Participants Imaged With Both 3T and 7T MRI: 7T Improves Statistical Power and Reduces Required Sample Size. Hum Brain Mapp 2025;46:e70195. [Crossref] [PubMed]
- Truong TK, Chakeres DW, Beversdorf DQ, Scharre DW, Schmalbrock P. Effects of static and radiofrequency magnetic field inhomogeneity in ultra-high field magnetic resonance imaging. Magn Reson Imaging 2006;24:103-12. [Crossref] [PubMed]
- Palaniyappan L, Kanagasabai K, Lavigne KM. Psychiatric applications of ultra-high field MR neuroimaging. In: Advances in Magnetic Resonance Technology and Applications. Elsevier; 2023. p. 563-74.
- Lavigne KM, Kanagasabai K, Palaniyappan L. Ultra-high field neuroimaging in psychosis: A narrative review. Front Psychiatry 2022;13:994372. [Crossref] [PubMed]
- Toyota E, Mackinley M, Silva AM, Jiang Y, Dalal TC, Nettekoven C, Palaniyappan L. Cerebellum as a neural substrate for impoverishment in early psychosis. Neuropsychologia 2025;210:109094. [Crossref] [PubMed]
- Kai J, Mackinley M, Khan AR, Palaniyappan L. Aberrant frontal lobe “U”-shaped association fibers in first-episode schizophrenia: A 7-Tesla Diffusion Imaging Study. NeuroImage Clin 2023;38:103367. [Crossref] [PubMed]
- Seiger R, Hahn A, Hummer A, Kranz GS, Ganger S, Küblböck M, Kraus C, Sladky R, Kasper S, Windischberger C, Lanzenberger R. Voxel-based morphometry at ultra-high fields. a comparison of 7T and 3T MRI data. Neuroimage 2015;113:207-16. [Crossref] [PubMed]
- Kronlage C, Heide EC, Hagberg GE, Bender B, Scheffler K, Martin P, Focke N. MP2RAGE vs. MPRAGE surface-based morphometry in focal epilepsy. PLoS One 2024;19:e0296843. [Crossref] [PubMed]
- Oliveira ÍAF, Roos T, Dumoulin SO, Siero JCW, van der Zwaag W. Can 7T MPRAGE match MP2RAGE for gray-white matter contrast? Neuroimage 2021;240:118384. [Crossref] [PubMed]
- Wang X, Roeloffs V, Merboldt KD, Voit D, Schätz S, Frahm J. Single-shot Multi-slice T1 Mapping at High Spatial Resolution – Inversion-Recovery FLASH with Radial Undersampling and Iterative Reconstruction. Open Med Imaging J 2015;9:1-8.
- Müller SJ, Khadhraoui E, Voit D, Riedel CH, Frahm J, Ernst M. First clinical application of a novel T1 mapping of the whole brain. Neuroradiol J 2022;35:684-91. [Crossref] [PubMed]
- Pokošová P, Kala D, Šanda J, Ježdík P, Prysiazhniuk Y, Faridová A, Jahodová A, Bělohlávková A, Kalina A, Holubová Z, Jurášek B, Kynčl M, Otáhal J. Magnetic resonance imaging techniques for indirect assessment of myelin content in the brain using standard T1w and T2w MRI sequences and postprocessing analysis. Physiol Res 2023;72:S573-85. [Crossref] [PubMed]
- Mueller SG. 7T MP2RAGE for cortical myelin segmentation: Impact of aging. PLoS One 2024;19:e0299670. [Crossref] [PubMed]



