Original Article

Clinical application of automated optical coherence tomography angiography for retinal fluid segmentation: a study of real-world data

Anhai Wei1,2#, Yangchen Zhou3#, Rui Nie1, Qi Huang1, Zhenwei Du1, Hehua Zhang1

1Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, China; 2School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China; 3Department of Ophthalmology, Daping Hospital, Army Medical University, Chongqing, China

Contributions: (I) Conception and design: Z Du, H Zhang; (II) Administrative support: Z Du, H Zhang; (III) Provision of study materials or patients: Y Zhou; (IV) Collection and assembly of data: Y Zhou, A Wei; (V) Data analysis and interpretation: A Wei, R Nie, Q Huang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

Correspondence to: Zhenwei Du, ME; Hehua Zhang, PhD. Department of Medical Engineering, Daping Hospital, Army Medical University, No. 10, Changjiang Branch Road, Yuzhong District, Chongqing 400042, China. Email: peter11dzw@tmmu.edu.cn; zhanghehua@tmmu.edu.cn.

Background: Precise quantification of subretinal fluid (SRF) is crucial for the diagnosis, management, and prognosis of central serous chorioretinopathy (CSC), yet the automated SRF segmentation function of the Intalight VG100 optical coherence tomography angiography (OCTA) device lacks real-world clinical validation. This study aimed to clinically validate the Intalight VG100 OCTA device’s automated segmentation of SRF area and volume against physician annotations.

Methods: A retrospective analysis was conducted on OCTA images from 78 patients with CSC (139 eyes). The SRF area and volume from the device’s automated segmentation were quantified via the VG100’s companion software, VanGoghReview, with physician-annotated segmentation serving as the gold standard. The Wilcoxon signed-rank test, Spearman rank correlation analysis, and Bland-Altman analysis were used to compare quantitative differences, assess correlations, and evaluate measurement agreement, respectively. Additionally, SRF area and volume error distributions were analyzed, with assessments stratified by lesion size (area: 0 mm2 ≤ area <5 mm2, 5 mm2 ≤ area <10 mm2, and area ≥10 mm2; volume: 0 mm3 ≤ volume <0.5 mm3, 0.5 mm3 ≤ volume <1 mm3, and volume ≥1 mm3).

Results: No significant difference in SRF area was noted between the device’s automated segmentation and the physician-annotated segmentation (P>0.05), while volume was significantly overestimated by the device (P<0.05). Both parameters exhibited excellent correlations (area: Spearman ρ=0.981; volume: ρ=0.995; both P values <0.05). Bland-Altman analysis revealed a low proportion of outliers, with 6.47% of samples for SRF area and 1.44% for SRF volume falling outside the 95% limits of agreement (LoA). Compared to physician annotations, the device’s automated segmentation underestimated SRF area by a mean of 0.25 mm2 while overestimating SRF volume by 0.11 mm3. In terms of error distribution, area segmentation performed notably better: 88% of absolute errors were within 0–2 mm2, and 81% of relative errors were <20%. For volume, 91% of absolute errors were within 0–0.2 mm3, but only 40% of relative errors were <20%. Small-to-moderate lesions (area <10 mm2) showed high consistency in area segmentation, while large lesions (area ≥10 mm2) exhibited more dispersed errors. Volume segmentation errors were scattered regardless of lesion size, with error dispersion increasing with larger lesion size.

Conclusions: This study is the first to evaluate the clinical application of the Intalight VG100’s automated SRF segmentation with real-world data. Its SRF area quantification is reliable for small-to-moderate lesions, supporting routine clinical monitoring of CSC. In contrast, the device consistently overestimates SRF volume and yields considerable variability, necessitating manual verification for large lesions. These findings confirm the device’s clinical utility and identify key directions for algorithm optimization.

Keywords: Optical coherence tomography angiography (OCTA); subretinal fluid (SRF); automated segmentation; real-world data; artificial intelligence (AI)


Submitted Dec 03, 2025. Accepted for publication Mar 09, 2026. Published online Apr 09, 2026.

doi: 10.21037/qims-2025-1-2602


Introduction

Retinal fluid is a hallmark feature of multiple exudative retinal diseases, including central serous chorioretinopathy (CSC), diabetic macular edema (DME), and neovascular age-related macular degeneration (nAMD). As critical biomarkers, the volume and area of retinal fluid directly guide disease diagnosis, the formulation of treatment strategies, and prognostic assessment (1-3). Optical coherence tomography angiography (OCTA) has become an indispensable tool in ophthalmic practice (4-6). Notably, while OCTA generates high-quality imaging data, translating the visual insights it provides into reliable quantitative measurements (i.e., retinal fluid volume and area) relies on a single critical step: precise segmentation of fluid regions in OCTA images.

Quantitative analysis of retinal fluid relies heavily on precise segmentation. Physician-annotated segmentation has long been regarded as the clinical gold standard, as it is based on expert judgment. However, this approach is inherently time-consuming and labor-intensive, limiting its applicability in large-scale cohort studies or for patients requiring frequent follow-up for disease progression monitoring. Furthermore, physician-annotated segmentation is susceptible to substantial intra- and interobserver variability due to differences in experience, subjective judgment, and difficulty identifying subtle fluid margins (7-10). These limitations highlight the urgent need for efficient, accurate, and standardized automated segmentation techniques that can facilitate the reliable quantification of retinal fluid in clinical practice.

In recent years, advances in deep learning have driven the development of various automated retinal fluid segmentation algorithms (11-13). Despite promising technical performance (e.g., high Dice coefficients) reported in controlled studies, most of this research has focused solely on algorithmic validation via idealized datasets, and systematic evaluation in real-world clinical scenarios remains lacking (14,15). Real-world data are heterogeneous in disease type, stage, and imaging quality (e.g., presence of motion artifacts and media opacities)—factors that pose significant challenges to translating automated algorithms into clinical practice (16). Although commercial OCTA devices (e.g., the Intalight VG100 system) now integrate built-in automated segmentation functions to facilitate rapid quantitative analysis, the accuracy and reliability of these functions in real-world clinical settings have not been fully validated through independent studies. This deficiency undermines confidence in their utility in guiding clinical decisions.

To address this issue, we used real-world clinical data to evaluate the performance of the Intalight VG100’s automated retinal fluid segmentation, moving beyond previous research limited to idealized conditions. We compared the area and volume of subretinal fluid (SRF) derived from the automated segmentation function of the Intalight VG100 system with those derived from annotated manual segmentation by experienced ophthalmologists. The primary objective was to systematically assess the accuracy of the VG100’s automated function and provide evidence-based validation for its clinical applicability. Additionally, this study was designed to offer practical insights for standardizing the quantitative analysis of retinal fluid, ultimately supporting more consistent and reliable disease management in patients with exudative retinal diseases. We present this article in accordance with the STARD reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1-2602/rc).


Methods

The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments and was approved by the Ethics Committee of the Army Medical Center of PLA (medical research ethics approval No. 251, 2025). The requirement for informed consent was waived due to the retrospective nature of the analysis. Clinical and imaging data were retrospectively collected from patients who were admitted to the Department of Ophthalmology of the Army Medical Center between January 2020 and June 2025.

Inclusion and exclusion criteria

The inclusion criteria were as follows: (I) age ≥18 years; (II) a clinical diagnosis of CSC, characterized by SRF and/or retinal pigment epithelium (RPE) detachment; (III) chronic CSC (disease duration ≥3 months) or acute CSC (disease duration <6 months); and (IV) OCTA images of adequate quality, free of motion artifacts or significant media opacities that would impair assessment. Meanwhile, the exclusion criteria were as follows: (I) age <18 years; (II) concurrent presence of other ocular inflammatory, infectious, or fundus diseases that could confound the assessment results (e.g., glaucoma, optic atrophy); and (III) failure to cooperate with the OCTA examination (e.g., nystagmus and severe visual impairment precluding fixation).

Measurement device

The OCTA device used was the Intalight VG100 system (Vision Medical Technology Co., Ltd., Kunshan, China), a swept-source OCT system. The core technical specifications of the system are as follows: central wavelength of the swept laser, 1,050 nm; scanning speed, 100,000 A-scans per second; intra-tissue imaging depth, 2.7 mm; and maximum three-dimensional scanning and angiography range, 15×12 mm.

Data collection

All enrolled patients underwent comprehensive routine ophthalmic examinations, which included medical history taking, intraocular pressure (IOP) measurement with a TX-20 tonometer (Canon Medical Systems, Otawara, Japan), autorefraction with an APK-1 autorefractor (NIDEK Co., Ltd., Gamagori, Japan), and slit-lamp biomicroscopy with the BQ 900 slit-lamp biomicroscope (66 Vision Tech Co., Ltd., Suzhou, China). Additionally, OCTA scans were performed with the Intalight VG100 system by a single technician with over 5 years of specialized experience in ophthalmic imaging. The scans were completed under the OCTA Angio 6×6 512×512 R4 mode. The resulting images had a standardized spatial resolution (x: 1,016; y: 512; z: 512).

Automated data acquisition

The advanced analysis module of VanGoghReview version 3.1.352—the companion software of the Intalight VG100 that is integrated with the device—was employed to automatically identify the SRF region of interest (ROI) within the OCTA image datasets and quantify its area and volume. The software directly outputs the area (mm2) and volume (mm3) of SRF as automated segmentation results, and in our study, these values were recorded for each case without additional processing.

Manual data acquisition

Digital Imaging and Communications in Medicine (DICOM)-formatted data from patient scans were exported from the VanGoghReview software. Two ophthalmologists with more than 5 years of experience in OCT image analysis independently conducted manual segmentation of the SRF and pigment epithelial detachment (PED) regions using three-dimensional (3D) Slicer software, version 5.8.1. They were blinded to the device-generated automated segmentation results to prevent potential bias. Subsequently, the two ophthalmologists cross-checked each other’s segmentation results, and any discrepancies were resolved through in-depth discussion to reach a preliminary consensus. Finally, a senior physician with more than 10 years of experience in OCT image interpretation reviewed all preliminary consensus segmentations to confirm accuracy and finalize the annotations.

Voxel resolution (interaxis spacing) was strictly derived from the physical scan range and pixel matrix dimensions of the OCTA data via the following: y-axis voxel spacing =6.0 mm/512 ≈ 0.0117 mm (11.7 µm); z-axis voxel spacing =6.0 mm/512 ≈ 0.0117 mm (11.7 µm); x-axis voxel spacing =2.7 mm/1,016 ≈ 0.00266 mm (2.66 µm). Quantitative analyses, including cross-sectional area and 3D volumetric measurements of the manually segmented SRF regions, were performed with Python version 3.13 software (Python Software Foundation, Wilmington, DE, USA) integrated with the SimpleITK library for medical image computation.
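To make the arithmetic concrete, the spacing derivation and the resulting area/volume computation can be sketched as follows. This is a minimal illustrative sketch in NumPy: the function names and the en-face projection definition of area are our assumptions for illustration, and the study's actual pipeline used the SimpleITK library.

```python
import numpy as np

# Voxel spacings derived from physical scan range / matrix size (values from the Methods)
DX = 2.7 / 1016  # x-axis (depth) spacing, mm (~2.66 um)
DY = 6.0 / 512   # y-axis spacing, mm (~11.7 um)
DZ = 6.0 / 512   # z-axis spacing, mm (~11.7 um)


def srf_volume_mm3(mask: np.ndarray) -> float:
    """Volume of a binary SRF mask (axes ordered x, y, z): voxel count x voxel volume."""
    return float(mask.sum()) * DX * DY * DZ


def srf_area_mm2(mask: np.ndarray) -> float:
    """En-face projected area: a (y, z) pixel counts if any voxel along x is fluid."""
    return float(mask.any(axis=0).sum()) * DY * DZ
```

The key point is that voxel spacing, not pixel count alone, carries the physical units: two scans with identical matrices but different scan ranges yield different millimeter-scale measurements.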

Error metrics definition

To comprehensively characterize the deviations between the VG100 system’s automated segmentation results and physician-annotated segmentation, three core error metrics were defined for both the area and volume of SRF in this study: raw error, absolute error, and relative error. Raw error captures the signed deviation of the automated measurement from the gold standard, indicating the direction of bias; absolute error quantifies the magnitude of that deviation; and relative error normalizes the absolute deviation to the gold standard value, assessing the proximity of automated measurements to the gold standard in proportional terms. The specific computational formulations for all aforementioned error metrics were defined as follows:

SRF area error = device’s automated segmentation area − physician-annotated segmentation area
SRF absolute area error = |device’s automated segmentation area − physician-annotated segmentation area|
SRF relative area error = |device’s automated segmentation area − physician-annotated segmentation area| / physician-annotated segmentation area
SRF volume error = device’s automated segmentation volume − physician-annotated segmentation volume
SRF absolute volume error = |device’s automated segmentation volume − physician-annotated segmentation volume|
SRF relative volume error = |device’s automated segmentation volume − physician-annotated segmentation volume| / physician-annotated segmentation volume
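These definitions translate directly into code. The following is a minimal sketch; the function name `error_metrics` is our own, for illustration only.

```python
def error_metrics(auto: float, manual: float) -> dict:
    """Raw (signed), absolute, and relative error of an automated measurement
    against the physician-annotated reference, following the definitions above."""
    raw = auto - manual
    return {
        "raw": raw,                      # signed deviation (direction of bias)
        "absolute": abs(raw),            # magnitude of deviation
        "relative": abs(raw) / manual,   # undefined if the reference value is zero
    }
```

For example, with the median area values reported later in this article (automated 6.58 mm2 vs. annotated 6.19 mm2), the raw error is about 0.39 mm2 and the relative error about 6.3%.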

Statistical analysis

Statistical analyses were performed with SPSS version 25.0 (IBM Corp., Armonk, NY, USA). The Shapiro-Wilk test was used to evaluate the normality of data distribution for SRF area and volume measurements obtained through both the device’s automated segmentation and physician-annotated segmentation. The results showed that the data were not normally distributed. Therefore, continuous variables are presented as the median and interquartile range (IQR), and nonparametric statistical methods were adopted.

The Wilcoxon signed-rank test was used to compare differences between the device’s automated segmentation and physician-annotated segmentation results, as the data failed the Shapiro-Wilk normality test. Moreover, Spearman rank correlation analysis was performed to assess the strength of correlation between the two methods, as it is appropriate for nonnormal data with potential outliers. Bland-Altman analysis was applied to evaluate consistency and agreement between the two measurement methods via calculation of the 95% limits of agreement (LoA) computed as the mean difference ±1.96 standard deviation (SD) of the differences. Notably, the intraclass correlation coefficient is another method for assessing agreement between two measurement techniques; however, given the comprehensive Bland-Altman analysis and error distribution evaluation already performed, it was deemed unnecessary for this study.
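As an illustration of the Bland-Altman computation described above, a minimal NumPy sketch might look as follows (the function name and return structure are our assumptions; the study itself used SPSS):

```python
import numpy as np


def bland_altman(auto, manual):
    """Mean difference, SD of differences, 95% limits of agreement
    (mean difference +/- 1.96 SD), and the count of samples outside the LoA."""
    diff = np.asarray(auto, dtype=float) - np.asarray(manual, dtype=float)
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)  # sample standard deviation
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    n_outside = int(np.sum((diff < lower) | (diff > upper)))
    return mean_diff, sd_diff, (lower, upper), n_outside
```

Plotting the per-eye differences against the per-eye means, with horizontal lines at the mean difference and the two LoA, yields the plots shown in Figures 1,2.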

A two-tailed P value <0.05 was considered statistically significant for all analyses. Given that two primary comparisons (SRF area and volume) were performed via Wilcoxon tests, multiple-comparison correction was not strictly necessary, as one comparison was nonsignificant (area) and the other was highly significant (volume), with no impact on the overall conclusions.


Results

After the inclusion and exclusion criteria were applied, 78 patients with CSC (139 eyes) were enrolled in this study, including 59 (75.64%) males and 19 (24.36%) females. The age of the patients ranged from 31 to 71 years, with a mean age of 47±9 years. Both eyes were included when eligible, and each eye was treated as an independent sample for analysis. Consistent with similar method-comparison studies in ophthalmology, no additional adjustment was made for within-patient clustering, as this was deemed to have minimal impact on the primary objective of comparing the accuracy between automated and manual segmentation.

Performance assessment of the different methods

Comparison and correlation

Table 1 presents the comparative measurements of SRF area (mm2) and volume (mm3) from physician-annotated and automated segmentation. For SRF area, the median value of the physician-annotated SRF was 6.19 mm2 (IQR, 3.31–9.97 mm2), while that of the automated segmentation was 6.58 mm2 (IQR, 3.60–9.93 mm2). The Wilcoxon signed-rank test showed no statistically significant difference between the two groups (Z=−0.808; P>0.05). Spearman rank correlation analysis revealed a very strong positive correlation between the physician-annotated and automated segmentation SRF area (ρ=0.981; P<0.05). Regarding SRF volume, the median value for the physician-annotated SRF was 0.45 mm3 (IQR, 0.16–0.81 mm3), while that for the automated segmentation was 0.59 mm3 (IQR, 0.23–0.96 mm3). The SRF volume derived from automated segmentation was significantly higher than that from physician annotation (Z=−10.224; P<0.05). Meanwhile, Spearman analysis indicated an extremely strong positive correlation between the two methods for SRF volume (ρ=0.995; P<0.05).

Table 1

Comparison of measurements between the different methods

Parameter      Physician-annotated SRF   Device automated segmentation SRF   Wilcoxon Z   Wilcoxon P   Spearman ρ   Spearman P
Area (mm2)     6.19 (3.31, 9.97)         6.58 (3.60, 9.93)                   −0.808       0.419        0.981        <0.001
Volume (mm3)   0.45 (0.16, 0.81)         0.59 (0.23, 0.96)                   −10.224      <0.001       0.995        <0.001

Data are presented as median (interquartile range). SRF, subretinal fluid.

Consistency analysis

Table 2 presents the Bland-Altman analysis results for the consistency of area and volume measurements between physician-annotated and automated segmentation. For the area of SRF, the difference between automated segmentation and physician-annotated measurements ranged from a minimum of −5.82 mm2 to a maximum of 3.82 mm2. The mean difference was −0.25 mm2 (SD 1.31 mm2). The 95% LoA were −2.81 to 2.31 mm2. A total of 9 (6.47%) eyes fell outside the 95% LoA (Figure 1). Regarding SRF volume, the difference ranged from 0 to 0.35 mm3. The mean difference was 0.11 mm3 (SD 0.07 mm3). The 95% LoA ranged from −0.03 to 0.24 mm3. A total of 2 (1.44%) eyes were outside the 95% LoA (Figure 2).

Table 2

Consistency of area and volume measurements between the different methods

Measurement parameter   Minimum difference   Maximum difference   Mean difference   SD of differences   95% LoA
Area (mm2)              −5.82                3.82                 −0.25             1.31                −2.81 to 2.31
Volume (mm3)            0                    0.35                 0.11              0.07                −0.03 to 0.24

LoA, limits of agreement; SD, standard deviation.

Figure 1 Bland-Altman plot for the consistency of SRF area measurements between the device’s automated segmentation and the physician-annotated segmentation. SD, standard deviation; SRF, subretinal fluid.
Figure 2 Bland-Altman plot for the consistency of SRF volume measurements between the device’s automated segmentation and the physician-annotated segmentation. SD, standard deviation; SRF, subretinal fluid.

Error analysis

Table 3 summarizes the error metrics, namely mean error (ME), root mean-squared error (RMSE), and mean absolute error (MAE), for the measurements of SRF area and volume between the device’s automated segmentation SRF and the physician-annotated segmentation SRF.

Table 3

Error distribution in area and volume measurements

Measurement    ME      RMSE   MAE
Area (mm2)     −0.25   1.33   0.87
Volume (mm3)   0.11    0.13   0.11

MAE, mean absolute error; ME, mean error; RMSE, root mean-squared error.

For SRF area, the ME was −0.25 mm2, the MAE was 0.87 mm2, and the RMSE was 1.33 mm2, indicating that the device’s automated segmentation area was, on average, 0.25 mm2 lower than the physician-annotated values. In contrast, the ME for SRF volume was 0.11 mm3, the MAE was 0.11 mm3, and the RMSE was 0.13 mm3, suggesting the device’s automated segmentation volume was, on average, 0.11 mm3 higher than the physician-annotated values. Notably, the two metrics exhibited systematic biases in opposing directions.

For SRF area, the RMSE (1.33 mm2) was markedly greater than the MAE (0.87 mm2), suggesting that abnormally large errors may exist in some samples (e.g., significant under- or overestimation of SRF area by the device). In contrast, for SRF volume, the RMSE (0.13 mm3) was closely aligned with the MAE (0.11 mm3), indicating that the error distribution of volume measurements was relatively uniform across samples, with no substantial fluctuations in error magnitude.
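The relationship among ME, MAE, and RMSE discussed here can be sketched in a few lines (an illustrative NumPy snippet; the function name is hypothetical). Because RMSE squares each difference before averaging, it is pulled upward by occasional large errors, whereas MAE weights all errors equally, which is why RMSE well above MAE flags outliers.

```python
import numpy as np


def summary_errors(auto, manual):
    """ME (signed bias), MAE (typical error magnitude), and RMSE
    (which penalizes occasional large errors more heavily than MAE)."""
    diff = np.asarray(auto, dtype=float) - np.asarray(manual, dtype=float)
    return {
        "ME": diff.mean(),
        "MAE": np.abs(diff).mean(),
        "RMSE": float(np.sqrt((diff ** 2).mean())),
    }
```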

Distribution analysis of area error

For absolute SRF area error (Figure 3), the distribution was heavily skewed toward very small errors, with the 0 to 1-mm2 interval accounting for the highest proportion (74% of samples) and approximately 95% of samples having absolute errors <3 mm2. As the error interval increased, the frequency declined sharply, with only 1 sample (0.7%) exhibiting an absolute error ≥5 mm2 (final cumulative probability 100%). This confirmed that absolute area errors were predominantly concentrated in low error ranges.

Figure 3 Distribution frequency and cumulative probability of absolute area error between the device’s automated segmentation and the physician-annotated segmentation.

For relative SRF area error (Figure 4), the 0–0.2 interval (relative error <20%) was predominant, accounting for 81% of samples. Frequency decreased steadily with larger error intervals, and relative errors >60% were extremely rare (only 3 samples in total; cumulative probability 99–100%). This indicates that most relative area errors were confined to the low error range of ≤20%.

Figure 4 Distribution frequency and cumulative probability of the relative area error between the device’s automated segmentation and the physician-annotated segmentation.

Collectively, these findings (shown in Figures 3,4) demonstrate that the VG100 system’s automated segmentation yields small SRF area errors in most cases: absolute errors were predominantly <1 mm2, and relative errors were mostly <20%. Large errors (absolute ≥5 mm2 and relative ≥60%) were exceedingly rare, likely associated with complex clinical scenarios such as SRF with irregular boundaries or ambiguous margins.

Distribution analysis of volume error

For absolute SRF volume error (Figure 5), the frequency distribution exhibited a pattern of moderate-to-small error concentration: 91% of samples had absolute errors <0.2 mm3, and only 2 samples (1.44%) fell into the ≥0.25 mm3 interval, with the cumulative probability gradually approaching 100% as the error interval expanded. Notably, the 0.05 to 0.1-mm3 interval showed a slightly higher frequency than the 0 to 0.05-mm3 interval, after which the frequency decreased with increasing error.

Figure 5 Distribution frequency and cumulative probability of absolute volume error between the device’s automated segmentation and the physician-annotated segmentation.

For relative SRF volume error (Figure 6), there was a concentration of low error: 76% of samples had relative errors ≤40% (with 40% of samples having errors <20%). The frequency declined significantly as the error interval expanded, with 4 (2.88%) samples having relative errors ≥80%.

Figure 6 Distribution frequency and cumulative probability of relative volume error between the device’s automated segmentation and the physician-annotated segmentation.

Collectively, these findings (shown in Figures 5,6) demonstrate that the Intalight VG100 system’s automated segmentation produces small SRF volume errors in most cases: 91% of samples had absolute errors <0.2 mm3, yet only 40% of samples had relative errors <20%. Large errors (absolute error ≥0.25 mm3 and relative error ≥80%) were extremely rare, likely attributable to complex clinical scenarios such as SRF with large volume, irregular morphology, or ambiguous boundaries. Compared to the error distribution of SRF area, that of volume error showed slightly higher dispersion in the midrange intervals, but still exhibited an overall tendency to decrease as the error interval increased.

Lesion size-stratified error analysis and typical cases of discrepancy

Lesion size-stratified error analysis

For areas ≥0 and <5 mm2, errors were highly concentrated in the ≥0 and <1 mm2 interval (87%), with negligible distribution in larger intervals. For areas ≥5 and <10 mm2, the ≥0 and <1 mm2 interval remained predominant (78%), while the proportion of moderate errors (area error ≥1 and <2 mm2) rose to 20%. For areas ≥10 mm2, errors became markedly dispersed, with only 38% falling into the ≥0 and <1 mm2 interval and the rest being distributed across wider error ranges.
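The size-stratified grouping described above can be sketched as follows (an illustrative NumPy snippet; the function name is our own, with the default bins matching the study's area strata of 0–5, 5–10, and ≥10 mm2):

```python
import numpy as np


def stratify_errors(sizes, errors, bins=(0.0, 5.0, 10.0, float("inf"))):
    """Group per-eye absolute errors by lesion-size stratum.

    Each stratum k covers bins[k] <= size < bins[k+1]."""
    sizes = np.asarray(sizes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    strata = np.digitize(sizes, bins[1:-1])  # 0, 1, 2 for the three strata
    return [errors[strata == k] for k in range(len(bins) - 1)]
```

Passing the volume strata (0–0.5, 0.5–1, ≥1 mm3) as `bins` applies the same grouping to the volume analysis.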

Figure 7 illustrates a clear pattern: as SRF area increases, the dispersion of automated segmentation errors becomes increasingly pronounced. For small SRF areas (<5 mm2), the high concentration of errors in the smallest interval indicates robust and consistent segmentation performance; as the area grows to ≥5 and <10 mm2, the rise in moderate errors signals emerging boundary definition challenges; and for large SRF areas (≥10 mm2), the widespread error distribution reflects significant variability in segmentation accuracy. This suggests the algorithm’s performance degrades for larger or more complex lesions, likely due to difficulties in segmenting extensive or irregularly shaped fluid regions. Additionally, as SRF area increases from <5 to ≥10 mm2, the proportion of the small error interval (≥0 and <1 mm2) decreases dramatically from 87% to 38%, while larger error intervals account for a significantly higher share, indicating that larger SRF areas introduce markedly more dispersed errors. Clinically, the device’s automated segmentation accuracy for SRF area may be less consistent for larger lesions, and thus careful interpretation of quantitative results is warranted.

Figure 7 Stacked distribution of area error intervals stratified by SRF area. SRF, subretinal fluid.

For volumes ≥0 and <0.5 mm3, volume errors were mainly concentrated in small intervals, with 41% falling within ≥0 and <0.05 mm3 and 37% within ≥0.05 and <0.1 mm3. For volumes ≥0.5 and <1 mm3, errors shifted to midrange intervals, with 41% within ≥0.1 and <0.15 mm3 and 36% within ≥0.15 and <0.2 mm3. For volumes ≥1 mm3, the distribution shifted further toward larger error intervals, with over 60% of errors falling within ≥0.15 and <0.25 mm3.

Figure 8 illustrates a clear pattern: as SRF volume increases, the proportion of the small volume error interval (≥0 and <0.05 mm3) decreases sharply from 41% to 4%, while that of larger error intervals increases substantially. This indicates that larger SRF volumes are associated with higher inherent variability in segmentation accuracy, consistent with the algorithmic challenge of segmenting extensive or irregular fluid regions.

Figure 8 Stacked distribution of volume error intervals stratified by SRF volume. SRF, subretinal fluid.

Typical cases of discrepancy

The primary causes underlying the observed segmentation errors were analyzed, with the results (as presented in Figure 9) being as follows: (I) For incomplete SRF identification in large lesions (Row 1), the VG100’s automated segmentation missed partial SRF regions, as demonstrated in this row’s images: certain areas of SRF were not captured by the device’s algorithm. This missed-detection error may be attributed to the algorithm’s training dataset, which mainly consists of small-to-medium lesions and typical hyporeflective SRF and lacks samples of large lesions and heterogeneous SRF; this limits the algorithm’s ability to generalize to complex scenarios, leading to missed detection in real clinical data. (II) There was algorithmic oversegmentation of SRF (Row 2). Specifically, in retinal OCT images, regions with ambiguous boundaries or low imaging reflectivity were misclassified as SRF by the VG100’s automated algorithm, leading to an overexpanded SRF area relative to manual annotation. This oversegmentation error mainly stemmed from the algorithm’s excessive sensitivity to certain physiological hyporeflective structures in the retina: it incorrectly classified non-SRF hyporeflective regions (such as retinal edema) as fluid. (III) The algorithm failed to differentiate PED from SRF (Row 3). For patients with concurrent PED and SRF, the device’s automated algorithm could not distinguish these two distinct structures: partial PED (manually annotated in red in Column C) was misidentified as SRF, resulting in an overestimated SRF volume compared to physician-annotated segmentation. The confusion between PED and SRF was primarily due to the algorithm’s failure to accurately segment key anatomical structures and discern their spatial relationships. Both PED and SRF present as hyporeflective cystic structures in OCTA scans, and without incorporating the anatomical prior of retinal layering, the algorithm could not distinguish between them solely on the basis of gray-scale and morphological features.

Figure 9 Examples of segmentation discrepancies between device’s automated segmentation and physician-annotated segmentation. Column A: original OCT images of lesions. Column B: device’s automated segmentation (orange: segmented SRF). Column C: physician-annotated segmentation (green: manually annotated SRF; red: manually annotated PED). Row 1: missed SRF identification by device. Row 2: overidentification of SRF by device. Row 3: misidentification of PED as SRF by device. OCT, optical coherence tomography; PED, pigment epithelial detachment; SRF, subretinal fluid.

Discussion

Quantitative changes in SRF volume and area serve as key biomarkers for monitoring visual function in patients with CSC. A multicenter retrospective study by Subhi et al. underscored the clinical relevance of SRF quantification, confirming a significant positive correlation between SRF volume reduction and improved best-corrected visual acuity (r=0.339; P=0.030) (17). Zhou et al. and Suzuki et al. extended these findings by demonstrating that larger baseline SRF volume or prolonged SRF persistence correlates with poorer postoperative visual recovery and more pronounced outer retinal degeneration, highlighting the prognostic value of accurate SRF assessment (18,19).

Notably, clinical practice has delineated clear quantitative thresholds for CSC treatment decision-making, which directly frame the significance of SRF segmentation precision. Arrigo et al. defined a >20% reduction in SRF volume as a valid morphological treatment response, with patients failing to meet this threshold after 4 months of eplerenone therapy (i.e., SRF reduction <20%) switched to half-dose photodynamic therapy (PDT)—thus establishing the 20% cutoff as a critical determinant for PDT initiation (20). In line with this, Fasler et al. found that a ≥50% SRF volume reduction constitutes a secondary endpoint for evaluating therapeutic efficacy across both acute and chronic CSC subtypes (21).

These lines of evidence converge to show that increased SRF volume or prolonged persistence in CSC is associated with visual acuity decline; notably, large-volume or long-standing SRF is more prone to eliciting irreversible retinal structural damage and subsequent visual impairment. Accurate segmentation and quantification of SRF volume and area are therefore clinically indispensable, with a tolerable relative error threshold of ≤20% for large lesions being consistent with the aforementioned clinical decision criteria. Exceeding this error range risks misclassifying treatment response, which could lead to erroneous decisions (e.g., inappropriately initiating or avoiding PDT) and compromise patient outcomes. Precision is integral to guiding specific disease management strategies (e.g., selecting eplerenone over PDT).

With the advancement of artificial intelligence (AI) and computer vision, significant progress has been made in automatic retinal fluid segmentation, with representative methods grouped by model architecture. Convolutional neural network (CNN)-based approaches include the cross-device compatible RetiFluidNet [Dice similarity coefficient (DSC) of 93.6% for SRF segmentation] (22), a task-decoupled three-stage fully convolutional network (DSC 0.82) (23), and a multiscale attention-based U-Net (DSC 0.93) (24). Transformer-based models include SwinVFTR (3D segmentation DSC of 0.72 on Spectralis and 0.59/0.68 on Cirrus/Topcon) (25) and SegFormer (DSC 0.94 for rhegmatogenous retinal detachment-related SRF segmentation) (26). Finally, hybrid CNN-transformer architectures such as HyFormer, which integrate local feature extraction with global dependency modeling, have also been applied (27). The bulk of the studies in this area have focused on standalone AI algorithms, with deep learning models showing high DSC and sensitivity in retinal fluid segmentation. However, these metrics reflect only algorithmic accuracy and lack assessment of real-world clinical utility, especially for commercial device-integrated functions and disease-specific scenarios. For example, Keenan et al. (28) compared the standalone Notal Vision OCT analyzer (NOA) with retinal specialists using spectral-domain OCT data in patients with age-related macular degeneration (AMD) (1,127 eyes) but did not involve OCTA or CSC-specific SRF. Terry et al. (29) evaluated the independent Pegasus-OCT system on three datasets (1,460 B-scans) among patients with AMD or diabetic macular edema (DME) but did not include CSC or OCTA-derived SRF. Similarly, a Moorfields Eye Hospital study (30) validated a deep learning model's segmentation performance on spectral-domain OCT but did not examine commercial OCTA devices or CSC.
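For reference, the DSC reported throughout these studies compares a predicted binary mask against the ground-truth mask; a minimal sketch on toy 2D masks:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Convention: two empty masks agree perfectly
    return 2.0 * intersection / denom if denom else 1.0

# Ground truth: a 4x4 square (16 px); prediction: same square shifted one row
truth = np.zeros((8, 8), dtype=bool); truth[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool); pred[3:7, 2:6] = True
print(dice(pred, truth))  # overlap is 12 px -> 2*12/(16+16) = 0.75
```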

Notably, while these studies have advanced the validation of AI-based fluid segmentation, two key issues remain outstanding: First, the clinical efficacy of segmentation functions built into commercial OCTA devices (not standalone AI tools) has not been determined. Second, few studies have targeted OCTA-specific SRF in CSC. To address this, our study evaluated the built-in automatic segmentation function of the commercial Intalight VG100 OCTA device, focusing on SRF in patients with CSC. We systematically analyzed the accuracy, consistency, correlation, and error distribution of segmented SRF area and volume against the physician-annotated ground truth.
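The agreement evaluation named above (Bland-Altman analysis) reduces to a mean bias and 95% limits of agreement over paired automated and manual measurements; a minimal sketch on synthetic paired volumes (illustrative values only, not study data, with a built-in constant overestimation):

```python
import numpy as np

def bland_altman(auto, manual):
    """Mean bias and 95% limits of agreement between paired measurements."""
    diff = np.asarray(auto, float) - np.asarray(manual, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Synthetic paired SRF volumes (mm^3) with a constant +0.11 overestimation
rng = np.random.default_rng(0)
manual = rng.uniform(0.1, 2.0, 50)
auto = manual + 0.11 + rng.normal(0.0, 0.02, 50)  # small random noise
bias, (lo, hi) = bland_altman(auto, manual)
print(bias)  # close to the injected systematic bias of 0.11
```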

Several limitations to this study should be acknowledged. First, the spectrum of diseases examined was not extensive, as only patients with CSC were enrolled, with other common retinal diseases, such as neovascular AMD (nAMD) and DME, being omitted. The composition of retinal fluid varies significantly across different diseases. For instance, nAMD is often accompanied by hemorrhage, lipid exudates, or fibrin deposition, while DME may present with cystoid spaces and hyperreflective exudates. These heterogeneous lesions impose higher requirements on the OCTA segmentation algorithm in terms of gray-scale thresholding, texture feature extraction, and deep learning generalization ability. Additionally, as we employed a single-center, retrospective design, all patients were recruited from a single institution, and imaging was performed by a single experienced technician using the same device. Although this ensured research consistency, it may limit the generalizability of the results. Consequently, the conclusions of this study may not be applicable to other diseases or different medical settings. Future studies should construct multidisease and multicenter cohort datasets to systematically evaluate the robustness and generalizability of the algorithm across various pathological subtypes and real-world clinical scenarios.

Second, the fixed scanning mode may introduce measurement bias, and imaging artifacts in real-world settings pose unresolved challenges for clinical application. This study uniformly adopted the 6×6 mm scanning mode, a common protocol for the clinical assessment of CSC, which effectively covers the central macular area (the primary site of SRF lesions in CSC) while balancing imaging coverage and spatial resolution, making it suitable for evaluating typical macular serous detachment. However, OCTA technology offers multiple scanning options, including the 3×3 mm mode, which provides higher axial and lateral resolution despite its smaller field of view, and the 12×12 mm mode, which is suitable for wide-field assessment, especially for diffuse pigment epithelial detachment (PED) extending peripherally in CSC. Differences in pixel size, signal-to-noise ratio, and motion artifact correction algorithms across different modes may directly affect the clarity of fluid boundaries and segmentation consistency. Although this study excluded images with significant motion artifacts or poor quality to ensure data reliability, such artifacts are unavoidable in real-world clinical practice due to various factors (e.g., patient blinking, involuntary eye movement, and poor fixation). These may further degrade image quality, blur fluid margins, and amplify segmentation errors across different scanning modes, posing substantial challenges to the stability and reliability of the algorithm during routine clinical deployment. Thus, the study conclusions are only applicable to images of reasonably good quality acquired with the 6×6 mm OCTA scanning protocol and cannot fully reflect the device's performance in other clinical scenarios.

Third, methodological differences between physician annotation/calculation and the manufacturer's built-in software may lead to systematic biases, and intereye correlation was not adjusted for. In this study, physician annotation was performed via 3D Slicer, and volume calculation was implemented with a custom Python script rather than with the direct use of the original segmentation and quantification module of the VG100 device. The computational logic of 3D Slicer and the Python script may differ subtly from that of the commercial software in terms of interpolation algorithms, coordinate system settings, and voxel resampling strategies. Notably, retinal curvature correction was not performed in this study, nor were certain optimization procedures built into the device software (e.g., axial distortion correction and ocular magnification correction) incorporated. These methodological differences might have produced constant biases between manual and device measurements. Additionally, since some patients had both eyes analyzed in the study, the dataset included inherently correlated observations, and we did not adjust for intereye correlation in the analysis. Future large-scale studies can apply appropriate statistical methods (e.g., mixed-effects models or generalized estimating equations) to account for such correlation, thereby enhancing the rigor of the results.
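For illustration, the voxel-counting volume computation that such a custom script typically performs can be sketched as follows (the array shape and voxel spacing below are hypothetical, not the VG100's actual sampling grid, and no curvature or magnification corrections are applied):

```python
import numpy as np

def mask_volume_mm3(mask: np.ndarray, voxel_mm) -> float:
    """Volume of a binary segmentation: voxel count x voxel size.

    mask: 3D boolean array (B-scans x depth x A-scans)
    voxel_mm: (dz, dy, dx) voxel dimensions in mm
    """
    return float(mask.sum()) * float(np.prod(voxel_mm))

# Toy mask: a 2x2x2 block of segmented voxels (8 voxels)
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 1:3] = True
print(mask_volume_mm3(mask, (0.5, 0.5, 0.5)))  # 8 * 0.125 = 1.0 mm^3
```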


Conclusions

We systematically evaluated the performance of the Intalight VG100 system's automated SRF segmentation against physician-annotated segmentation using real-world data from patients with CSC. Our key findings reveal that the device's automated segmentation closely aligned with physician annotations for SRF area, yet it exhibited a consistent tendency to overestimate SRF volume. For SRF area, automated segmentation results were not significantly different from physician-annotated values and demonstrated an excellent Spearman correlation. In contrast, automated SRF volume measurements were significantly higher than the physician-annotated values, with a mean overestimation of 0.11 mm3, indicating a clear systematic bias despite an equally strong correlation. Notably, relative errors were substantially more pronounced for larger SRF accumulations, particularly in volume quantification. Collectively, these results confirm that while the VG100's algorithm is highly effective and reliable for SRF area quantification, its volume measurements exhibit greater variability and systematic overestimation, and thus caution in clinical application, especially for large lesions, is warranted.

In summary, while this study provides preliminary evidence for the clinical utility of the VG100’s automated SRF segmentation, its generalizability is constrained by the single-disease enrollment, fixed scanning protocols, and idealized data selection. Future validation in multicenter cohorts that include a greater diversity of disease types, scanning modes, and real-world imaging artifacts is essential to confirming the algorithm’s robustness and to promoting its reliable clinical deployment.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the STARD reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1-2602/rc

Data Sharing Statement: Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1-2602/dss

Funding: This study was funded by Chongqing Key Project of Technological Innovation and Application Development Special Program (No. CSTB2021TIAD-KPX0074 to H.Z.) and Chongqing Natural Science Foundation General Program (No. CSTB2023NSCQ-MSX0199 to R.N.).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-1-2602/coif). H.Z. reports this study was funded by Chongqing Key Project of Technological Innovation and Application Development Special Program (No. CSTB2021TIAD-KPX0074). R.N. reports this study was funded by Chongqing Natural Science Foundation General Program (No. CSTB2023NSCQ-MSX0199). The other authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by Ethics Committee of Army Medical Center of PLA (medical research ethics approval No. 251, 2025) and individual consent for this retrospective analysis was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Kabiri P, Künzel S, Ashraf-Vaghefi S, Bonaventura T, Rübsam A, Joussen A, Zeitz O. Changes in choroidal vessel morphology associated with fluid leakage in central serous chorioretinopathy; a comparison of OCTA and ICGA. Graefes Arch Clin Exp Ophthalmol 2025;263:3065-72. [Crossref] [PubMed]
  2. Tan B, Lim NA, Tan R, Gan ATL, Chua J, Nusinovici S, Cheung CMG, Chakravarthy U, Wong TY, Schmetterer L, Tan G. Combining retinal and choroidal microvascular metrics improves discriminative power for diabetic retinopathy. Br J Ophthalmol 2023;107:993-9. [Crossref] [PubMed]
  3. Schranz M, Told R, Hacker V, Reiter GS, Reumueller A, Vogl WD, Bogunovic H, Sacu S, Schmidt-Erfurth U, Roberts PK. Correlation of vascular and fluid-related parameters in neovascular age-related macular degeneration using deep learning. Acta Ophthalmol 2023;101:e95-e105. [Crossref] [PubMed]
  4. Carta A, Donnio A, Dore S, Fossarello M, Farci R. Fractal analysis for OCT-A images of central serous chorioretinopathy. Photodiagnosis Photodyn Ther 2025;54:104642. [Crossref] [PubMed]
  5. Koutsiaris AG, Batis V, Liakopoulou G, Tachmitzi SV, Detorakis ET, Tsironi EE. Optical Coherence Tomography Angiography (OCTA) of the eye: A review on basic principles, advantages, disadvantages and device specifications. Clin Hemorheol Microcirc 2023;83:247-71. [Crossref] [PubMed]
  6. Foulsham W, Chien J, Lenis TL, Papakostas TD. Optical Coherence Tomography Angiography: Clinical Utility and Future Directions. J Vitreoretin Dis 2022;6:229-42. [Crossref] [PubMed]
  7. Schmidt-Erfurth U, Reiter GS, Riedl S, Seeböck P, Vogl WD, Blodi BA, Domalpally A, Fawzi A, Jia Y, Sarraf D, Bogunović H. AI-based monitoring of retinal fluid in disease activity and under therapy. Prog Retin Eye Res 2022;86:100972. [Crossref] [PubMed]
  8. Bogunovic H, Venhuizen F, Klimscha S, Apostolopoulos S, Bab-Hadiashar A, Bagci U, et al. RETOUCH: The Retinal OCT Fluid Detection and Segmentation Benchmark and Challenge. IEEE Trans Med Imaging 2019;38:1858-74. [Crossref] [PubMed]
  9. Afolabi SO, Gheisi L, Shan J, Shen LQ, Wang M, Shi M. Equity-enhanced glaucoma progression prediction from OCT with knowledge distillation. NPJ Digit Med 2025;8:477. [Crossref] [PubMed]
  10. Sampson DM, Dubis AM, Chen FK, Zawadzki RJ, Sampson DD. Towards standardizing retinal optical coherence tomography angiography: a review. Light Sci Appl 2022;11:63. [Crossref] [PubMed]
  11. Schlegl T, Waldstein SM, Bogunovic H, Endstraßer F, Sadeghipour A, Philip AM, Podkowinski D, Gerendas BS, Langs G, Schmidt-Erfurth U. Fully Automated Detection and Quantification of Macular Fluid in OCT Using Deep Learning. Ophthalmology 2018;125:549-58. [Crossref] [PubMed]
  12. Lin M, Bao G, Sang X, Wu Y. Recent Advanced Deep Learning Architectures for Retinal Fluid Segmentation on Optical Coherence Tomography Images. Sensors (Basel) 2022;22:3055. [Crossref] [PubMed]
  13. Tang W, Ye Y, Chen X, Shi F, Xiang D, Chen Z, Zhu W. Multi-class retinal fluid joint segmentation based on cascaded convolutional neural networks. Phys Med Biol 2022; [Crossref] [PubMed]
  14. Lu D, Heisler M, Lee S, Ding GW, Navajas E, Sarunic MV, Beg MF. Deep-learning based multiclass retinal fluid segmentation and detection in optical coherence tomography images using a fully convolutional neural network. Med Image Anal 2019;54:100-10. [Crossref] [PubMed]
  15. Borrelli E, Oakley JD, Iaccarino G, Russakoff DB, Battista M, Grosso D, Borghesan F, Barresi C, Sacconi R, Bandello F, Querques G. Deep-learning based automated quantification of critical optical coherence tomography features in neovascular age-related macular degeneration. Eye (Lond) 2024;38:537-44. [Crossref] [PubMed]
  16. Liu F, Panagiotakos D. Real-world data: a brief review of the methods, applications, challenges and opportunities. BMC Med Res Methodol 2022;22:287. [Crossref] [PubMed]
  17. Subhi Y, Bjerager J, Boon CJF, van Dijk EHC. Subretinal fluid morphology in chronic central serous chorioretinopathy and its relationship to treatment: a retrospective analysis on PLACE trial data. Acta Ophthalmol 2022;100:89-95. [Crossref] [PubMed]
  18. Zhou F, Yao J, Jiang Q, Yang W. Efficacy of Navigated Laser Photocoagulation for Chronic Central Serous Chorioretinopathy: A Retrospective Observational Study. Dis Markers 2022;2022:7792291. [Crossref] [PubMed]
  19. Suzuki T, Sasajima H, Otaki C, Ueta Y, Tate H. Association of Subretinal Fluid Duration and Baseline Chorioretinal Structure With Optical Coherence Tomography in Central Serous Chorioretinopathy. Transl Vis Sci Technol 2023;12:12. [Crossref] [PubMed]
  20. Arrigo A, Calamuneri A, Aragona E, Bordato A, Grazioli Moretti A, Amato A, Bandello F, Battaglia Parodi M. Structural OCT Parameters Associated with Treatment Response and Macular Neovascularization Onset in Central Serous Chorioretinopathy. Ophthalmol Ther 2021;10:289-98. [Crossref] [PubMed]
  21. Fasler K, Gunzinger JM, Barthelmes D, Zweifel SA. Routine Clinical Practice Treatment Outcomes of Eplerenone in Acute and Chronic Central Serous Chorioretinopathy. Front Pharmacol 2021;12:675295. [Crossref] [PubMed]
  22. Rasti R, Biglari A, Rezapourian M, Yang Z, Farsiu S. RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation. IEEE Trans Med Imaging 2023;42:1413-23. [Crossref] [PubMed]
  23. Melo T, Carneiro Â, Campilho A, Mendonça AM. Retinal layer and fluid segmentation in optical coherence tomography images using a hierarchical framework. J Med Imaging (Bellingham) 2023;10:014006. [Crossref] [PubMed]
  24. Karn PK, Abdulla WH. Precision Segmentation of Subretinal Fluids in OCT Using Multiscale Attention-Based U-Net Architecture. Bioengineering (Basel) 2024;11:1032. [Crossref] [PubMed]
  25. Hossain KF, Kamran SA, Tavakkoli A, Bebis G, Baker S. SwinVFTR: a novel volumetric feature-learning transformer for 3D OCT fluid segmentation. In: 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI); Houston, TX, USA; 2025:1-5.
  26. Midroni J, Longwell J, Bhambra N, Demian S, Pecaku A, Martins Melo I, Muni RH. Automated Segmentation of Subretinal Fluid from OCT: A Vision Transformer Approach with Cross-Validation. Ophthalmol Sci 2025;5:100852. [Crossref] [PubMed]
  27. Jiang Q, Fan Y, Li M, Fang S, Zhu W, Xiang D, Peng T, Chen X, Xu X, Shi F. HyFormer: a hybrid transformer-CNN architecture for retinal OCT image segmentation. Biomed Opt Express 2024;15:6156-70. [Crossref] [PubMed]
  28. Keenan TDL, Clemons TE, Domalpally A, Elman MJ, Havilio M, Agrón E, Benyamini G, Chew EY. Retinal Specialist versus Artificial Intelligence Detection of Retinal Fluid from OCT: Age-Related Eye Disease Study 2: 10-Year Follow-On Study. Ophthalmology 2021;128:100-9. [Crossref] [PubMed]
  29. Terry L, Trikha S, Bhatia KK, Graham MS, Wood A. Evaluation of Automated Multiclass Fluid Segmentation in Optical Coherence Tomography Images Using the Pegasus Fluid Segmentation Algorithms. Transl Vis Sci Technol 2021;10:27. [Crossref] [PubMed]
  30. Wilson M, Chopra R, Wilson MZ, Cooper C, MacWilliams P, Liu Y, Wulczyn E, Florea D, Hughes CO, Karthikesalingam A, Khalid H, Vermeirsch S, Nicholson L, Keane PA, Balaskas K, Kelly CJ. Validation and Clinical Applicability of Whole-Volume Automated Segmentation of Optical Coherence Tomography in Retinal Disease Using Deep Learning. JAMA Ophthalmol 2021;139:964-73. [Crossref] [PubMed]
Cite this article as: Wei A, Zhou Y, Nie R, Huang Q, Du Z, Zhang H. Clinical application of automated optical coherence tomography angiography for retinal fluid segmentation: a study of real-world data. Quant Imaging Med Surg 2026;16(5):397. doi: 10.21037/qims-2025-1-2602