Development and validation of an interpretable machine learning model for standard spleen volume prediction

Jinyu Lin; Jian Yang; Yinling Qian; Xuanshuang Tang; Minheng Zhu; Wang Luo; Wenjun Lin; Mengjing Chen; Xianqing Zheng; Xiangdong Yuan; Haisu Tao

doi:10.21037/qims-2024-2954

Original Article

Jinyu Lin^1,2,3#, Jian Yang^2,3#, Yinling Qian^4#, Xuanshuang Tang⁴, Minheng Zhu^2,3, Wang Luo⁵, Wenjun Lin^2,3, Mengjing Chen⁶, Xianqing Zheng⁶, Xiangdong Yuan¹, Haisu Tao^2,3

¹Department of General Surgery, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China; ²The Department of Hepatobiliary Surgery I, Zhujiang Hospital of Southern Medical University, Guangzhou, China; ³Guangdong Provincial Clinical and Engineering Center of Digital Medicine, Guangzhou, China; ⁴Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; ⁵Department of General Surgery 2nd Division, Guangdong Second Provincial General Hospital, Guangzhou, China; ⁶The Second Clinical School of Medicine, Southern Medical University, Guangzhou, China

Contributions: (I) Conception and design: J Lin, J Yang, W Luo, H Tao; (II) Administrative support: J Yang, Y Qian, X Yuan, H Tao; (III) Provision of study materials or patients: J Lin, J Yang, W Luo, H Tao; (IV) Collection and assembly of data: J Lin, M Zhu, W Luo, W Lin, M Chen, X Zheng; (V) Data analysis and interpretation: J Lin, Y Qian, X Tang, X Yuan, H Tao; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

^#These authors contributed equally to this work as co-first authors.

Correspondence to: Xiangdong Yuan, MM. Department of General Surgery, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106 Zhongshan 2nd Road, Yuexiu District, Guangzhou 510080, China. Email: gzdong405@163.com; Haisu Tao, MD. The Department of Hepatobiliary Surgery I, Zhujiang Hospital of Southern Medical University, No. 253 Gongye Avenue, Haizhu District, Guangzhou 510280, China; Guangdong Provincial Clinical and Engineering Center of Digital Medicine, No. 253 Gongye Avenue, Haizhu District, Guangzhou 510280, China. Email: taohaisu_dr@126.com.

Background: Splenomegaly serves as a crucial indicator for various diseases, particularly in hepatosplenomegaly and hematological disorders. Accurate assessment of splenomegaly is essential for improving diagnostic accuracy and treatment decisions, yet individualized diagnosis necessitates a standard reference for splenic volume. This study aimed to develop an interpretable machine learning (ML) model to evaluate standard splenic volume (SSV), enhancing personalized clinical decision-making.

Methods: We conducted a retrospective analysis of 1,186 volunteers from a multicenter cohort and evaluated 11 ML algorithms. SHapley Additive exPlanations (SHAP) were employed for feature selection and interpretation. Model performance was rigorously evaluated through key metrics such as root mean squared error (RMSE), coefficient of determination (R²), and additional validation parameters, further validated through comparisons with prior published formulas. We also developed free, open-access web-based calculators for the predictive model.

Results: Model development and internal validation involved 511 eligible volunteers, with external validation from an additional 111 volunteers. The random forest (RF) model (ML_SSV), integrating nine features [age, body weight (BW), body mass index (BMI), body surface area (BSA), red blood cell count, platelet count, total bilirubin, fibrinogen, and D-dimer], demonstrated high predictive accuracy. In external validation, the model achieved an RMSE of 22.6 mL (R²=0.80), with residual analysis confirming normally distributed errors (range: −58.32 to 67.01 mL; P=0.201). Notably, a simplified RF model (ML_SSVa) utilizing only four non-invasive parameters (age, BW, BMI, BSA) retained robust performance, with an RMSE of 36.0 mL (R²=0.70) in external validation. Furthermore, both models outperformed all existing formulas in cross-validation analyses. The models were deployed as open-access calculators at https://mlssv.vip.cpolar.cn (ML_SSV) and https://mlssva.vip.cpolar.cn (ML_SSVa), enabling real-time estimation with SHAP-based interpretability.

Conclusions: This study establishes a novel interpretable ML model rigorously validated through statistical and clinical benchmarks. These models enable the assessment of SSV, providing a reference baseline for the individualized diagnosis of splenomegaly to enhance diagnostic accuracy and support data-driven clinical decision-making.

Keywords: Standard splenic volume (SSV); machine learning (ML); predictive model; splenomegaly; clinical decision

Submitted Dec 26, 2024. Accepted for publication Mar 25, 2025. Published online Jun 03, 2025.

Introduction

The preservation of splenic function is increasingly prioritized, especially in hepatosplenic surgery (1-3). Splenomegaly is considered a significant indicator of physiological dysfunction in the liver and spleen (4,5), which has led to the development of various predictive models for disease diagnosis and prognosis based on its severity (6-8). However, the diagnosis and grading of splenomegaly have historically been based on physical examination or imaging, leading to potentially subjective and inaccurate linear measurements that inadequately represent the three-dimensional (3D) complexity of the organ (4,9-11). Although 3D visualization allows direct volume assessment based on organ reconstruction, an individualized diagnosis and precise grading of splenomegaly require a standardized baseline, known as the standard splenic volume (SSV).

In previous studies, several formulas for estimating the individualized SSV have been developed (12-16), but their practical performance has proven less than satisfactory. This may be attributed to a variety of factors, such as differences among ethnic groups and the diversity of influencing factors (9,17,18). In addition, the development and validation of these models have mainly relied on small-sample, single-center studies, and the accuracy and applicability of the resulting formulas have not been thoroughly verified. Most studies have also relied on traditional analytical methods rather than more advanced machine learning (ML) techniques, which may limit predictive performance (19,20). Additionally, the available formulas lack interpretability, which is crucial for user comprehension and practical application (21). To improve diagnostic and therapeutic methodologies, it is necessary to develop an intelligent predictive model that combines efficiency, accuracy, and robust interpretability.

In the past decades, ML has enabled significant advances in medicine (22,23). By discerning latent patterns within expansive and intricate datasets, ML models utilize an array of sophisticated ensemble algorithms to devise potent predictive models (24-30). Despite these advancements, there is a notable absence of reports on ML models for the assessment of SSV. Therefore, this multicenter study was designed to develop and validate an ML-based model that utilizes spleen volume data derived from 3D visualization, to efficiently and accurately assess SSV. Furthermore, the study integrates interpretability and introduces a visual web-based calculator, thereby promoting comprehension and application among clinicians. Finally, this study successfully established personalized benchmarks for the assessment of splenomegaly, standardizing the evaluation process in the clinical context. We present this article in accordance with the TRIPOD+AI reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2954/rc).

Methods

Volunteers

This multicenter, retrospective study included volunteers who underwent upper abdominal computed tomography (CT) examinations at Zhujiang Hospital of Southern Medical University and Guangdong Second Provincial General Hospital with identical inclusion and exclusion criteria between June 2019 and May 2024. Volunteers from Zhujiang Hospital (real-world cohort 1) were randomly divided into a training cohort and an internal validation cohort at a ratio of 7:3, and volunteers from Guangdong Second Provincial General Hospital were included in the external validation cohort (real-world cohort 2) (Figure 1). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Review Board of Zhujiang Hospital (No. 2018-GDYK-003). Written informed consent was provided by each volunteer. The other participating hospital was informed and agreed with the study.
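The 7:3 random division of real-world cohort 1 can be illustrated with a minimal sketch. The seed, the rounding rule, and the function name `split_cohort` are assumptions made for illustration; the source does not report how the split was implemented or the exact size of each subset.

```python
import random

def split_cohort(ids, train_fraction=0.7, seed=42):
    """Randomly split volunteer IDs into training and internal validation sets.

    The seed and the rounding of the training-set size are illustrative
    assumptions, not details reported in the study.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical integer IDs standing in for the 511 volunteers of cohort 1.
cohort1_ids = list(range(511))
train_ids, internal_ids = split_cohort(cohort1_ids)
```

Under these assumptions every volunteer lands in exactly one subset, and fixing the seed makes the partition reproducible.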

Figure 1 Flow diagram of study population. ML_SSV: the ML-based model for standard splenic volume developed in this study. ML, machine learning.

Inclusion and exclusion criteria

Volunteers who met the following criteria were screened for this study: (I) age over 18 years; (II) complete upper abdominal CT imaging data with a slice thickness of 1 mm; and (III) medical records containing detailed clinical information. The exclusion criteria were as follows: (I) general conditions: (i) congenital or acquired developmental disorders; (ii) pregnancy within the preceding year; (iii) history of abdominal surgery; (iv) history of splenic trauma; (II) hematological disorders: (i) hemolytic anemia; (ii) thalassemia; (iii) idiopathic thrombocytopenic purpura; (iv) leukemia; (III) gastrointestinal system diseases: (i) liver cirrhosis and portal hypertension; (ii) thrombosis of the splenic artery, vein, and portal vein system; (iii) hepatic or pancreatic malignancies; (iv) acute or chronic pancreatitis; (v) pancreatic and splenic neoplasms; (IV) special medical history: (i) chronic illnesses such as nephritis, tuberculosis, cardiac disease, systemic or localized cancer, or persistent unexplained fever; (ii) infections with specific pathogens, including hepatitis B and C viruses, human immunodeficiency virus (HIV), Epstein-Barr virus, cytomegalovirus, and syphilis; (V) insufficient clinical data in medical records.

Data collection

Before starting the data collection process, all personnel involved in data collection were trained on the data extraction form. The medical records of eligible volunteers were carefully reviewed. Clinical data, including sex, age, anthropometric parameters, and laboratory test indicators were systematically extracted into standardized forms. To ensure the relevance and consistency of the data, only laboratory tests performed within 7 days before the abdominal CT examination were considered. Extreme outliers were re-evaluated by the designated attending physician to determine their validity and exclude data entry errors.
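The 7-day rule for laboratory tests amounts to a simple date filter, sketched below. The record layout, the function name `labs_within_window`, the example values, and the choice to treat both window boundaries as inclusive are assumptions; the source only states that tests performed within 7 days before the CT examination were considered.

```python
from datetime import date, timedelta

def labs_within_window(ct_date, lab_results, window_days=7):
    """Keep lab results dated within `window_days` before the CT date.

    Both ends of the window are treated as inclusive here, which is an
    assumption of this sketch.
    """
    earliest = ct_date - timedelta(days=window_days)
    return [r for r in lab_results if earliest <= r["date"] <= ct_date]

# Hypothetical records for one volunteer whose CT was on 2024-03-10.
example_labs = [
    {"test": "PLT", "date": date(2024, 3, 1), "value": 240.0},   # 9 days before CT
    {"test": "PLT", "date": date(2024, 3, 6), "value": 236.0},   # 4 days before CT
    {"test": "TBil", "date": date(2024, 3, 12), "value": 10.3},  # after CT
]
kept = labs_within_window(date(2024, 3, 10), example_labs)
```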

CT data collection protocol

Data acquisition: all participants underwent upper abdominal enhanced scans using a Philips Brilliance 256-slice spiral CT scanner (Philips, Amsterdam, The Netherlands). The contrast agents used were iopamidol injection (370 mg I/mL) or iopromide injection (370 mg I/mL).
Scan parameters: the voltage was set at 120 kV, the current at 200 mA, and the detector pitch at 0.984. The slice thickness and interval were both 5 mm, and the tube rotation time was 0.5 seconds per rotation.
Pre-scan preparation: participants were required to fast and avoid water for 4 hours before the scan. They also received breathing training in advance to reduce artifacts in abdominal organs caused by respiratory motion.
Scan procedure: all participants were in a supine position, and the scan was performed from the top of the diaphragm to the lower edge of the kidneys. First, a plain scan of the upper abdomen was conducted. Then, 80–100 mL of iopamidol injection or iopromide injection was administered intravenously through the antecubital vein at a rate of 5 mL per second using a power injector for enhanced scanning. The arterial phase of the enhanced scan was triggered automatically 8 seconds after the CT value of the abdominal aorta reached 100 Hounsfield units (HU) by tracking the contrast agent. The same principle was applied to the venous phase, but the delay time was changed to 60 seconds.
The raw CT dataset was then re-sliced into images with a slice thickness of 1 mm and saved in Digital Imaging and Communications in Medicine (DICOM) format.

Spleen volume obtained using 3D visualization software

The collected CT data were seamlessly integrated into the medical image processing system (IPS V3.1, Yorktal, China, Software Copyright No. 2022SR0796185). Initial calculations were performed using the preliminary ResUNet model, which is based on the no new U-Net (nnU-Net) and ResNet framework (31-33). This model was used to resample and normalize the CT data to yield preliminary results. Subsequently, the region of interest, specifically the area corresponding to the spleen, was extracted, followed by further resampling and standardization.

The next step involved fine segmentation to achieve a more detailed spleen segmentation result. To enhance accuracy, discrepancies in the model output were addressed by manually adjusting the spleen segmentation. The fine segmentation was performed by professional hepatobiliary surgeons with over 5 years of experience in the field. Finally, the 3D shape of the spleen was reconstructed using the marching cubes algorithm and its volume was calculated (Figure 2).
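Once a closed surface mesh of the spleen has been reconstructed, its volume can be obtained from the triangles alone. The sketch below uses the signed-tetrahedron sum (a discrete form of the divergence theorem). This technique is an assumption chosen for illustration; the source does not state how the software computes volume from the reconstruction. The cube mesh and the function name `mesh_volume_ml` are hypothetical test fixtures.

```python
def mesh_volume_ml(vertices, triangles):
    """Volume in mL of a closed, consistently oriented triangle mesh.

    `vertices` are (x, y, z) coordinates in mm; `triangles` are index triples.
    """
    total = 0.0
    for i, j, k in triangles:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c): six times the signed volume of
        # the tetrahedron formed by the origin and this triangle.
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0 / 1000.0  # mm^3 -> mL

# A 10 mm cube (1,000 mm^3 = 1 mL) with outward-facing triangles.
A = 10.0
CUBE_VERTICES = [
    (0, 0, 0), (A, 0, 0), (A, A, 0), (0, A, 0),
    (0, 0, A), (A, 0, A), (A, A, A), (0, A, A),
]
CUBE_TRIANGLES = [
    (0, 2, 1), (0, 3, 2),  # bottom (z = 0)
    (4, 5, 6), (4, 6, 7),  # top (z = A)
    (0, 1, 5), (0, 5, 4),  # front (y = 0)
    (3, 7, 6), (3, 6, 2),  # back (y = A)
    (0, 4, 7), (0, 7, 3),  # left (x = 0)
    (1, 2, 6), (1, 6, 5),  # right (x = A)
]
cube_volume = mesh_volume_ml(CUBE_VERTICES, CUBE_TRIANGLES)
```

The method requires a watertight mesh with consistent triangle orientation; holes or flipped faces would bias the result.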

Figure 2 Segmenting computed tomography data to reconstruct the spleen and calculating spleen volume using three-dimensional visualization software.

Statistical analysis

Volunteer data were presented as continuous variables or categorical variables. The Shapiro-Wilk test was used to assess whether the data conformed to normal distribution. For normally-distributed continuous variables, data were expressed as mean [standard deviation (SD)] and compared using the t-test. If the continuous variable did not conform to normal distribution, the Mann-Whitney U test was used and the results were presented as median [interquartile range (IQR)]. Categorical data were presented as numbers and frequencies and compared using the chi-square test. All statistical tests were two-sided, with P values <0.05 indicating statistical significance. The software SPSS 26.0 (IBM Corp., Armonk, NY, USA) was used in this study.

During data preprocessing, parameters with more than 20% missing values were excluded from the analysis, and the remaining missing data were imputed using the median. The spline interpolation method was used to handle outliers (34). The 19 variables, including sex, age, body weight (BW), body height (BH), body mass index (BMI), body surface area (BSA), white blood cells (WBC), red blood cells (RBC), hemoglobin (Hb), platelet count (PLT), alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin (TBil), albumin (ALB), creatinine (CRE), prothrombin time (PT), international normalized ratio (INR), fibrinogen (FG), and D-dimer (DDI), from the training cohort were included in principal component analysis to reduce the influence of noise during model training. Then, the selected variables from the training cohort were fed into 11 ML algorithms, including random forest (RF), light gradient boosting machine (LightGBM), gradient boosting regression (GBR), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), multilayer perceptron (MLP), support vector regression (SVR), decision tree (DT), lasso regression, ridge regression, and elastic net regression, to generate a pre-model of SSV prediction (Figure 3). To avoid overfitting, 5-fold cross-validation was used in the ML model building process. SHapley Additive exPlanations (SHAP) were used to assess the importance of each clinical feature and to quantitatively describe the overall relationship between spleen volume and all 15 features according to the pre-model (35). Based on the ranking of SHAP values, we selected the 10 variables with the highest feature importance for further work. Models were then rebuilt iteratively on successive combinations of these predictors, and the feature subset that yielded the optimal coefficient of determination (R²) across all combinations was taken as the stopping point.

Using this approach, we performed 1,023 feature selection iterations on each model (2^10 − 1, consistent with one model per non-empty subset of the 10 predictors) and finally selected the feature subset that produced the best R². To enhance the convenience and practicality of the predictive model, we retrained it exclusively on more readily available age and anthropometric parameters. Grid search was used to optimize the hyperparameters of the model. Based on the feature selection and hyperparameter tuning of the training set, the data of the internal and external validation cohorts were correspondingly transformed. The predictive performance was evaluated based on the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), R², and adjusted R². Residual analysis was used to further assess the performance of the model. To determine whether SHAP-based variable selection is superior to conventional modeling based on linear regression analysis, a formula based on stepwise multiple linear regression was developed for comparison. We also compared the model with existing formulas for predicting spleen volume. Python (version 3.8.0; Python Software Foundation, Wilmington, DE, USA) was used for all coding in this study. Finally, the web-based tools for SSV prediction were developed.
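Two steps of this pipeline lend themselves to a short sketch: the missing-data rule and the exhaustive search over feature subsets. The 1,023 iterations reported per model equal 2^10 − 1, the number of non-empty subsets of 10 predictors, and the sketch is built on that reading. All function names, the placeholder feature names `f1` to `f10`, the variable `XYZ`, and the toy scoring function are hypothetical. In practice the score would be the cross-validated R² of the chosen algorithm, which is not reproduced here.

```python
from itertools import combinations
from statistics import median

def drop_and_impute(columns, max_missing=0.20):
    """Drop variables with more than 20% missing values; median-impute the rest.

    `columns` maps a variable name to a list of values (None = missing).
    """
    cleaned = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        n_missing = len(values) - len(observed)
        if not observed or n_missing / len(values) > max_missing:
            continue  # variable excluded from the analysis
        fill = median(observed)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned

def non_empty_subsets(features):
    """Yield every non-empty subset of `features` as a tuple."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)

def best_subset(features, score_fn):
    """Return (best_score, best_subset, n_evaluated) over all non-empty subsets."""
    best_score, best, n_evaluated = float("-inf"), None, 0
    for subset in non_empty_subsets(features):
        n_evaluated += 1
        score = score_fn(subset)
        if score > best_score:
            best_score, best = score, subset
    return best_score, best, n_evaluated

# Hypothetical data: PLT has 20% missing (kept), XYZ has 40% missing (dropped).
raw = {
    "PLT": [240.0, None, 199.0, 291.0, 250.0],
    "XYZ": [None, None, 1.0, 2.0, 3.0],
}
cleaned = drop_and_impute(raw)

# Placeholder names for the 10 top-ranked predictors, with a toy score that
# rewards three "useful" features and slightly penalizes subset size.
top10 = [f"f{i}" for i in range(1, 11)]
useful = {"f1", "f3", "f4"}

def toy_score(subset):
    return len(useful & set(subset)) - 0.01 * len(subset)

score, chosen, n_evaluated = best_subset(top10, toy_score)
```

Exhaustive search is feasible here only because 10 predictors give 1,023 candidate models; the count doubles with every added predictor.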

Figure 3 Flowchart detailing the development of machine learning-based model for standard splenic volume. ML_SSV: the ML-based model for standard splenic volume developed in this study; ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. BMI, body mass index; BSA, body surface area; BW, body weight; ML, machine learning; SHAP, SHapley Additive exPlanations.

Results

Volunteer characteristics

A total of 1,186 volunteers underwent data screening, 564 of whom were excluded, so that 622 volunteers without diseases related to spleen volume were included in the analysis (Figure 1).

The baseline characteristics of real-world cohorts 1 (n=511) and 2 (n=111) are presented in Table 1. In real-world cohort 1, the mean (SD) age was 52.67 (12.85) years and 47.75% of the volunteers were male. The median (IQR) spleen volume was 139.99 (102.00–190.56) mL. In the real-world cohort 2, the mean (SD) age was 55.85 (12.40) years and 51.35% of the volunteers were male. The median (IQR) spleen volume was 154.40 (127.67–189.38) mL.

Table 1

Baseline characteristics of the study

Characteristics	Real-world cohort 1 (n=511)	Real-world cohort 2 (n=111)	P value
Age (years)	52.67±12.85	55.85±12.40	0.018
Male	244 (47.75)	57 (51.35)	0.492
BH (cm)	162.00 [157.00–170.00]	163.00 [158.00–171.00]	0.129
BW (kg)	62.00 [55.00–70.00]	62.00 [54.00–73.00]	0.411
BMI (kg/m²)	23.24 [21.23–25.61]	23.41 [21.24–26.27]	0.690
BSA (m²)	1.63 [1.52–1.75]	1.68 [1.54–1.80]	0.012
WBC (G/L)	6.42 [5.27–7.78]	6.15 [5.35–7.75]	0.768
RBC (T/L)	4.52 [4.14–4.93]	4.53 [4.15–4.96]	0.932
PLT (G/L)	240.50 [199.00–291.00]	242.00 [205.00–281.00]	0.918
ALB (g/L)	39.80 [37.52–42.40]	43.10 [38.10–45.10]	<0.001
Hb (g/L)	132.00 [122.00–142.00]	136.00 [124.00–148.00]	0.021
TBil (μmol/L)	10.30 [7.40–14.20]	15.70 [11.50–18.20]	<0.001
CRE (μmol/L)	72.00 [60.00–85.00]	70.60 [57.50–82.90]	0.529
ALT (IU/L)	19.00 [13.00–31.00]	25.00 [19.00–35.00]	<0.001
AST (IU/L)	19.00 [16.00–25.00]	21.00 [18.00–26.00]	0.006
PT (s)	12.00 [11.50–12.80]	12.80 [12.10–13.40]	<0.001
INR	1.00 [0.95–1.04]	0.99 [0.94–1.05]	0.308
FG (g/L)	3.07 [2.61–3.57]	3.40 [2.93–3.92]	<0.001
DDI (mg/L)	0.33 [0.22–0.64]	0.22 [0.13–0.46]	<0.001
Spleen volume (mL)	139.99 [102.00–190.56]	151.75 [127.67–189.38]	0.006

Values are presented as mean ± standard deviation, frequency (%), or median [interquartile range]. ALB, albumin; ALT, alanine aminotransferase; AST, aspartate aminotransferase; BH, body height; BMI, body mass index; BSA, body surface area; BW, body weight; CRE, creatinine; DDI, D-dimer; FG, fibrinogen; Hb, hemoglobin; INR, international normalized ratio; PLT, platelet count; PT, prothrombin time; RBC, red blood cells; TBil, total bilirubin; WBC, white blood cells.

Final model performance in spleen volume determination

After controlling for noise, the 15 variables in the training cohort were integrated into 11 ML algorithms to create a pre-model. Subsequently, the SHAP method was applied to evaluate the importance of each variable and to select the 10 with the highest feature importance (Figure 4). Iterative calculations over combinations of these features were then performed to obtain the optimal feature subset with the best predictive performance, which was used for the final model. The MSE of 11 ML models for determining the spleen volume ranged from 761.2 to 3,581.1 mL² in the training cohort, whereas the R² ranged from 0.31 to 0.83 (Table 2).

Figure 4 SHAP plots of 11 machine learning models. The impact of features for spleen volume determination in the random forest, light gradient boosting machine, gradient boosting regression, extreme gradient boosting, adaptive boosting, multilayer perceptron, support vector regression, decision tree, lasso regression, ridge regression, and elastic net regression. (A) Random forest; (B) light gradient boosting machine; (C) gradient boosting regression; (D) extreme gradient boosting; (E) adaptive boosting; (F) multilayer perceptron; (G) support vector regression; (H) decision tree; (I) lasso regression; (J) ridge regression; (K) elastic net regression. ALB, albumin; BH, body height; BMI, body mass index; BSA, body surface area; BW, body weight; DDI, D-dimer; FG, fibrinogen; INR, international normalized ratio; PLT, platelet count; PT, prothrombin time; RBC, red blood cells; SHAP, SHapley Additive exPlanations; TBil, total bilirubin; WBC, white blood cells.

Table 2

The optimal feature subsets and predictive performance of 11 ML algorithms during training process

ML models	Best feature subset	MSE (mL²)	R²
RF	‘Age’, ‘BW’, ‘BMI’, ‘BSA’, ‘RBC’, ‘PLT’, ‘TBil’, ‘FG’, ‘DDI’	801.6	0.82
LightGBM	‘Age’, ‘BW’, ‘BMI’, ‘BSA’, ‘WBC’, ‘PLT’, ‘TBil’, ‘ALB’, ‘FG’, ‘DDI’	1,998.1	0.63
GBR	‘Age’, ‘BW’, ‘BMI’, ‘RBC’, ‘PLT’, ‘TBil’, ‘ALB’, ‘FG’, ‘DDI’	2,832.4	0.49
XGBoost	‘Age’, ‘BW’, ‘RBC’, ‘PLT’, ‘TBil’, ‘ALB’, ‘PT’	2,062.7	0.63
AdaBoost	‘Age’, ‘BMI’, ‘BSA’, ‘RBC’, ‘PLT’, ‘PT’, ‘FG’, ‘DDI’	2,847.2	0.42
MLP	‘Age’, ‘BH’, ‘BW’, ‘BMI’, ‘WBC’, ‘PLT’, ‘TBil’, ‘ALB’, ‘FG’	3,423.4	0.38
SVR	‘Age’, ‘BW’, ‘BMI’, ‘PLT’, ‘TBil’, ‘ALB’, ‘FG’	3,581.1	0.35
DT	‘Age’, ‘BH’, ‘BW’, ‘BSA’, ‘RBC’, ‘PLT’, ‘TBil’, ‘FG’, ‘DDI’	2,303.7	0.58
Lasso	‘Age’, ‘BW’, ‘PLT’, ‘TBil’	3,544.7	0.36
Ridge	‘Age’, ‘BH’, ‘BW’, ‘BMI’, ‘WBC’, ‘RBC’, ‘PLT’, ‘TBil’, ‘FG’, ‘sex’	2,873.3	0.31
Elastic net	‘Age’, ‘BH’, ‘BW’, ‘BMI’, ‘PLT’, ‘TBil’, ‘ALB’, ‘PT’, ‘FG’, ‘sex’	3,409.6	0.38

AdaBoost, adaptive boosting; ALB, albumin; BH, body height; BMI, body mass index; BSA, body surface area; BW, body weight; DDI, D-dimer; DT, decision tree; Elastic Net, elastic net regression; FG, fibrinogen; GBR, gradient boosting regression; Lasso, lasso regression; LightGBM, light gradient boosting machine; ML, machine learning; MLP, multilayer perceptron; MSE, mean squared error; PLT, platelet count; PT, prothrombin time; R², coefficient of determination; RBC, red blood cells; RF, random forest; Ridge, ridge regression; SVR, support vector regression; TBil, total bilirubin; XGBoost, extreme gradient boosting.

Finally, the RF model containing nine features (‘Age’, ‘BW’, ‘BMI’, ‘BSA’, ‘RBC’, ‘PLT’, ‘TBil’, ‘FG’, ‘DDI’) was selected as the best ML model for predicting the SSV (ML_SSV). The MSE, RMSE, MAE, R², and adjusted R² values of ML_SSV in the training cohort were 801.6 mL², 28.3 mL, 20.2 mL, 0.82, and 0.82, respectively. In the internal validation cohort, the corresponding values were 937.7 mL², 30.6 mL, 22.1 mL, 0.83, and 0.82. In the external validation cohort, the RMSE and R² values of ML_SSV were 22.6 mL and 0.80, respectively (Table 3). A comparison plot of predicted versus actual values and a residual plot were used to further evaluate the predictive performance of ML_SSV in external validation (Figure 5A-5C). The residuals of ML_SSV in external validation ranged from −58.32 to 67.01 mL, and the residual distribution was normal (P=0.201).

Table 3

The predictive performance of ML_SSV and ML_SSVa

Models	MSE (mL²)	RMSE (mL)	MAE (mL)	R²	Adj R²
ML_SSV
Training cohort	801.6	28.3	20.2	0.82	0.82
Internal validation cohort	937.7	30.6	22.1	0.83	0.82
External validation cohort	511.5	22.6	17.7	0.80	0.79
ML_SSVa
Training cohort	1,122.8	33.5	24.0	0.75	0.75
Internal validation cohort	1,369.1	37.0	26.5	0.75	0.74
External validation cohort	1,295.1	36.0	26.0	0.70	0.69

ML_SSV: the ML-based model for standard splenic volume developed in this study; ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. Adj R², adjusted R²; MAE, mean absolute error; ML, machine learning; MSE, mean squared error; R², coefficient of determination; RMSE, root mean squared error.
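The five metrics in Table 3 follow from standard definitions, sketched below. The adjusted R² uses the common form 1 − (1 − R²)(n − 1)/(n − p − 1); the source does not state which number of predictors p entered its calculation, so that argument is an assumption left to the caller. The function name and the example values are hypothetical.

```python
from math import sqrt

def regression_metrics(y_true, y_pred, n_predictors):
    """Return MSE, RMSE, MAE, R2, and adjusted R2 for paired observations."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - (mse * n) / ss_tot                         # 1 - SS_res / SS_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "R2": r2, "AdjR2": adj_r2}

# Hypothetical spleen volumes (mL), not study data.
example = regression_metrics(
    y_true=[100.0, 150.0, 200.0, 250.0],
    y_pred=[110.0, 140.0, 210.0, 240.0],
    n_predictors=1,
)
```

Because RMSE is the square root of MSE, the two columns of Table 3 can be cross-checked directly (for example, the square root of 511.5 mL² is about 22.6 mL). Under this definition R² becomes negative whenever a formula predicts worse than simply using the cohort mean.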

Figure 5 Predictive performance of ML_SSV and ML_SSVa in external validation cohort. (A) Plot of predicted versus actual values of spleen volume assessed by ML_SSV. (B) Residual plot for spleen volumes assessed by ML_SSV. (C) Quantile-quantile plot of residuals for spleen volumes assessed by ML_SSV. (D) Plot of predicted versus actual spleen volumes as assessed by ML_SSVa. (E) Residual plot for spleen volumes assessed by ML_SSVa. (F) Quantile-quantile plot of residuals for spleen volumes assessed by ML_SSVa. ML_SSV: the ML-based model for standard splenic volume developed in this study; ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. ML, machine learning.

Expansion of model applicability

To improve the model’s applicability, we eliminated complex indicators that require invasive testing, and retrained the final RF model using age, BW, BMI, and BSA. The predictive performance of the expanded ML_SSV (ML_SSVa) obtained from training is shown in Table 3, with the external validation showing MSE, RMSE, and MAE values of 1,295.1 mL², 36.0 mL, and 26.0 mL, respectively. The R² and adjusted R² of ML_SSVa for the external validation were 0.70 and 0.69, respectively. The predictive performance of ML_SSVa was further supported by plots of predicted versus actual values and by residual plots, which indicated acceptable predictive accuracy (Figure 5D-5F).

Final model explanation

The SHAP method assigns importance values to features, encompassing both global model interpretation at the feature level and local interpretation at the individual level to assess the final model’s output. In the global interpretation, the SHAP summary plot of ML_SSV (Figure 6A) combines importance and effect graphs to show the distribution of SHAP values for each feature, helping us understand the contribution of each feature to the model prediction. The SHAP heatmap of ML_SSV (Figure 6B) can be used to explore the importance of features and their interrelationships in complex models, thereby improving the interpretability of the model. In terms of local interpretation, force plots illustrate the contribution of each feature to the predicted value of a sample (Figure 6C-6E). For example, in a sample SHAP force plot of the internal validation cohort (Figure 6D), the baseline denotes the model’s expected value. The length of each bar reflects the corresponding feature’s influence on the prediction. Red bars denote a positive impact, increasing the prediction, whereas blue bars denote a negative effect, reducing the predicted value. Summing the baseline with all feature contributions gives the final ML_SSV-predicted splenic volume, which in this example is 113.84 mL. The corresponding explanation of ML_SSVa is shown in Figure 7.
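The additive reading of a force plot (baseline plus all feature contributions equals the prediction) can be checked on a toy example. The sketch below computes exact Shapley values by enumerating coalitions. It is not the SHAP library and not the study's RF model. The toy model, its coefficients, the sample, and the single reference point used as the baseline are all hypothetical, and practical SHAP explainers average over a background dataset instead of one reference point.

```python
from itertools import combinations
from math import factorial

def exact_shapley(predict, x, reference):
    """Exact Shapley values of one sample `x` relative to a `reference` point.

    Features outside a coalition take their `reference` value.
    Returns (baseline, contributions).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else reference[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        contributions.append(phi)
    return value(set()), contributions

def toy_model(v):
    """Hypothetical volume model with one interaction term (illustration only)."""
    bsa, age, bw = v
    return 100.0 * bsa - 1.0 * age + 0.01 * bsa * bw

sample = [1.8, 40.0, 70.0]     # hypothetical BSA (m^2), age (years), BW (kg)
reference = [1.6, 50.0, 60.0]  # hypothetical reference individual
base_value, contributions = exact_shapley(toy_model, sample, reference)
prediction = toy_model(sample)
```

Here `base_value + sum(contributions)` reproduces `prediction` up to floating-point error, which is the property a force plot visualizes.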

Figure 6 Summary SHAP plots: global and local interpretation of ML_SSV. (A) SHAP summary plot, with divergence on the x-axis representing the impact on model output, and colors indicating low (blue) to high (red) values of predictors. (B) SHAP heatmap, where SHAP values measure the contribution of each feature to model predictions, with positive (red) and negative (blue) influences. Darker shades denote greater impact, and features are ranked by their average effect on model predictions. (C-E) SHAP force plots, with the baseline indicating the expected model value. The length of each bar represents the feature’s impact on the prediction, with red indicating a positive effect and blue a negative one. The sum of the baseline and all feature contributions yields the final predicted spleen volume for an individual. ML_SSV: the ML-based model for standard splenic volume developed in this study. BMI, body mass index; BSA, body surface area; BW, body weight; DDI, D-dimer; FG, fibrinogen; ML, machine learning; PLT, platelet count; RBC, red blood cells; SHAP, SHapley Additive exPlanations; TBil, total bilirubin.

Figure 7 Summary SHAP plots: global and local interpretation of ML_SSVa. (A) SHAP summary plot, with divergence on the x-axis representing the impact on model output, and colors indicating low (blue) to high (red) values of predictors. (B) SHAP heatmap, where SHAP values quantify the contribution of each feature to model predictions, with positive (red) and negative (blue) influences. Darker colors indicate greater impact, and features are ranked by their average impact on model predictions. (C-E) SHAP force plots, with the baseline representing the expected value of the model. The length of each bar reflects the impact of the corresponding feature on the prediction. Red bars indicate a positive influence, which increases the prediction; blue bars indicate a negative influence, which decreases the prediction. The sum of the baseline and all feature contributions yields the final predicted spleen volume for an individual. ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. BMI, body mass index; BSA, body surface area; BW, body weight; ML, machine learning; SHAP, SHapley Additive exPlanations.

Convenience of application

For convenience, the final prediction models were implemented as web applications. When the actual values of the nine features required by ML_SSV or the four features required by ML_SSVa are entered, the application automatically predicts the spleen volume of the patient. In addition, a SHAP feature importance plot showing the contribution of each feature to the predicted spleen volume is displayed, intuitively explaining the factors that influence the prediction. The ML_SSV and ML_SSVa web applications can be accessed online at https://mlssv.vip.cpolar.cn and https://mlssva.vip.cpolar.cn, respectively (Figure 8).

Figure 8 Web applications for ML_SSV and ML_SSVa. (A) Upon entering the actual values for the nine features required by ML_SSV, the application automatically provides the patient’s predicted spleen volume along with the corresponding SHAP feature importance plot. (B) After inputting the actual values for the four features needed for ML_SSVa, the application automatically provides the patient’s predicted spleen volume and the associated SHAP feature importance plot. ML_SSV: the ML-based model for standard splenic volume developed in this study; ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. BMI, body mass index; BSA, body surface area; BW, body weight; DDI, D-dimer; FG, fibrinogen; ML, machine learning; PLT, platelet count; RBC, red blood cells; SHAP, SHapley Additive exPlanations; TBil, total bilirubin.

Comparison with a stepwise multiple linear regression formula and existing formulas

We entered the same variables used to develop ML_SSV into a stepwise multiple linear regression and derived the following formula for SSV: 126.964 × BSA − 1.527 × Age − 0.185 × PLT + 1.026 × TBil + 57.778. The performance of the ML-based spleen volume prediction models developed in this study was compared with that of this stepwise multiple linear regression formula and of other existing formulas (Table 4). In the external validation cohort, the stepwise multiple linear regression formula yielded an MSE of 1,490.7 mL², an RMSE of 38.6 mL, an MAE of 27.5 mL, an R² of 0.60, and an adjusted R² of 0.57; both ML-based models, ML_SSV and ML_SSVa, performed better on all of these metrics. The other existing formulas also performed worse than ML_SSV; even the best-performing one, published by Yang et al. (16), yielded an MSE of 1,758.8 mL², an R² of 0.51, and an adjusted R² of 0.46.
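The error metrics quoted above follow standard definitions, and the sketch below shows one way they could be computed from measured and predicted volumes. The function name `regression_metrics`, the toy volumes, and the adjusted R² definition 1 − (1 − R²)(n − 1)/(n − p − 1) are assumptions made for illustration; the exact implementation used in the study is not stated in this section. The sketch also illustrates why R² can be negative, as it is for several formulas in Table 4: a formula that predicts worse than the cohort mean has a residual sum of squares larger than the total sum of squares.

```python
from math import sqrt


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, MAE, R^2 and adjusted R^2 for paired values.

    n_features is the number of predictors p used by the model; it only
    affects the adjusted R^2 (assumed definition, see lead-in).
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual SS
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total SS
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "R2": r2, "AdjR2": adj_r2}


# Hypothetical measured and predicted spleen volumes (mL), not study data.
measured = [100.0, 150.0, 200.0, 250.0, 300.0]
predicted = [110.0, 140.0, 210.0, 240.0, 310.0]
print(regression_metrics(measured, predicted, n_features=1))

# A constant, badly biased prediction gives a negative R^2.
print(regression_metrics(measured, [400.0] * 5, n_features=1)["R2"])
```

Because RMSE is the square root of MSE, the two columns of Table 4 can be cross-checked against each other (for example, √511.5 ≈ 22.6 for ML_SSV).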

Table 4

Performance of ML_SSV, ML_SSVa, and previously reported SSV formulas in the external validation cohort

SSV formulas	MSE (mL²)	RMSE (mL)	MAE (mL)	R²	Adj R²
ML_SSV	511.5	22.6	17.7	0.80	0.79
ML_SSVa	1,295.1	36.0	26.0	0.70	0.69
Nakamura (12)	2,379.3	48.8	41.4	−0.11	−0.21
Asghar (13)	2,440.2	49.4	37.1	−0.14	−0.24
Harris (14)	3,543.6	59.5	49.8	−0.65	−0.80
Watanabe 1 (15)	1,156.0	34.0	27.7	0.46	0.41
Watanabe 2 (15)	9,268.5	96.3	90.9	−3.33	−3.71
Yang (16)	1,758.8	41.9	29.8	0.51	0.46
Linear formula	1,490.7	38.6	27.5	0.60	0.57

ML_SSV: the ML-based model for standard splenic volume developed in this study; ML_SSVa: the ML-based model for standard spleen volume developed only with age and anthropometric variables. Nakamura’s formula: 177.7 × BSA₁ − 179.9, BSA₁ = BH^0.725 × BW^0.425 × 0.007184; Asghar’s formula: 6.965 × BH − 961.04; Harris’s formula: BSA₂ × [278 × Age^(−0.36)], BSA₂ = [(BH × BW)/3600]^0.5; Watanabe 1’s formula: 6.516 × BW^0.797; Watanabe 2’s formula: BW × 4.473 × e^(−0.026 × Age); Yang’s formula: 188.813 × BSA − 140.981. Linear formula: the formula of stepwise multiple linear regression analysis in this study, 126.964 × BSA − 1.527 × Age − 0.185 × PLT + 1.026 × TBil + 57.778. Adj R², adjusted R²; BH, body height; BMI, body mass index; BSA, body surface area; BW, body weight; MAE, mean absolute error; ML, machine learning; MSE, mean squared error; R², coefficient of determination; RMSE, root mean squared error; SSV, standard splenic volume.
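For readers who wish to reproduce the comparison, the formulas listed in the Table 4 footnote can be transcribed directly, as in the minimal sketch below. All function names are hypothetical. The units are assumptions: BH in cm and BW in kg (the usual inputs for the two BSA expressions shown), age in years, and PLT and TBil in whatever units the study used, which this section does not state. Watanabe 2 is read as BW × 4.473 × e^(−0.026 × Age), and the BSA passed to Yang’s formula and to the linear formula is left as an input because the footnote does not specify which BSA definition they use.

```python
from math import exp, sqrt


def bsa_1(bh_cm, bw_kg):
    """BSA1 from the footnote: BH^0.725 x BW^0.425 x 0.007184."""
    return 0.007184 * bh_cm ** 0.725 * bw_kg ** 0.425


def bsa_2(bh_cm, bw_kg):
    """BSA2 from the footnote: [(BH x BW)/3600]^0.5."""
    return sqrt(bh_cm * bw_kg / 3600.0)


def ssv_nakamura(bh_cm, bw_kg):
    return 177.7 * bsa_1(bh_cm, bw_kg) - 179.9


def ssv_asghar(bh_cm):
    return 6.965 * bh_cm - 961.04


def ssv_harris(bh_cm, bw_kg, age):
    return bsa_2(bh_cm, bw_kg) * 278.0 * age ** -0.36


def ssv_watanabe_1(bw_kg):
    return 6.516 * bw_kg ** 0.797


def ssv_watanabe_2(bw_kg, age):
    # Assumed reading of "4.473e^-0.026Age" as 4.473 * exp(-0.026 * age).
    return bw_kg * 4.473 * exp(-0.026 * age)


def ssv_yang(bsa):
    return 188.813 * bsa - 140.981


def ssv_linear(bsa, age, plt, tbil):
    """Stepwise multiple linear regression formula reported in this study."""
    return 126.964 * bsa - 1.527 * age - 0.185 * plt + 1.026 * tbil + 57.778


# Hypothetical adult (not study data): 170 cm, 65 kg, 40 years.
print(round(ssv_nakamura(170.0, 65.0), 1), round(ssv_asghar(170.0), 1))
```

These transcriptions only evaluate the published expressions; they do not reproduce the ML_SSV or ML_SSVa models, which are not closed-form formulas.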

Discussion

In this study, after evaluating 11 ML algorithms, we developed and validated an interpretable ML model to predict the SSV in the Chinese population, with the aim of providing a baseline for the assessment of abnormal spleen volume. This ML_SSV model exhibited accuracy and stability across training, internal validation, and external validation cohorts, with R² values greater than 0.82. Additionally, we refined the model by retaining only age and anthropometric parameters, thereby developing the ML_SSVa model with improved practical utility. We identified a series of easily extractable variables to construct the SSV prediction model, which showed improved accuracy compared with a series of previously published formulas. Because of the close anatomical interplay between the liver and spleen, conditions such as cirrhosis and portal hypertension are closely linked to spleen volume, and establishing normative spleen volume values facilitates a more personalized and standardized diagnostic approach. Deriving precise spleen volume estimates from readily available data is essential for clinicians making decisions about conditions such as splenomegaly, cirrhosis, and portal hypertension. It also has potential applications in liver and spleen resection or repair surgeries, as well as in liver and spleen transplantation procedures. For the general public, it can serve as a preliminary tool for self-assessment of health status, potentially encouraging individuals to adopt a healthier lifestyle. Our model is based entirely on easily available predictors of spleen volume, which can avoid the need for radiological examinations such as CT scans, thereby reducing iatrogenic radiation exposure as well as costs. This makes the model broadly applicable across various medical and community settings.

Previous studies have proposed formulas for estimating spleen volume based on linear regression of local demographic data. However, in the external validation cohort of this study, even the best-performing prediction formula, proposed by Yang et al. (16), achieved an MSE of 1,758.8 mL², an R² of 0.51, and an adjusted R² of 0.46. By contrast, the ML_SSV achieved corresponding values of 511.5 mL², 0.80, and 0.79, indicating that it substantially outperformed the available formulas in terms of both accuracy and goodness of fit. It should be noted that many formulas were developed in single-center studies with small sample sizes, which introduces significant limitations such as selection bias and poor generalizability. In this study, the inclusion of a multi-center study population for model construction and validation was conducive to more accurate and effective results.

Improvements in research methodology have promoted the progress of ML models. Through rigorous feature selection and hyperparameter tuning, including a combination of various methods for selecting effective features, as well as model optimization through grid search and 5-fold cross-validation, we identified an RF algorithm model that consistently demonstrated optimal performance in two real-world cohorts from diverse sources, offering higher accuracy and applicability in estimating SSV. Compared to traditional linear analysis, advanced ML algorithms can create more efficient and accurate models for assessing SSVs by uncovering hidden connections in the dataset (36).
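The tuning procedure mentioned above, a grid search scored by 5-fold cross-validation, can be sketched in a few lines. This is an illustration under stated assumptions and not the study’s pipeline: to stay dependency-free it swaps the RF model for a simple k-nearest-neighbour regressor on one feature, the data are synthetic, and every name (`k_fold_indices`, `knn_predict`, `cv_mse`, `grid_search`) is hypothetical.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle the sample indices once and deal them into n_folds folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, x, k):
    """Mean target of the k training points closest to x (one feature)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_mse(xs, ys, k, folds):
    """Average held-out MSE over the folds for one hyperparameter value."""
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        tr_x = [xs[i] for i in range(len(xs)) if i not in held]
        tr_y = [ys[i] for i in range(len(ys)) if i not in held]
        errs = [(ys[i] - knn_predict(tr_x, tr_y, xs[i], k)) ** 2 for i in held_out]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / len(fold_errors)


def grid_search(xs, ys, grid, n_folds=5):
    """Score every candidate on the same folds and return the best one."""
    folds = k_fold_indices(len(xs), n_folds)
    scores = {k: cv_mse(xs, ys, k, folds) for k in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data: a noisy linear trend standing in for volume vs. weight.
rng = random.Random(1)
weights = [rng.uniform(40.0, 100.0) for _ in range(60)]
volumes = [2.0 * w + rng.gauss(0.0, 10.0) for w in weights]
best_k, cv_scores = grid_search(weights, volumes, grid=[1, 3, 5, 9, 15])
print("best k:", best_k)
```

Scoring every candidate on the same folds keeps the comparison fair, and the held-out folds give an estimate of performance on unseen data before any external validation.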

Another advantage of our model is its reliance on artificial intelligence algorithms for 3D visualization and segmentation of the spleen, which is essential given the irregular shape of the organ. Traditional anatomical landmarks or linear measurements obtained through physical examination and imaging modalities such as ultrasound or CT cannot accurately determine the volume of the spleen (37). The subjectivity of different radiologists or surgeons in measuring the spleen’s dimensions on two-dimensional (2D) images further complicates the issue. The unpredictable nature of splenic volume makes the spleen index formula derived from CT data less than ideal for volume assessment. The advent of computer-assisted automatic segmentation algorithms has facilitated the acceptance of fully automated organ segmentation and volume assessment in CT imaging (38). In this study, we utilized the ResUNet artificial intelligence model combined with manual correction of automatic segmentation to convert CT data into a complete and accurate 3D splenic model. This approach allows for convenient, precise, and objective determination of splenic volume, effectively mitigating errors associated with unreliable measurements.

Since it is difficult for clinicians and the public to accept predictive models that cannot be directly interpreted and understood, the SHAP method was used to assign feature importance values to interpret the output of the final model, which can help clinicians and patients better understand how the predictive model works. In our study, the global interpretation describes the overall functionality of the model and allows visualization of the impact of features on the model’s predictive results. By contrast, local explanations analyze how certain predictions are made for specific individuals by combining personalized input data. Physiological features are often considered closely related to the condition of internal organs, enabling them to explain and define certain phenomena. Using the interpretability of SHAP, it is clear that BW and age are the most influential predictors of SSV. Global interpretability visualization showed that BW had a positive influence, whereas age was a negative correlate, consistent with previous findings (13,39,40). Considering the economic and health benefits for the target population, we reconfigured the model used for SSV prediction (ML_SSVa) to exclude invasive and complex blood test metrics. Encouragingly, this simplified ML_SSVa showed considerable accuracy and consistency in both the training and validation cohorts, albeit at slightly lower levels compared with the full indicator prediction model. Notably, disease states usually lead to pathological changes of blood-related factors, making ML_SSVa, which contains only demographic variables, somewhat more reliable. This certainly provides more options and convenience for clinicians and the public.
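The additive logic behind the SHAP explanations can be illustrated with a brute-force Shapley computation on a toy model. The sketch below is not the SHAP library’s algorithm and not the study’s model: it enumerates all feature subsets, replaces "absent" features with a single baseline value (an assumed simplification), and uses a made-up two-feature function. The names `shapley_values` and `toy_model` and all coefficients are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline.

    Enumerates every subset of the other features, so the cost grows
    exponentially with the number of features; suitable for toy cases only.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Made-up volume model: rises with body weight, falls with age."""
    bw, age = z
    return 100.0 + 1.5 * bw - 0.8 * age


x = [70.0, 50.0]          # hypothetical individual: BW 70 kg, age 50 years
baseline = [60.0, 40.0]   # hypothetical reference individual
phi = shapley_values(toy_model, x, baseline)
print("contributions:", phi)
print("baseline + contributions:", toy_model(baseline) + sum(phi))
print("model prediction:", toy_model(x))
```

The last two printed values agree, which is the property a force plot visualizes: the baseline output plus the per-feature contributions equals the individual prediction.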

Smart electronic devices and online health records have revolutionized healthcare, making the implementation of relatively complex diagnostic models in routine clinical practice feasible. This has created potential for the routine use of artificial intelligence models. In this study, we have developed free and open-access portable calculators to assist medical staff and the public in obtaining SSV values. The SHAP feature importance plots generated in the calculator can better facilitate the understanding of the effect of different features on spleen volume. The SSV obtained using our model can provide personalized recommendations for the diagnosis and management of conditions such as splenomegaly and liver cirrhosis. Furthermore, for patients requiring hepatic and splenic resection or transplantation surgery, clinicians can conduct more informed surgical planning and risk prediction based on the SSV, thus avoiding unnecessary medical errors and providing optimized disease management.

We acknowledge several potential limitations. First, the data from the multicenter cohort were obtained retrospectively, which may have led to selection bias. Although strict inclusion and exclusion criteria may mitigate this shortcoming, further studies should be conducted to validate the performance of ML_SSV in predicting spleen volumes in different populations. Second, ML_SSV was developed using data from Chinese volunteers and needs further validation in different ethnic groups to ensure its generalizability. Third, ML_SSV may not be applicable to the pediatric population, as the included volunteers were over the age of 18 years. In future work, we will expand the sample to ensure the model’s applicability to a wider population. Finally, although SSV exhibits a certain correlation with the diagnosis and prognosis of hepatosplenic disorders, more in-depth investigation is warranted to ascertain the precise magnitude of its influence and its concrete clinical value.

Conclusions

We have developed an interpretable ML-based model for determining the SSV in adult populations. In the external validation cohort, the new model was more accurate than the previously reported formulas evaluated. The ML-based model and web calculators, which are applicable in various scenarios, provide a more accurate, radiation-free method for conveniently assessing the SSV, thereby facilitating individualized diagnosis of splenomegaly and supporting related clinical decisions.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD+AI reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2954/rc

Funding: This study was supported by the National Natural Science Foundation of China (grant No. 82272132); Noncommunicable Chronic Diseases-National Science and Technology Major Project (grant No. 2024ZD0525400); China Postdoctoral Science Foundation (grant Nos. 2022M721514 and 2024T170386); Guangdong Basic and Applied Basic Research Foundation (grant Nos. 2021A1515011869, 2023A1515110602, and 2024A1515012051); Regional Joint Fund of Guangdong (Guangdong-Hong Kong-Macao Research Team Project) (grant No. 2021B1515130003); Key Research and Development Plan Project of Guangzhou (grant No. 2023B03J1246); and Shenzhen Science and Technology Program (grant No. JCYJ20220818101401003).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2954/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Review Board of Zhujiang Hospital (No. 2018-GDYK-003). Written informed consent was obtained from each volunteer. The other participating hospital was informed and agreed with the study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

Liu G, Fan Y. Feasibility and Safety of Laparoscopic Partial Splenectomy: A Systematic Review. World J Surg 2019;43:1505-18. [Crossref] [PubMed]
Pinto VM, Gianesin B, Piel FB, Longo F, Rigano P, Quota A, Spadola V, Graziadei G, Mazzi F, Cappellini MD, Maggio A, Piga A, De Franceschi L, Forni GL. Morbidity and mortality of sickle cell disease patients is unaffected by splenectomy: evidence from three decades of follow-up in a high-income setting. Haematologica 2023;108:1158-62. [Crossref] [PubMed]
Yoshizumi T, Itoh S, Shimokawa M, Inokuchi S, Harada N, Takeishi K, Mano Y, Yoshiya S, Kurihara T, Nagao Y, Ikegami T, Soejima Y, Mori M. Simultaneous splenectomy improves outcomes after adult living donor liver transplantation. J Hepatol 2021;74:372-9. [Crossref] [PubMed]
Aldulaimi S, Mendez AM. Splenomegaly: Diagnosis and Management in Adults. Am Fam Physician 2021;104:271-6.
Heo S, Lee SS, Choi SH, Kim DW, Park HJ, Kim SY, Lee SJ, Kim KM, Shin YM. CT Rule-in and Rule-out Criteria for Clinically Significant Portal Hypertension in Chronic Liver Disease. Radiology 2023;309:e231208. [Crossref] [PubMed]
Yu Q, Xu C, Li Q, Ding Z, Lv Y, Liu C, Huang Y, Zhou J, Huang S, Xia C, Meng X, Lu C, Li Y, Tang T, Wang Y, Song Y, Qi X, Ye J, Ju S. Spleen volume-based non-invasive tool for predicting hepatic decompensation in people with compensated cirrhosis (CHESS1701). JHEP Rep 2022;4:100575. [Crossref] [PubMed]
Lee CM, Lee SS, Choi WM, Kim KM, Sung YS, Lee S, Lee SJ, Yoon JS, Suk HI. An index based on deep learning-measured spleen volume on CT for the assessment of high-risk varix in B-viral compensated cirrhosis. Eur Radiol 2021;31:3355-65. [Crossref] [PubMed]
Romero-Cristóbal M, Clemente-Sánchez A, Ramón E, Téllez L, Canales E, Ortega-Lobete O, Velilla-Aparicio E, Catalina MV, Ibáñez-Samaniego L, Alonso S, Colón A, Matilla AM, Salcedo M, Albillos A, Bañares R, Rincón D. CT-derived liver and spleen volume accurately diagnose clinically significant portal hypertension in patients with hepatocellular carcinoma. JHEP Rep 2023;5:100645. [Crossref] [PubMed]
Chow KU, Luxembourg B, Seifried E, Bonig H. Spleen Size Is Significantly Influenced by Body Height and Sex: Establishment of Normal Values for Spleen Size at US with a Cohort of 1200 Healthy Individuals. Radiology 2016;279:306-13. [Crossref] [PubMed]
Kucybała I, Ciuk S, Tęczar J. Spleen enlargement assessment using computed tomography: which coefficient correlates the strongest with the real volume of the spleen? Abdom Radiol (NY) 2018;43:2455-61. [Crossref] [PubMed]
Linguraru MG, Sandberg JK, Jones EC, Summers RM. Assessing splenomegaly: automated volumetric analysis of the spleen. Acad Radiol 2013;20:675-84. [Crossref] [PubMed]
Nakamura S, Takahara T, Hasegawa Y, Katagiri H, Kanno S, Akiyama Y, Iwaya T, Nitta H, Otsuka K, Koeda K, Sasaki A. Establishment of a method for calculating standard splenic volume and its use in the evaluation of functional hepatic reserve. Biomedical Research 2018;29:1459-64.
Asghar A, Agrawal D, Yunus SM, Sharma PK, Zaidi SH, Sinha A. Standard Splenic Volume Estimation in North Indian Adult Population: Using 3D Reconstruction of Abdominal CT Scan Images. Anat Res Int 2011;2011:707325. [Crossref] [PubMed]
Harris A, Kamishima T, Hao HY, Kato F, Omatsu T, Onodera Y, Terae S, Shirato H. Splenic volume measurements on computed tomography utilizing automatically contouring software and its relationship with age, gender, and anthropometric parameters. Eur J Radiol 2010;75:e97-101. [Crossref] [PubMed]
Watanabe Y, Todani T, Noda T, Yamamoto S. Standard splenic volume in children and young adults measured from CT images. Surg Today 1997;27:726-8. [Crossref] [PubMed]
Yang LB, Xu JY, Tantai XX, Li H, Xiao CL, Yang CF, Zhang H, Dong L, Zhao G. Non-invasive prediction model for high-risk esophageal varices in the Chinese population. World J Gastroenterol 2020;26:2839-51. [Crossref] [PubMed]
Lee HW, Park HS, Park S, Yu MH, Kim YJ, Jung SI. Discrepancies in Splenic Size Measurement: A Comparative Analysis of Ultrasound and Computed Tomography. Diagnostics (Basel) 2024;14:789. [Crossref] [PubMed]
Mustapha Z, Tahir A, Tukur M, Bukar M, Lee WK. Sonographic determination of normal spleen size in an adult African population. Eur J Radiol 2010;75:e133-5. [Crossref] [PubMed]
Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44-56. [Crossref] [PubMed]
Norgeot B, Quer G, Beaulieu-Jones BK, Torkamani A, Dias R, Gianfrancesco M, Arnaout R, Kohane IS, Saria S, Topol E, Obermeyer Z, Yu B, Butte AJ. Minimum information about clinical artificial intelligence modeling: the MI-CLAIM checklist. Nat Med 2020;26:1320-4. [Crossref] [PubMed]
Lipton ZC. The mythos of model interpretability: in machine learning, the concept of interpretability is both important and slippery. Queue 2018;16:31-57.
Swanson K, Wu E, Zhang A, Alizadeh AA, Zou J. From patterns to patients: Advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell 2023;186:1772-91. [Crossref] [PubMed]
Subbiah V. The next generation of evidence-based medicine. Nat Med 2023;29:49-58. [Crossref] [PubMed]
Yan Y, Li Y, Fan C, Zhang Y, Zhang S, Wang Z, Huang T, Ding Z, Hu K, Li L, Ding H. A novel machine learning-based radiomic model for diagnosing high bleeding risk esophageal varices in cirrhotic patients. Hepatol Int 2022;16:423-32. [Crossref] [PubMed]
Reiniš J, Petrenko O, Simbrunner B, Hofer BS, Schepis F, Scoppettuolo M, et al. Assessment of portal hypertension severity using machine learning models in patients with compensated cirrhosis. J Hepatol 2023;78:390-400. [Crossref] [PubMed]
Bhat M, Rabindranath M, Chara BS, Simonetto DA. Artificial intelligence, machine learning, and deep learning in liver transplantation. J Hepatol 2023;78:1216-33. [Crossref] [PubMed]
Azuri I, Wattad A, Peri-Hanania K, Kashti T, Rosen R, Caspi Y, Istaiti M, Wattad M, Applbaum Y, Zimran A, Revel-Vilk S, Eldar YC. A Deep-Learning Approach to Spleen Volume Estimation in Patients with Gaucher Disease. J Clin Med 2023;12:5361. [Crossref] [PubMed]
Humpire-Mamani GE, Bukala J, Scholten ET, Prokop M, van Ginneken B, Jacobs C. Fully Automatic Volume Measurement of the Spleen at CT Using Deep Learning. Radiol Artif Intell 2020;2:e190102. [Crossref] [PubMed]
Yang Y, Tang Y, Gao R, Bao S, Huo Y, McKenna MT, Savona MR, Abramson RG, Landman BA. Validation and estimation of spleen volume via computer-assisted segmentation on clinically acquired CT scans. J Med Imaging (Bellingham) 2021;8:014004. [Crossref] [PubMed]
Sharbatdaran A, Cohen T, Dev H, Sattar U, Bazojoo V, Wang Y, Hu Z, Zhu C, He X, Romano D, Scandura JM, Prince MR. Model-Assisted Spleen Contouring for Assessing Splenomegaly in Myelofibrosis: A Fast and Reproducible Approach to Evaluate Progression and Treatment Response. J Clin Med 2025;14:443. [Crossref] [PubMed]
Luu MH, Mai HS, Pham XL, Le QA, Le QK, Walsum TV, Le NH, Franklin D, Le VH, Moelker A, Chu DT, Trung NL. Quantification of liver-Lung shunt fraction on 3D SPECT/CT images for selective internal radiation therapy of liver cancer using CNN-based segmentations and non-rigid registration. Comput Methods Programs Biomed 2023;233:107453. [Crossref] [PubMed]
Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 2021;18:203-11. [Crossref] [PubMed]
Diakogiannis FI, Waldner F, Caccetta P, Wu C. ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data. ISPRS Journal of Photogrammetry and Remote Sensing 2020;162:94-114.
CubicSpline-SciPy v1.15.1 Manual [cited 2025 Feb 15]. Available online: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.CubicSpline.html
Lundberg SM, Lee SI. A unified approach to interpreting model predictions. The 31st International Conference on Neural Information Processing Systems 2017:4768-77.
Huang Y, Li J, Zheng T, Ji D, Wong YJ, You H, et al. Development and validation of a machine learning-based model for varices screening in compensated cirrhosis (CHESS2001): an international multicenter study. Gastrointest Endosc 2023;97:435-444.e2. [Crossref] [PubMed]
Perez AA, Noe-Kim V, Lubner MG, Graffy PM, Garrett JW, Elton DC, Summers RM, Pickhardt PJ. Deep Learning CT-based Quantitative Visualization Tool for Liver Volume Estimation: Defining Normal and Hepatomegaly. Radiology 2022;302:336-42. [Crossref] [PubMed]
Perez AA, Noe-Kim V, Lubner MG, Somsen D, Garrett JW, Summers RM, Pickhardt PJ. Automated Deep Learning Artificial Intelligence Tool for Spleen Segmentation on CT: Defining Volume-Based Thresholds for Splenomegaly. AJR Am J Roentgenol 2023;221:611-9. [Crossref] [PubMed]
Fateh SM, Mohammed NA, Mahmood KA, Hasan AH, Tahir SH, Kakamad FH, Salih AM, Abdullah HO, Abdalla BA, Mohammed SH, Hassan HA, Hussein DA. Sonographic measurement of splenic size and its correlation with body parameters. Med Int (Lond) 2023;3:7. [Crossref] [PubMed]
Kim DW, Ha J, Lee SS, Kwon JH, Kim NY, Sung YS, Yoon JS, Suk HI, Lee Y, Kang BK. Population-based and Personalized Reference Intervals for Liver and Spleen Volumes in Healthy Individuals and Those with Viral Hepatitis. Radiology 2021;301:339-47. [Crossref] [PubMed]

Cite this article as: Lin J, Yang J, Qian Y, Tang X, Zhu M, Luo W, Lin W, Chen M, Zheng X, Yuan X, Tao H. Development and validation of an interpretable machine learning model for standard spleen volume prediction. Quant Imaging Med Surg 2025;15(6):5160-5176. doi: 10.21037/qims-2024-2954
