Progress in machine learning-assisted medical imaging for osteoarthritis and osteoporosis diagnosis: a narrative review
Review Article


Wuyi Ming1,2#, Ting Liu3#, Rui Hu1, Wenbin He4, Yuan Yang1,5

1Laboratory of Regenerative Medicine in Sports Science, School of Sports Science, South China Normal University, Guangzhou, China; 2Guangdong HUST Industrial Technology Research Institute, Guangdong Provincial Key Laboratory of Digital Manufacturing Equipment, Dongguan, China; 3Guangdong Provincial Key Laboratory of Computer Integrated Manufacturing, School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou, China; 4Mechanical and Electrical Engineering Institute, Zhengzhou University of Light Industry, Zhengzhou, China; 5Bone and Joint Research Team of Degeneration and Injury, Guangdong Provincial Academy of Chinese Medical Sciences, Guangzhou, China

Contributions: (I) Conception and design: W Ming, T Liu, Y Yang; (II) Administrative support: Y Yang; (III) Provision of study materials or patients: R Hu, W He, Y Yang; (IV) Collection and assembly of data: T Liu, R Hu; (V) Data analysis and interpretation: W Ming, T Liu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

Correspondence to: Yuan Yang, PhD. Laboratory of Regenerative Medicine in Sports Science, School of Sports Science, South China Normal University, No. 378, Outer Ring West Road, Panyu District, Guangzhou 510006, China; Bone and Joint Research Team of Degeneration and Injury, Guangdong Provincial Academy of Chinese Medical Sciences, Guangzhou, China. Email: yangyuan@scnu.edu.cn.

Background and Objective: Osteoarthritis (OA) and osteoporosis (OP) are prevalent musculoskeletal disorders with substantial global health and economic burdens. Imaging is central to their diagnosis and monitoring, yet manual interpretation is vulnerable to inter-reader variability and workload-related fatigue. Artificial intelligence (AI), including machine learning (ML) and deep learning (DL), provides data-driven approaches to enhance the accuracy, efficiency, and objectivity of image interpretation. This review summarizes AI-assisted imaging advances for OA and OP over the past decade and discusses translational opportunities and challenges.

Methods: A literature search was conducted in Web of Science, PubMed, and Scopus for English-language studies published between January 2015 and August 2025. Search terms included osteoarthritis, osteoporosis, X-ray, computed tomography (CT), magnetic resonance imaging (MRI), machine learning, deep learning, detection, classification, and diagnosis. Titles and abstracts were screened, and selected full texts were reviewed to summarize advances and diagnostic performance across modalities.

Key Content and Findings: Across X-ray, CT, and MRI, ML/DL approaches enable more objective quantification of OA- and OP-related abnormalities. Using public and cohort-based datasets, studies have evolved from radiomics-based ML pipelines to end-to-end DL frameworks for screening, classification, and grading. For OA, radiographs dominate Kellgren-Lawrence (KL) grading and large-scale screening, complemented by MRI for early tissue biomarkers and CT for quantifying subchondral bone remodeling. For OP, X-ray/CT captures bone texture and trabecular microarchitecture to support detection and classification, with MRI mainly used to assess marrow- and soft-tissue-related markers. Overall, DL typically improves automation and representation learning, while ML remains interpretable and competitive in smaller datasets. Emerging studies suggest that multimodal fusion and longitudinal modeling for progression assessment and prediction may further improve performance.

Conclusions: AI-assisted imaging is reshaping OA and OP assessment by enabling earlier detection and more objective longitudinal monitoring. However, clinical translation is hindered by limited interpretability of many DL models and substantial data heterogeneity. Future research should prioritize standardized multicenter datasets and explainable AI frameworks. Prospective clinical studies and rigorous external validation are needed to bridge the gap between research and practice and to advance personalized musculoskeletal care.

Keywords: Osteoarthritis (OA); osteoporosis (OP); machine learning (ML); deep learning (DL); medical imaging


Submitted Oct 17, 2025. Accepted for publication Jan 20, 2026. Published online Mar 30, 2026.

doi: 10.21037/qims-2025-aw-2168


Introduction

Osteoarthritis (OA) and osteoporosis (OP) represent two of the most prevalent chronic skeletal disorders, imposing substantial clinical and societal burdens, particularly within aging populations (1). OA is defined as a whole-joint disease affecting the cartilage, subchondral bone, synovium, ligaments, and periarticular tissues, ultimately leading to chronic pain and functional impairment (2). In contrast, OP is a systemic skeletal condition characterized by reduced bone mass and deterioration of bone microarchitecture, which significantly elevates fracture risk (3). According to the Global Burden of Disease Study 2019, the global prevalence of OA rose by 113% between 1990 and 2019, reaching 528 million cases (4). During the same period, the incidence of osteoporotic fractures increased by 70%, totaling 436 million cases (5). In the context of global demographic shifts, OA and OP present escalating public health challenges that necessitate precise diagnosis and rigorous longitudinal monitoring.

Medical imaging is pivotal to the clinical evaluation and management of these conditions (6). X-rays remain the primary modality for assessing bone structure and joint space narrowing (JSN) (7), while computed tomography (CT) provides high-resolution visualization of cortical and trabecular bone (8). Magnetic resonance imaging (MRI) offers a comprehensive assessment of soft tissues, including cartilage, synovium, and subchondral bone (9). Collectively, these modalities provide synergistic insights for disease staging and treatment assessment. In clinical practice, OA severity is typically graded using the Kellgren-Lawrence (KL) (10) and Osteoarthritis Research Society International (OARSI) systems (11), whereas OP evaluation relies on radiographic indicators such as increased radiolucency and cortical thinning (12). However, these conventional methods are largely qualitative and inherently subjective, as they depend heavily on clinician expertise. Consequently, they are susceptible to inter- and intra-observer variability and often lack the sensitivity required to detect early-stage structural changes.

To address these limitations, quantitative imaging techniques have been developed to provide reproducible metrics, such as cartilage thickness and bone volume fraction (13). Concurrently, advancements in artificial intelligence (AI), specifically machine learning (ML) and deep learning (DL), have enabled the extraction and analysis of features from X-ray, CT, and MRI data (14). AI-driven imaging can enhance diagnostic performance by enabling automated detection and accurate grading of OA and OP. By capturing subtle structural and textural changes that are not readily appreciable on routine visual assessment, these computational approaches support more objective and scalable musculoskeletal evaluation (15,16). Beyond diagnosis, AI frameworks may further support downstream clinical tasks such as disease progression forecasting and prognostic risk assessment, as illustrated in Figure 1.

Figure 1 AI-driven workflow for imaging-based management of OA and OP across multiple imaging modalities. Imaging inputs from X-ray, CT, or MRI are processed through a structured pipeline of anatomical segmentation and quantitative feature learning to facilitate automated diagnosis and staging. These outputs ultimately inform a clinical management pathway, ranging from early intervention to long-term monitoring. Attribution: X-ray image (Inputs) is adapted from Yamamoto et al., Biomolecules, 2020, CC BY 4.0 (17); CT image (Inputs) is adapted from Pan et al., European Radiology, 2020, CC BY 4.0 (18); MRI image (Inputs) is adapted from Guida et al., Applied Sciences, 2021, CC BY 4.0 (19); Outputs panel (top) is adapted from Nishiyama et al., PLoS One, 2021, CC BY 4.0 (20); Outputs panel (bottom) is adapted from Hirvasniemi et al., European Radiology, 2021, CC BY 4.0 (21). BMD, bone mineral density; CNN, convolutional neural network; CT, computed tomography; KL, Kellgren-Lawrence; ML, machine learning; MRI, magnetic resonance imaging; OA, osteoarthritis; OARSI, Osteoarthritis Research Society International; OP, osteoporosis; ROI, region of interest.

Building upon these technological shifts, this review provides a comprehensive synthesis of the diagnostic progress in ML- and DL-based medical imaging for OA and OP over the past decade. Following a description of our search methodology (Section “Methods”), we first introduce the technical foundations of imaging modalities and AI-assisted analysis (Section “Imaging modalities and AI-assisted analysis for OA and OP”). The core of this review (Section “Imaging diagnosis of OA and OP with the assistance of ML and DL”) delivers an in-depth evaluation of AI-assisted diagnosis for OA and OP, categorized by imaging modality and algorithmic approach. Finally, Section “Discussion” discusses emerging trends, such as multimodal fusion and disease progression prediction, while addressing the practical barriers to clinical translation. We present this article in accordance with the Narrative Review reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-aw-2168/rc).


Methods

A systematic literature search was conducted across the Web of Science, PubMed, and Scopus databases to identify relevant studies published between January 2015 and August 2025. The search strategy, including specific keywords and Boolean operators, is summarized in Table 1. The core search string combined terms related to pathologies (“osteoarthritis”, “osteoporosis”), imaging modalities (“X-ray”, “MRI”, “CT”), computational techniques (“machine learning”, “deep learning”), and clinical tasks (“detection”, “classification”, “diagnosis”). Only peer-reviewed original research articles published in English were included.

Table 1

Search strategy summary

Item Specification
Date of search August 31, 2025
Database searched Web of Science, PubMed, and Scopus
Search terms used (“osteoarthritis” OR “OA” OR “osteoporosis” OR “OP”) AND (“X-ray” OR “computed tomography” OR “CT” OR “magnetic resonance imaging” OR “MRI”) AND (“machine learning” OR “ML” OR “deep learning” OR “DL”) AND (“detection” OR “classification” OR “diagnosis”)
Timeframe 2015.01–2025.08
Inclusion criteria and exclusion criteria Inclusion: (I) original research utilizing ML or DL for medical imaging (X-ray, CT, or MRI) in OA or OP diagnosis; (II) peer-reviewed articles; (III) published in English
Exclusion: (I) studies without medical imaging or ML/DL integration; (II) non-diagnostic applications; (III) non-peer-reviewed materials (e.g., conference abstracts, book chapters, editorials)
Selection process Literature selection was completed by W.M. and T.L.

CT, computed tomography; DL, deep learning; ML, machine learning; MRI, magnetic resonance imaging; OA, osteoarthritis; OP, osteoporosis.

The literature selection followed a multi-step process. Initially, 2,070 records were retrieved. After removing duplicates, 1,853 titles and abstracts were screened. Studies were excluded based on the following criteria: (I) non-peer-reviewed publications (e.g., conference abstracts, book chapters, or editorials); (II) studies not focused on the clinical diagnosis of OA or OP; and (III) studies lacking the integration of ML or DL algorithms with medical imaging. After this initial screening, 354 full-text articles were assessed for eligibility. Following full-text screening, 68 studies were excluded for the following reasons: (I) the studies used imaging modalities other than X-ray, MRI, or CT, which were outside the scope of this review; (II) the studies did not clearly specify the ML or DL algorithms used, or the methods were unrelated to diagnostic tasks; (III) the full text was unavailable, preventing further assessment.

Ultimately, 286 studies met the inclusion criteria and were included for systematic categorization and synthesis. To facilitate a structured analysis, these studies were organized into a three-level hierarchical taxonomy. First, the literature was primarily categorized by target pathology (OA and OP). Within each pathology, studies were grouped according to their computational framework (ML vs. DL) to reflect the technological evolution in the field. Finally, research was further analyzed based on imaging modality (e.g., X-ray, CT, and MRI). This systematic organization provides the structural foundation for the comparative synthesis of AI-driven diagnostic performance presented in the subsequent sections.


Imaging modalities and AI-assisted analysis for OA and OP

High-fidelity medical imaging provides the basis for OA and OP assessment, and recent advances in imaging assessment and consensus guidelines have reaffirmed its central role in the evaluation of both conditions (22). As imaging datasets grow in scale and complexity, AI has increasingly been adopted to support diagnostic efficiency and consistency (23). Both ML and DL are widely used paradigms for AI-assisted imaging, and the choice between them depends on the study objective, data availability, and clinical constraints. To provide a structured overview, Table 2 summarizes representative studies selected based on a combination of methodological innovation, study influence (e.g., citation impact), and clinical relevance. These works are categorized by AI methodology (ML vs. DL) and imaging modality (e.g., X-ray, CT, and MRI) across three primary clinical tasks: anatomical segmentation, disease diagnosis, and progression prediction, which represent the essential stages of the clinical diagnostic and management pathway for OA and OP.

Table 2

Representative AI studies in OA and OP: segmentation, diagnosis, and progression prediction

Disease Reference Imaging modality Clinical task AI methodology Dataset Performance Contribution
OA (24) Knee X-ray Segmentation ResNet 570 images DSC: 0.964 (femur), 0.942 (tibia) Applied the Taguchi method to optimize knee X-ray segmentation
(25) Knee X-ray Diagnosis Hybrid Transformers OAI (8,260 images) Accuracy: 97.03%; Cohen’s kappa: 0.98 Integrated multiple Transformer models with Explainable AI for interpretable knee OA grading
(26) Knee X-ray Early detection LR OAI (1,024 images) Accuracy: 82.98%, sensitivity: 87.15%, specificity: 80.65% Used pixel-intensity features only and validated on a larger multicenter dataset
(27) Knee X-ray 8-year longitudinal progression LASSO regression OAI (1,243 high-risk subjects) AUROC: 0.86 (radiographic), 0.95 (pain) Established a high-accuracy 8-year prognostic tool using multimodal data
(28) MRI Segmentation SSMs + 2D/3D CNNs OAI (595 subjects) & SKI10 OAI: DSC: 98.5–98.6% (bone), 85.6–89.9% (cartilage); SKI10 Score: 75.73 Combined anatomical shape models with CNNs to achieve human-level accuracy
(29) MRI Diagnosis DenseNet OAI (4,384 subjects; baseline T2 maps) AUROC: 83.44%, sensitivity: 76.99%, specificity: 77.94% Validated that deep feature learning from raw T2 maps outperforms traditional voxel-averaging methods
(30) MRI Progression prediction (JSN and/or pain progression) XGBoost FNIH (594 participants) AUROC: 0.880 (JSN + pain), 0.913 (JSN), 0.886 (pain), 0.909 (non-progression) Longitudinal MRI radiomic features of load-bearing knee tissues are informative for predicting knee OA progression
(31) CT Segmentation SSMs + CNNs 85 patients DSC: 84.53% An adaptive SSM-CNN fusion framework with voxel-wise refinement was developed for anatomically consistent patella segmentation
(32) CT Segmentation & diagnosis Cascaded 3D CEL-U-Net (segmentation) & 3D Arthro-Net (multi-task classification) 571 CT scans DSC: 0.98–0.99 (bone); staging accuracy: 89–91%; total inference time: 14.8 s Automated cascaded pipeline for concurrent bone segmentation and multi-task OA staging
(33) CT Automated hip OA severity grading Vision Transformer 197 hip OA patients Accuracy: 0.95 Developed an automated model for Crowe/KL hip OA grading (disease progression)
OP (34) Lumbar spine X-ray Segmentation M-Net 160 images Mean DSC: 91.60%±2.22% Accurate lumbar vertebra identification (pose-driven DL) and fine vertebra segmentation (M-Net + level-set refinement)
(35) Lumbar spine X-ray 3-class classification (normal/osteopenia/OP) DCNN 1,616 images from 808 postmenopausal women AUROC: 0.726–0.767 (OP), 0.787–0.810 (osteopenia) DXA-referenced DL screening for OP/osteopenia from lumbar radiographs
(17) Hip X-ray Classification EfficientNet-B3 1,131 images + clinical covariates AUROC: 0.9374; accuracy: 0.8805 Integrated routine clinical variables with the X-ray model to improve predictive accuracy over imaging alone
(36) MRI Proximal femur segmentation CNN 86 subjects DSC: 0.95±0.02; precision: 0.95±0.02; recall: 0.95±0.03 Automatic proximal femur segmentation using deep CNNs with high accuracy, enabling clinical use of MRI-based bone quality measurements
(37) MRI Diagnosis of fresh osteoporotic VFs Ensemble CNNs (VGG16 + VGG19 + DenseNet201 + ResNet50) 1,624 slices of T1-weighted MRI AUROC: 0.949 Performance comparable to that of spine surgeons in detecting fresh osteoporotic VFs
(38) Quantitative MRI Fracture risk prediction RUS-boosted trees, LR, linear discriminant 92 women (32 with prior fragility fractures; 60 controls) RUS-boosted trees: F1: 0.63±0.03 MRI microstructure and FRAX add independent value; head/trochanter regions most informative
(39) CT Segmentation and opportunistic screening U-Net (segmentation), DenseNet-121 (BMD calculation) 1,449 patients DSC: 0.782–0.823; r>0.98 DL-based method for fully automatic identification of OP, osteopenia, and normal BMD
(18) CT Segmentation and opportunistic screening U-Net and 3D-CNN 200 images DSC: 86.6%; AUROC: 0.927 (OP), 0.942 (low BMD) First DL-based automation of BMD measurement and OP detection on lung cancer screening low-dose chest CT, validated against QCT
(40) CT Vertebral insufficiency fracture risk prediction SVM 58 patients with insufficiency fractures of the spine AUROC: 0.97 Application of bone texture analysis combined with ML improved fracture risk prediction

r, Pearson correlation coefficient; 3D, three-dimensional; AI, artificial intelligence; AUROC, area under the receiver operating characteristic curve; BMD, bone mineral density; CNN, convolutional neural network; CT, computed tomography; DCNN, deep convolutional neural network; DL, deep learning; DSC, Dice similarity coefficient; DXA, dual-energy X-ray absorptiometry; FNIH, Foundation of the National Institutes of Health; JSN, joint space narrowing; KL, Kellgren-Lawrence; LASSO, least absolute shrinkage and selection operator; LR, linear regression; MRI, magnetic resonance imaging; OA, osteoarthritis; OAI, Osteoarthritis Initiative; OP, osteoporosis; QCT, quantitative computed tomography; SKI10, Segmentation of Knee Images 2010; SSM, statistical shape model; SVM, support vector machine; VF, vertebral fracture.

Imaging techniques for OA and OP

Precise medical imaging is indispensable for the diagnostic evaluation and clinical management of OA and OP. Various modalities are employed, each characterized by distinct advantages and constraints regarding spatial resolution, dimensionality, and tissue contrast. Currently, conventional radiography, CT, and MRI serve as the cornerstone diagnostic instruments, yielding synergistic insights into musculoskeletal integrity (41).

X-ray imaging is widely used due to its cost-effectiveness and accessibility. It is effective for detecting fractures, osteophytes, and JSN, thereby supporting the assessment of OA. Although dual-energy X-ray absorptiometry (DXA) is less relevant for OA evaluation, it remains the gold standard for measuring bone mineral density (BMD) and is a cornerstone of the diagnosis of OP (42). However, both techniques primarily produce two-dimensional (2D) images and offer limited visualization of soft tissues, which often necessitates supplementary imaging via CT or MRI.

CT provides higher-resolution imaging of cortical bone and complex structures compared to conventional X-rays, making it particularly valuable for evaluating intra-articular fractures and subtle bony changes. Its capacity for multiplanar reconstruction facilitates a comprehensive structural assessment from multiple angles. In OA, CT is primarily used to quantify subchondral bone parameters and osteophyte volume (43); in OP, it helps mitigate the artifacts associated with DXA, thereby improving diagnostic specificity (44). Nevertheless, the relatively high radiation dose associated with CT limits its suitability for frequent longitudinal monitoring, and its soft tissue contrast remains inferior to that of MRI.

MRI holds a central position in musculoskeletal imaging owing to its superior soft tissue contrast, rendering it indispensable for the evaluation of cartilage, ligaments, tendons, and joint fluid. Unlike CT, MRI produces three-dimensional (3D) images without employing ionizing radiation, and it is widely utilized in assessing soft tissue lesions associated with both OA and OP (45). By leveraging relaxation times (T1 and T2), MRI can detect early tissue fluid changes and reveal periarticular abnormalities. Despite these advantages, its routine application is constrained by high costs, lengthy acquisition times, and susceptibility to artifacts from metallic implants.

Advances in medical imaging have substantially enhanced the diagnosis and management of musculoskeletal disorders. Clinicians now employ a range of modalities, each with distinct strengths: X-ray, CT, and MRI serve as primary techniques for visualizing bone and soft tissues, while ultrasound (US) is occasionally used to examine superficial structures such as tendons and ligaments (46). US offers real-time, portable, and radiation-free imaging; however, its reliability is limited by operator dependence and inadequate visualization of deep bone structures (47). The availability of high-quality multimodal imaging data has further facilitated the application of AI techniques, including ML and DL, enabling the identification of complex patterns and improving diagnostic accuracy. These developments contribute significantly to the detection, characterization, and monitoring of OA and OP.

AI-assisted image analysis

Building upon advanced imaging techniques, AI, particularly ML and DL, has been increasingly employed to enhance the diagnosis of OA and OP. By extracting complex patterns and quantitative features from medical images, AI complements clinical assessment and facilitates objective, reproducible evaluation.

In orthopedics, ML has demonstrated significant potential to improve diagnostic accuracy (48,49). A typical ML workflow involves image preprocessing, segmentation of regions of interest (ROIs), and feature extraction, such as cartilage thickness and joint space width (JSW) for OA, or BMD and trabecular microstructure for OP (50). Dimensionality reduction techniques, including least absolute shrinkage and selection operator (LASSO) regression, are frequently applied prior to classification using support vector machines (SVM), random forests (RF), or k-nearest neighbors (KNN). Model performance is commonly evaluated using metrics such as accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC). For example, one study reported that an RF model utilizing shape and texture features achieved an AUROC of 0.849 for OA classification (51).
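To make this workflow concrete, the following sketch chains feature standardization, an L1-penalized (LASSO-style) feature selector, and an SVM classifier evaluated by AUROC. It is a minimal illustration on synthetic features (standing in for measurements such as JSW or texture descriptors), not a reproduction of any cited pipeline.

```python
# Illustrative radiomics-style ML pipeline on synthetic data:
# standardization -> LASSO-style feature selection -> SVM -> AUROC.
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-in for extracted imaging features; only the first two carry signal.
X = rng.normal(size=(200, 50))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # L1-penalized linear model acts as a LASSO-style feature selector.
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5))),
    ("clf", SVC(kernel="rbf", probability=True, random_state=0)),
])
pipeline.fit(X_train, y_train)
auroc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])
print(f"AUROC: {auroc:.3f}")
```

The same pipeline object can be cross-validated or refit on other feature tables; swapping `SVC` for `RandomForestClassifier` or `KNeighborsClassifier` reproduces the other classifiers mentioned above.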

DL, and particularly convolutional neural networks (CNNs), enable end-to-end learning directly from raw image data, thereby eliminating the need for manual feature engineering. Although preprocessing steps such as intensity normalization and data augmentation remain essential, CNNs automatically learn hierarchical feature representations. Advanced architectures, including generative adversarial networks (GANs) and U-Net, have further improved segmentation performance. In segmentation tasks, model evaluation typically relies on overlap-based metrics, such as the Dice similarity coefficient (DSC) and the Jaccard index [Intersection over Union (IoU)], which quantify the spatial agreement between predicted and ground-truth regions. For instance, a GAN-based approach demonstrated an average DSC of 0.88 for cartilage segmentation (52). In contrast, the evaluation of grading tasks relies on standard classification metrics, while interpretability is often enhanced through techniques such as gradient-weighted class activation mapping (Grad-CAM) or saliency maps. For instance, a CNN-based approach achieved an AUROC of 0.93 in knee OA grading, with attention maps visually highlighting diagnostically relevant regions (53). Furthermore, transfer learning using pre-trained models (e.g., on ImageNet) has also been shown to enhance training efficiency and model generalizability (54).
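The overlap metrics used for segmentation evaluation are straightforward to compute from binary masks. The sketch below, on toy masks standing in for a cartilage segmentation, shows the DSC and Jaccard index (IoU) and the identity IoU = DSC / (2 − DSC) that links them.

```python
# Overlap metrics for segmentation evaluation: Dice similarity coefficient
# (DSC) and Jaccard index (IoU) between predicted and ground-truth masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

def jaccard_index(pred: np.ndarray, truth: np.ndarray) -> float:
    """IoU = |A ∩ B| / |A ∪ B|; equivalently DSC / (2 - DSC)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    return np.logical_and(pred, truth).sum() / union if union else 1.0

# Toy 2D masks: ground truth is a 4x4 square; prediction is shifted one row.
truth = np.zeros((8, 8), dtype=bool); truth[2:6, 2:6] = True  # 16 pixels
pred = np.zeros((8, 8), dtype=bool);  pred[3:7, 2:6] = True   # 16 pixels
print(dice_coefficient(pred, truth))  # 2*12 / (16+16) = 0.75
print(jaccard_index(pred, truth))     # 12 / 20 = 0.6
```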

As illustrated in Figure 2, the distribution of ML and DL algorithms in the clinical diagnosis of OA and OP reveals a predominant shift toward DL architectures. Specifically, DL models account for 80.5% and 70.3% of the studies in OA and OP, respectively, underscoring the transition from traditional, manually intensive feature engineering to automated, end-to-end hierarchical representation learning. In OA research, CNNs and their variants remain the primary architecture due to their efficacy in identifying structural abnormalities, particularly in high-dimensional imaging data. Conversely, ML approaches, comprising 19.5% of OA studies, primarily utilize RF and SVM to process predefined morphological parameters. This reflects the ongoing utility of traditional ML models in domains where feature extraction is relatively straightforward and interpretable. Additionally, ML methods (e.g., XGBoost and KNN) maintain a more substantial footprint in OP studies (29.7%) compared to OA. This may be attributed to the continued efficacy of traditional ML models in processing the structured, feature-level diagnostic parameters frequently utilized in OP clinical evaluations. In OP, these models are particularly adept at handling structured tabular data such as BMD and clinical risk factors, which have been the standard for diagnostic evaluation in clinical settings for years.

Figure 2 Distribution of ML and DL algorithms in imaging-based diagnostic studies of OA and OP. The donut charts show the overall proportion and number of studies utilizing DL versus ML for each condition. Horizontal bar charts display the frequency of specific DL architectures (left) and ML algorithms (right) reported in the literature. The bottom panels present the yearly publication trends from 2015 to 2025, stratified by methodology. “Others” aggregates algorithms that were infrequently reported. ANN, artificial neural network; DCNN, deep convolutional neural network; DL, deep learning; DNN, deep neural network; DT, decision tree; ELM, extreme learning machine; KNN, K-nearest neighbor; LDA, linear discriminant analysis; LR, linear regression; ML, machine learning; MLP, multilayer perceptron; OA, osteoarthritis; OP, osteoporosis; SVM, support vector machine; YOLO, you only look once.

The temporal trends (Figure 2, bottom) further elucidate a clear paradigm shift in the diagnostic landscape. The period between 2015 and 2020 was an exploratory phase, with publication volumes remaining relatively low and a notable reliance on ML for diagnostic tasks. This phase was characterized by the scarce availability of large-scale annotated datasets and limited computational power for training complex models. However, a technological inflection point occurred in 2021, marking the beginning of an acceleration phase. Several factors contributed to this shift: advancements in GPU technology, the rise of open-source DL frameworks (such as TensorFlow and PyTorch), and the increased availability of large-scale annotated diagnostic datasets. Between 2021 and 2025, the number of DL-based diagnostic studies surged, reflecting the maturation of DL frameworks and the growing support for high-performance computing resources. While OA diagnostics pioneered the early application of AI, OP diagnostics exhibited a sharper acceleration after 2022, rapidly achieving methodological convergence with DL dominance. This surge in OP research may be attributed to several factors, including the increased adoption of DL in imaging tasks and the availability of clinical datasets containing high-resolution images and relevant diagnostic parameters. These advancements have helped DL-based methods surpass traditional ML approaches, particularly in handling complex imaging data such as bone texture and trabecular microstructure. Overall, this evolving distribution suggests that while DL has become the standard for high-dimensional diagnostic tasks, ML continues to offer robust and interpretable solutions for feature-based diagnostic workflows.


Imaging diagnosis of OA and OP with the assistance of ML and DL

OA and OP imaging datasets

The successful implementation of ML and DL algorithms in orthopedic imaging hinges on the availability of high-quality, large-scale datasets. As data-driven paradigms, these models often rely more heavily on the quality and diversity of training data than on algorithmic complexity. In particular, DL architectures require substantial data volumes to optimize network parameters through iterative training and to extract generalizable feature representations. Conventionally, datasets are partitioned into training, validation, and testing subsets to facilitate robust model development, hyperparameter tuning, and final performance evaluation.
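The conventional three-way partition described above can be sketched as follows. This is a generic illustration (a 70/15/15 split with a fixed random seed, both arbitrary choices), not the protocol of any specific study; in practice, splits for patient data should also be stratified so that all images from one subject fall into a single subset.

```python
# Sketch of the conventional train/validation/test partition used for model
# development, hyperparameter tuning, and final performance evaluation.
import numpy as np

def split_indices(n_samples: int, val_frac: float = 0.15,
                  test_frac: float = 0.15, seed: int = 42):
    """Shuffle sample indices and partition them into three disjoint subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test = order[:n_test]                 # held out for final evaluation
    val = order[n_test:n_test + n_val]    # used for hyperparameter tuning
    train = order[n_test + n_val:]        # used to fit model parameters
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 700 150 150
```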

Publicly available datasets provide a standardized platform for model comparison and reproducibility; however, such resources remain relatively scarce in OA research. Established repositories include the Osteoarthritis Initiative (OAI) (55), the Multicenter Osteoarthritis Study (MOST) (56), and other resources such as the Cohort Hip & Cohort Knee (CHECK), Segmentation of Knee Images 2010 (SKI10), Musculoskeletal Radiographs (MURA), MRNet, and the Digital Knee X-ray Dataset. Notably, semi-quantitative methods such as Knee Images Digital Analysis (KIDA) have been developed to extract quantitative features from knee X-rays, including cartilage thickness, osteophyte area, and subchondral bone density (57). However, KIDA relies on specific imaging protocols, such as the standardized semi-flexed knee view employed in CHECK, and may not be directly applicable to X-rays acquired under different conditions, such as those within the OAI database.

In contrast to the structured repositories for OA, OP imaging research frequently relies on institutional or independently curated datasets. The CTSpine1K dataset, for instance, offers spine CT images reformatted into the NIfTI format to facilitate standardized processing and de-identification. The European Prospective Osteoporosis Study (EPOS) enrolled participants aged 50 years and older across 29 centers to assess vertebral fracture (VF) incidence via lateral spine X-rays. Similarly, the Study of Osteoporotic Fractures (SOF) and the Osteoporotic Fractures in Men Study (MrOS) offer extensive longitudinal data, including DXA scans, quantitative computed tomography (QCT) images, and biochemical markers. Table 3 summarizes the key characteristics of these public datasets.

Table 3

Information on public OA and OP datasets

| Disease | Dataset | Website | Object | Imaging method | Number of subjects | Subject age | Number of images |
|---|---|---|---|---|---|---|---|
| OA | OAI | https://nda.nih.gov/oai/ | Knee | X-ray, MRI | 4,796 | 45–79 years | 26,626,000 |
| OA | MOST | http://most.ucsf.edu | Knee | X-ray, MRI, DXA | 3,026 (existing cohort); 1,500 (new cohort) | 50–79 years | |
| OA | MRNet | https://stanfordmlgroup.github.io/competitions/mrnet | Knee | MRI | 1,312 | Mean age 38.0 years | 1,370 |
| OA | CHECK | http://www.check-research.com | Hip, knee | X-ray | 1,002 | 45–65 years | |
| OA | Digital Knee X-ray Images | https://data.mendeley.com/datasets/t9ndx37v5h/1 | Knee | X-ray | | | 1,650 |
| OA | SKI10 | https://ski10.grand-challenge.org/Home/ | Knee | MRI | | | 100 |
| OA | MURA | https://stanfordmlgroup.github.io/competitions/mura | Musculoskeletal (fingers, elbows, forearms, hands, humerus, shoulders, and wrists) | X-ray | 12,173 | | 40,561 |
| OP | CTSpine1K | https://github.com/ICT-MIRACLE-lab/CTSpine1K | Spine | CT | 1,005 | | Over 500,000 labeled slices; over 11,000 vertebrae |
| OP | SOF | https://sleepdata.org/datasets/sof | Hip | DXA | 9,703 | 65 years or older | |
| OP | MrOS | https://agingresearchbiobank.nia.nih.gov/studies/mros | Hip, vertebrae | DXA, QCT | 10,994 | Mean age 73±6 years | |
| OP | EPOS | | Hip, vertebrae | DXA | 17,000 | 50–79 years | |

CHECK, Cohort Hip & Cohort Knee; CT, computed tomography; DXA, dual-energy X-ray absorptiometry; EPOS, European Prospective Osteoporosis Study; MOST, Multicenter Osteoarthritis Study; MRI, magnetic resonance imaging; MrOS, Osteoporotic Fractures in Men Study; MURA, Musculoskeletal Radiographs; OA, osteoarthritis; OAI, Osteoarthritis Initiative; OP, osteoporosis; QCT, quantitative computed tomography; SKI10, Segmentation of Knee Images 2010; SOF, Study of Osteoporotic Fractures.

Despite these efforts, the lack of diverse, high-quality public datasets remains a fundamental bottleneck. Limited dataset size not only restricts feature learning but also exacerbates two critical challenges. First, inadequate data diversity hinders model generalization, often leading to performance degradation when algorithms are applied to external cohorts or varying imaging protocols. Second, small or imbalanced datasets are prone to systematic biases, where models may overfit to specific demographics or acquisition conditions, thereby compromising their clinical utility. These challenges underscore an urgent need for multicenter collaborative initiatives to develop standardized, large-scale imaging repositories, which are essential for advancing robust and unbiased AI solutions in orthopedics.

Assisted medical imaging diagnosis for OA

ML for OA imaging diagnosis

ML has become increasingly instrumental in OA imaging, enabling early diagnosis and objective severity assessment (58). Following image preprocessing and ROI segmentation, multidimensional features are extracted, comprising morphological parameters (e.g., cartilage thickness and JSW), texture attributes (e.g., grey-level co-occurrence matrix (GLCM), local binary patterns (LBP), and Haralick descriptors), and intensity-based metrics. These features facilitate two primary classification paradigms: pathology-driven analysis, which quantifies structural alterations such as cartilage degeneration and bone remodeling; and severity grading, which aligns with clinical standards such as the KL or OARSI criteria. Evaluated via robust cross-validation, the implementation of these ML-driven workflows enhances diagnostic precision and reproducibility, serving as a pivotal adjunct to radiologists in clinical decision-making.
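The feature-extraction-plus-classifier workflow above can be sketched end to end. The example below computes a toy grey-level co-occurrence matrix (contrast and homogeneity only, implemented from scratch rather than via a radiomics library) on simulated "smooth" versus "textured" patches, and evaluates an RF classifier with five-fold cross-validation; all data and labels are synthetic assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def glcm_features(img, levels=8):
    """Quantize a [0, 1) image and compute two simple GLCM texture
    descriptors (horizontal neighbour co-occurrence): contrast and
    homogeneity, plus first-order intensity statistics."""
    q = np.minimum((img * levels).astype(int), levels - 1)
    glcm = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = float(np.sum(glcm * (i - j) ** 2))
    homogeneity = float(np.sum(glcm / (1.0 + np.abs(i - j))))
    return [contrast, homogeneity, float(img.mean()), float(img.std())]

rng = np.random.default_rng(0)
# Simulated ROIs: low-contrast "smooth" patches (label 0) vs. full-range
# "textured" patches (label 1), standing in for healthy vs. degenerated tissue.
X, y = [], []
for label in (0, 1):
    for _ in range(40):
        base = rng.random((32, 32))
        img = base if label == 1 else base * 0.2 + 0.4
        X.append(glcm_features(img))
        y.append(label)

scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(round(scores.mean(), 3))
```

Production radiomics pipelines would replace the handwritten GLCM with a validated library and add LBP, Haralick, and morphological descriptors, but the segmentation → features → classifier → cross-validation skeleton is the same.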

X-ray

Most X-ray-based ML studies have focused on automating radiographic severity assessment, typically using the KL grading system or related ordinal scales. A common pipeline involves joint localization or ROI extraction, followed by the computation of handcrafted features. Unlike MRI or CT, which may employ volumetric or biochemical markers, X-ray-based approaches focus on 2D structural and texture descriptors of bones and joint spaces. These features are then used as input to ML classifiers, including RF (58,59), decision tree (DT) (60), Bayes (61), XGBoost (62), and SVM (63) for OA classification. For example, one study used Grad-CAM-guided localization and cropping of the knee joint ROI, extracted handcrafted morphological, texture, and statistical features from the segmented region, and fed them into an XGBoost model to perform five-class KL (0–4) grading, achieving a classification accuracy of 99.46%, as illustrated in Figure 3A (64). Another study developed a clinically inspired hierarchical KL grading framework, in which U-Net-based segmentation of joint spaces and osteophytes provided geometric and radiomic features for ML classifiers, achieving up to 98.5% accuracy in distinguishing KL 0–2 from KL 3–4 (66). In addition, several studies have employed regression-based normalization and independent component analysis to mitigate data bias and reduce feature dimensionality before model training (26). More recently, hybrid ML-DL schemes have been proposed, in which CNNs are used as automatic feature extractors and the resulting deep features, alone or combined with handcrafted descriptors, are passed to traditional ML classifiers (67-69).

Figure 3 Methodological frameworks and diagnostic performance of image-based ML models for OA assessment. (A) Methodological workflow and accuracy comparison of five ML models across six feature sets. Adapted from Fatema et al., Heliyon, 2023, CC BY 4.0 (64). (B) PCI-CT-based visualization, SIM feature extraction, and classification performance across different feature dimensionality reduction strategies. Adapted from Nagarajan et al., PLoS One, 2015, CC BY 4.0 (65). (C) Illustration of tibial VOIs and diagnostic performance curves for models integrating image features and clinical covariates. Adapted from Hirvasniemi et al., European Radiology, 2021, CC BY 4.0 (21). DT, decision tree; LR, linear regression; ML, machine learning; OA, osteoarthritis; PCI, phase contrast imaging; RF, random forest; SIM, scaling index method; SVR, support vector regression; VOIs, volumes of interest.
CT

Although CT is less conventional than other modalities in the routine assessment of OA, it provides superior precision in detecting cortical bone alterations, such as osteophyte formation, subchondral cysts, sclerosis, and soft tissue calcifications (70). The integration of phase contrast imaging (PCI) with CT extends this capability to the high-resolution visualization of the knee cartilage matrix (71). Building upon this, advanced computational frameworks have been developed to achieve objective pathological quantification. For instance, geometrical features derived from the scaling index method (SIM) were extracted from PCI-CT volumes of interest (VOIs) (65). To enhance the feature space, a hybrid approach of linear/nonlinear dimensionality reduction and mutual information-based selection was employed, with the refined features subsequently classified using a support vector regression (SVR) model. This methodology demonstrated exceptional performance, yielding an AUROC of 0.96 with a nine-dimensional feature set and maintaining high accuracy (AUROC =0.97) even when reduced to two dimensions, as illustrated in Figure 3B. Further analysis revealed that SIM-derived geometrical features (AUROC =0.90±0.09) significantly outperformed Minkowski Functionals (AUROC range, 0.54–0.78), particularly exceeding the performance of traditional metrics such as volume and surface area (72). These results indicate that SIM-derived features can effectively and automatically characterize chondrocyte organization in the cartilage matrix, providing higher accuracy in classifying healthy and osteoarthritic cartilage.
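The dimensionality-reduction-plus-SVR pipeline can be approximated with generic scikit-learn components: mutual-information-based selection of a low-dimensional subset, followed by support vector regression on binary labels, with the continuous SVR output scored by AUROC. The data below are synthetic and the components are stand-ins for the SIM/PCI-CT pipeline described above, not a reimplementation of it:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
# Synthetic geometrical feature vectors: 200 VOIs x 50 features; the first
# three features carry the class signal (hypothetical data).
n, p = 200, 50
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :3] += y[:, None] * 1.5  # inject signal into three features

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

# Mutual-information-based selection down to a five-dimensional subset,
# then SVR fitted on the binary labels; its continuous output serves as a
# decision score for AUROC computation.
selector = SelectKBest(mutual_info_classif, k=5).fit(X_tr, y_tr)
svr = SVR(kernel="rbf").fit(selector.transform(X_tr), y_tr)
preds = svr.predict(selector.transform(X_te))
print(round(roc_auc_score(y_te, preds), 3))
```

The key point mirrored from the study is that a well-chosen low-dimensional subset can retain nearly all discriminative information of the original feature space.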

MRI

Early MRI-based detection and classification of OA initially relied on single quantitative parameter measurements. However, the diagnostic accuracy of these univariate methods was typically limited to approximately 60% due to the significant overlap in parameter values across different degenerative stages (73). To overcome these limitations, subsequent research transitioned toward utilizing image-derived content descriptors. For instance, combining MRI intensity histograms, GLCM, and gray-level run-length matrix (GLRLM) features with an SVM classifier yielded a diagnostic accuracy of 71% for OA classification on the OAI dataset (74). The field subsequently moved toward high-dimensional feature spaces. Notably, the weighted neighbor distance using the compound hierarchy of algorithms representing morphology (WND-CHRM) framework achieved an accuracy of 86% in classifying normal and OA osteochondral plugs by extracting over 2,900 texture and morphological features (75). Building on these advancements, recent studies have increasingly adopted standardized radiomics workflows. By integrating predefined families of first-order, shape, and texture features with ML classifiers, such as multilayer perceptron (MLP) (76), SVM (77), and linear regression (LR) (78), diagnostic accuracy and AUROC values have surpassed 90% for knee OA. Furthermore, related research has not only validated the efficacy of MRI-based subregional texture analysis models in OA severity grading (79) but has also expanded the diagnostic scope from cartilage to peri-articular tissues, including bone marrow edema (80) and the quadriceps fat pad (81). Notably, a combined model integrating radiomic features from the proximal tibia VOI with clinical covariates demonstrated robust diagnostic performance (AUROC =0.80), as illustrated in Figure 3C (21).

Collectively, the selection of imaging modalities in ML-assisted OA assessment dictates the underlying analytical paradigms. X-ray imaging, primarily limited to 2D bone morphology and JSN, is frequently coupled with KL grading for large-scale screening and initial severity staging. Conversely, MRI and CT facilitate a shift toward pathology-oriented classification through high-dimensional soft-tissue features and 3D structural quantification. This synergy between modality physics and handcrafted feature engineering ensures that ML remains a robust, interpretable adjunct for both routine screening and deep phenotyping in clinical OA diagnostics.

DL for OA imaging diagnosis

DL represents a data-driven paradigm that autonomously extracts hierarchical features from large-scale datasets. By circumventing the need for manual feature engineering, DL models achieve superior generalization on unseen data (82). Among these architectures, CNNs, characterized by local connectivity and weight-sharing mechanisms, have emerged as a cornerstone of medical image analysis. Through end-to-end learning of hierarchical representations, CNNs demonstrate remarkable robustness against variability in imaging conditions, ensuring consistent performance across diverse clinical settings (83-85).

X-ray

In the diagnosis of OA based on X-ray images, CNN architectures such as ResNet (86), VGG (87,88), Faster R-CNN (89), DenseNet (90), and InceptionV3 (91) are typically pre-trained on ImageNet. These models are subsequently fine-tuned on public or custom datasets, where model weights are updated using the backpropagation algorithm. Simultaneously, hyperparameters such as learning rate and dropout are optimized through empirical tuning or search strategies (e.g., grid search or Bayesian optimization) to ensure robust model generalization. The refined models are then applied to grade OA severity, as illustrated in Figure 4A (92). Given that conventional grading systems provide a holistic clinical assessment rather than isolated features, researchers have employed CNNs to predict probability distributions across image grades to reduce inter-observer ambiguity. This paradigm generates saliency or attention maps that highlight clinically relevant radiographic features, thereby facilitating more objective radiographic staging (53). While well-curated knee joint datasets have led to a predominance of knee-focused research, some research has expanded into hip OA diagnosis (94,95). Notably, the first application of DL to hip OA in 2017 utilized a VGG-16 model initialized with ImageNet weights; when fine-tuned on 420 images, it achieved a sensitivity, specificity, and overall accuracy of 95.0%, 90.7%, and 92.8%, respectively (96). Subsequent studies have integrated ROI extraction and contrast-limited adaptive histogram equalization (CLAHE) with DenseNet-169 architectures, attaining a remarkable 98.7% accuracy across 750 hip X-rays categorized by OA severity (97). More recently, ensemble models combining multiple CNN architectures have been developed to leverage complementary feature representations, further enhancing diagnostic reliability (98).
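The hyperparameter-search step can be illustrated independently of any DL framework. The sketch below applies an exhaustive grid search, with cross-validation, to a small neural network classifier on synthetic "embedding" features; the parameter grid, data, and labels are illustrative assumptions rather than settings from the cited studies:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
# Synthetic features standing in for CNN embeddings of knee radiographs,
# with a simple separable label rule (hypothetical data).
X = rng.normal(size=(300, 32))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Exhaustive search over learning rate and L2 regularization strength,
# mirroring the empirical tuning step described above.
grid = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    param_grid={"learning_rate_init": [1e-3, 1e-2], "alpha": [1e-4, 1e-2]},
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

In a real fine-tuning workflow the estimator would be the pre-trained CNN itself and the grid would typically also cover dropout rate and batch size; Bayesian optimization replaces the exhaustive grid when the search space grows.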

Figure 4 Methodological frameworks and diagnostic performance of image-based DL models for OA assessment. (A) Fine-tuned DenseNet201 architecture, representative knee X-ray image with ROI annotation, and performance comparison with existing approaches. Adapted from Abdullah et al., Scientific Reports, 2025, CC BY 4.0 (92). (B) Workflow for the generation of DRRs from CT images, classification performance metrics, and visualization of the learned feature maps. Adapted from Gebre et al., Osteoporosis International, 2022, CC BY-NC 4.0 (93). (C) Overview of the proposed framework, representative 2D slice from a 3D knee MRI sequence, and AUROC for OA/non-OA classification. Adapted from Guida et al., Applied Sciences, 2021, CC BY 4.0 (19). AUROC, area under the receiver operating characteristic curve; CT, computed tomography; DL, deep learning; DRRs, digitally reconstructed radiographs; MRI, magnetic resonance imaging; OA, osteoarthritis; ROI, region of interest.
CT

CT provides 3D information about joint bones, although its soft-tissue contrast is lower than that of MRI. Consequently, the KL grading system, originally designed for 2D X-ray images, is not directly applicable to CT. While several attempts have been made to develop CT-based grading systems for OA severity, no widely accepted standard currently exists (99,100). Due to the limited availability of large, publicly accessible CT datasets for training, some studies have generated digitally reconstructed radiographs (DRRs) from CT images, creating 2D summation images that resemble conventional X-rays. These DRRs, combined with radiographs from the CHECK study, were used to evaluate hip OA with a ResNet18 model trained via transfer learning, achieving a balanced accuracy of 82.2% and an AUROC of 0.93, as shown in Figure 4B (93). Another study utilized a Vision Transformer model on CT-based DRRs to classify hip OA severity based on the Crowe and KL grading systems, achieving a one-neighbor class accuracy of 0.95 (33). CNNs have also been applied to characterize the cartilage matrix visualized in PCI-CT (101,102). Additionally, a study examined changes in the trabecular microstructure of the femoral head in hip OA using micro-CT images and a graph theory analysis approach. This study employed a CNN model that classified the images by inputting both the micro-CT images and the extracted trabecular network features, achieving an accuracy of 96.5% (103). Furthermore, DL techniques have also been applied to CT imaging for the diagnosis of temporomandibular joint OA (104,105).
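Conceptually, a DRR collapses the 3D attenuation volume onto a 2D plane by summing along rays. The heavily simplified numpy sketch below approximates this with a parallel mean-intensity projection; real DRR pipelines cast diverging rays from a virtual source and model Beer-Lambert attenuation, so this is a conceptual illustration only:

```python
import numpy as np

def simple_drr(volume: np.ndarray, axis: int = 1) -> np.ndarray:
    """Approximate a digitally reconstructed radiograph by averaging
    attenuation values along parallel rays (mean-intensity projection),
    then normalizing to [0, 1] for display."""
    proj = volume.mean(axis=axis)
    lo, hi = proj.min(), proj.max()
    return (proj - lo) / (hi - lo) if hi > lo else np.zeros_like(proj)

# Toy CT volume: 64^3 voxels with a bright "bone" cylinder along z.
vol = np.zeros((64, 64, 64))
_, yy, xx = np.indices(vol.shape)
vol[(yy - 32) ** 2 + (xx - 32) ** 2 < 100] = 1.0

drr = simple_drr(vol, axis=1)  # project along y -> a 64x64 2D image
print(drr.shape)
```

The resulting 2D summation image is what resembles a conventional radiograph and can then be fed to an X-ray-trained CNN.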

MRI

MRI offers comprehensive, multiparametric visualization of articular structures, capturing both morphological and biochemical changes. To fully harness this high-dimensional data, DL methodologies have been extensively integrated into OA diagnostic workflows. Diverging from conventional handcrafted feature extraction, DL architectures excel at modeling complex 3D spatial anatomy and synthesizing multimodal MRI sequences, facilitating the sensitive detection of subtle structural and biochemical alterations in early-stage OA (106). One study employed a 3D CNN for the automatic detection and staging of meniscal and patellofemoral cartilage lesions; by integrating 3D deep features with an RF classifier in an ensemble framework, the model achieved a classification accuracy of 80.74% (107). Similarly, other studies demonstrated that combining CNN-derived features with RF or SVM classifiers could yield accuracies up to 86.0% for knee OA grading (108). Advancing toward end-to-end learning, transfer learning models have demonstrated even higher performance. For example, a DenseNet201-based MRI architecture, employing a two-stage block-wise fine-tuning strategy, achieved a 92.1% accuracy and a 0.96 sensitivity for knee OA detection (109). Furthermore, 3D CNNs applied to full double-echo steady-state (DESS) sequences attained a test accuracy of 83.0%, as illustrated in Figure 4C (19). More recently, novel architectures have further advanced MRI-based OA assessment. An improved hybrid quantum CNN has achieved a classification accuracy of up to 98.36% for knee OA (110), whereas a hybrid 3D CNN and 3D Vision Transformer model operating on 3D knee MRI attained 90.46% accuracy in classifying cartilage degeneration across multiple stages (111). Beyond the knee, DL-based MRI analysis has proven effective for hip OA. An MRNet-based approach with data augmentation and class-weighted loss has been used to classify cartilage lesions, bone marrow edema, and subchondral cysts, improving balanced accuracy from 53%, 71%, and 56% to 60%, 73%, and 68%, respectively (112).

Driven by the transition from handcrafted features to end-to-end learning, DL has redefined the accuracy benchmarks for OA lesion detection and automated segmentation across all major modalities. While X-ray-based CNNs continue to anchor the current literature owing to the ubiquity of large-scale datasets, the integration of Transformer-based attention mechanisms (113), generative paradigms for data synthesis (114,115), and heterogeneous multi-modal (116) ensembles defines the evolving research trajectory in musculoskeletal AI. These advancements not only enhance the robustness and generalizability of computer-aided diagnosis systems across diverse clinical environments but also underscore the growing potential of data-driven paradigms to provide clinically actionable insights that transcend the inherent limitations of conventional visual assessment.

Assisted medical imaging diagnosis for OP

ML for OP imaging diagnosis

Traditional ML approaches play an important role in OP imaging diagnosis by leveraging radiomic features extracted from medical images. These features, encompassing shape, greyscale intensity, and texture characteristics, serve as inputs for statistical and pattern recognition models for the classification and prediction of OP (117). Furthermore, integrating imaging features with biochemical markers enables ML models to achieve high diagnostic accuracy, highlighting their ability to integrate heterogeneous data sources in support of clinical decision-making (118,119).

X-ray

Early applications of ML in OP diagnosis using X-ray images focused on extracting trabecular bone features from hip radiographs, including structural boundaries, orientation, and joint architecture, with DXA as the reference standard. Through five-fold cross-validation, an average accuracy of 90% and a sensitivity of 90% were achieved (120). Subsequent studies incorporated additional statistical and texture features and evaluated multiple ML classifiers, reporting diagnostic accuracies exceeding 95% (121). Notably, the SVM attained a classification accuracy of 97.87%, with a sensitivity and specificity of 100% and 95.74%, respectively, underscoring the robust potential of ML in interpreting radiographic images. Beyond hip radiographs, periapical dental radiographs have also been investigated for OP screening. A histogram-based trabecular segmentation framework combined with ML classifiers demonstrated effective OP detection, with K-means segmentation combined with an MLP achieving the best performance (accuracy 91.67%, specificity 90.00%, sensitivity 93.33%), as illustrated in Figure 5A (122). Traditional ROI segmentation methods are often constrained by limited accuracy and poor generalizability. Consequently, several studies have proposed DL-based ROI segmentation techniques and reported improved practical performance (125,126). For example, a U-Net-based femur segmentation method achieved an accuracy of 97.50%. Texture features extracted from femur X-rays were then used as input to multiple ML classifiers. An artificial neural network (ANN) achieved the highest accuracy (95.83%) and recall (100%), while SVM achieved the highest specificity (62.50%) (127). These findings indicate that combining DL-based segmentation with ML classifiers can enhance the accuracy and robustness of OP imaging diagnosis.
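The accuracy, sensitivity, and specificity figures quoted throughout are all derived from the binary confusion matrix. A minimal sketch on hypothetical screening labels (the prediction vectors below are invented for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical results of a binary OP screen (1 = osteoporotic).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # recall for the diseased class
specificity = tn / (tn + fp)  # true-negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(sensitivity, round(specificity, 3), accuracy)
```

In screening settings, sensitivity is usually prioritized (missed osteoporotic patients are costlier than false alarms), which is why studies report it alongside overall accuracy.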

Figure 5 Methodological frameworks and diagnostic performance of image-based ML models for OP assessment. (A) Workflow overview, representative K-means segmentation results, and performance of the evaluated ML classifiers. Adapted from Widyaningrum et al., International Journal of Dentistry, 2023, CC BY 4.0 (122). (B) GBM model schematic, abdominal CT images examples, and performance metrics of various classifiers. Adapted from Huang et al., BMC Geriatrics, 2022, CC BY 4.0 (123). (C) Overall procedure flowcharts, examples of geometrical features, and classification accuracy distribution over multiple repetitions. Adapted from Najafi et al., Sensors, 2023, CC BY 4.0 (124). CT, computed tomography; GBM, gradient boosting machine; ML, machine learning; MLP, multilayer perceptron; OP, osteoporosis; RBF, radial basis function; SVM, support vector machine.
CT

For OP diagnostic assessment utilizing CT imaging, preprocessing via ROI segmentation remains a fundamental prerequisite. Features encompassing BMD, trabecular bone microarchitecture, and morphological parameters are typically extracted and integrated into ML models. For example, in opportunistic screening via abdominal CT, ROIs delineated along the bilateral psoas muscle margins can yield 826 radiomic features. Following dimensionality reduction via the LASSO algorithm, a subset of discriminative features is typically fed into ML classifiers. Among these, the gradient boosting machine (GBM) model has demonstrated superior performance, yielding an AUROC of 0.86, sensitivity of 0.70, and specificity of 0.92, as illustrated in Figure 5B (123). Furthermore, the integration of radiomic signatures with clinical covariates, quantified via metrics such as the Pearson correlation coefficient (r), has enhanced diagnostic efficacy, with LR-based models achieving an AUROC of 0.962 (128). While earlier methodologies relied on manual ROI segmentation (129-131), recent investigations have pivoted toward automated segmentation frameworks. For instance, automatic proximal femur segmentation in abdominopelvic CT scans has achieved a success rate of 99.7%, which, when coupled with an RF classifier, attained a diagnostic AUROC of 0.946 (132). The clinical significance of precise OP diagnosis lies in the prevention of fragility fractures, most prevalent in the hip, vertebrae, and radius. Although DXA remains the clinical standard for BMD quantification (133), it is constrained by its 2D nature, an inability to distinguish degenerative changes, and the lack of 3D vertebral morphology assessment. Conversely, QCT facilitates 3D BMD evaluation and offers superior precision in risk assessment. When integrated with shape-model matching algorithms, QCT enables the automated localization and segmentation of vertebrae for the assessment of osteoporotic VFs, achieving an AUROC of 0.88 (134).
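The LASSO-then-GBM pattern can be sketched with scikit-learn, using an L1-penalized linear model for embedded feature selection and a gradient boosting classifier downstream. The 300×200 radiomic matrix and its informative-feature structure below are simulated assumptions, not the 826-feature data of the cited study:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# Synthetic radiomic matrix: 300 scans x 200 features, five informative.
n, p = 300, 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :5] += y[:, None]  # hypothetical informative features

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=3)

# L1-penalized (LASSO) fitting shrinks most coefficients to exactly zero,
# leaving a compact signature for the downstream GBM classifier.
selector = SelectFromModel(LassoCV(cv=5, random_state=3)).fit(X_tr, y_tr)
gbm = GradientBoostingClassifier(random_state=3).fit(
    selector.transform(X_tr), y_tr)
proba = gbm.predict_proba(selector.transform(X_te))[:, 1]
print(selector.transform(X_tr).shape[1], round(roc_auc_score(y_te, proba), 3))
```

The two-stage design keeps the classifier from overfitting the hundreds of mostly uninformative radiomic features, which is the practical motivation for LASSO reduction in these pipelines.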

MRI

MRI provides unique diagnostic insights into OP, as the signal intensity in bone marrow sequences is significantly modulated by the relative proportions of adipose tissue, proteins, water content, and cellular components. For instance, radiomic features extracted from lumbar spine MRI were used to construct a multivariate LR model after feature selection, achieving an AUROC of 0.797 in differentiating healthy individuals from osteoporotic patients (135). Concurrently, MRI-derived geometric and textural features of the proximal femur have been evaluated using diverse ML classifiers coupled with genetic algorithm-based feature selection. The optimal SVM model attained a classification accuracy of 89.08%, as illustrated in Figure 5C (124). Beyond unimodal assessments, recent investigations have integrated MRI with CT or X-ray imaging to enhance OP assessment through multimodal feature fusion (136,137). Further research indicates that integrating MRI with clinical risk assessment tools such as the Fracture Risk Assessment Tool (FRAX) can enhance the predictive performance of ML algorithms beyond that achieved by either modality alone. This synergy implies that MRI-based microstructural surrogates provide critical complementary data, thereby refining diagnostic precision and supporting more personalized therapeutic strategies for individuals at high risk of fragility fractures (38).

In summary, ML-assisted approaches for OP imaging have expanded beyond 2D densitometry toward more comprehensive assessments of bone quality and strength. By leveraging high-dimensional radiomic features, these models can help address several limitations of DXA by capturing imaging-derived surrogates of trabecular architecture and marrow-related patterns that are not readily appreciable on routine visual assessment. Notably, the integration of ML with opportunistic CT screening and QCT-based volumetric analysis supports earlier fracture risk stratification. The transition from manual ROI delineation to automated radiomics pipelines, often coupled with feature selection strategies, may enable more objective and individualized estimation of bone fragility. Ultimately, these developments have the potential to facilitate earlier identification and management of patients at elevated risk of fragility fractures.

DL for OP imaging diagnosis

DL has substantially enhanced the capabilities of OP imaging diagnosis by addressing a key limitation of traditional ML: the reliance on manual feature engineering. Because CNNs learn features directly from images, researchers can achieve more consistent ROI segmentation and feature derivation. Hybrid models that combine DL-based extraction with ML classification have demonstrated improved performance across various imaging modalities (138).

X-ray

Hip fractures, frequently the most debilitating complication of OP, typically necessitate CT or MRI for definitive clinical evaluation. Nevertheless, DL algorithms, particularly CNNs, have demonstrated remarkable efficacy in identifying these fractures using plain radiographs alone. For example, a VGG16 model trained on pelvic radiographs yielded a detection accuracy of 95.5% for intertrochanteric hip fractures, with an AUROC of 0.984 (139). Similarly, an InceptionV3 architecture utilizing transfer learning attained a fracture detection accuracy of 96.9% and an AUROC of 0.9944 (140). Despite such high diagnostic performance, many early classifiers remained black-box systems with limited spatial localization capabilities. To address this, an automated DenseNet-based pipeline was developed to process full-sized radiographs directly. By obviating the requirement for a predefined localization subnetwork, this end-to-end system achieved a sensitivity of 98% and an accuracy of 91%, streamlining the diagnostic workflow without compromising clinical sensitivity (141). While most DL models for OP diagnosis concentrate on the hip (142), spine (138), and hands (143), detection using knee radiographs remains relatively unexplored (144). A pioneering study applied transfer learning to four CNN architectures, including AlexNet, VGG-16, ResNet, and VGG-19, trained on knee X-ray images. Among these, AlexNet attained the highest classification accuracy of 91% (145). Furthermore, a hip OP detection study reported that, among X-ray-only models, EfficientNet-B3 and GoogLeNet delivered the strongest performance. Notably, incorporating clinical variables further improved diagnostic performance, with EfficientNet-B3 achieving an AUROC of 0.9374 and an accuracy of 0.8805, as illustrated in Figure 6A (17).

Figure 6 Methodological frameworks and diagnostic performance of image-based DL models for OP assessment. (A) Network architecture, representative hip radiograph, and performance of the two best-performing CNN models. Adapted from Yamamoto et al., Biomolecules, 2020, CC BY 4.0 (17). (B) Schematic overview of the CNN architecture, representative spinal DXA images, and fracture prediction performance. Adapted from Nissinen et al., Bone Reports, 2021, CC BY 4.0 (146). (C) 3D U-Net with dense blocks, representative VOIs for BMD measurement, and diagnostic performance of the developed system. Adapted from Pan et al., European Radiology, 2020, CC BY 4.0 (18). 3D, three-dimensional; BMD, bone mineral density; CNN, convolutional neural network; DL, deep learning; DXA, dual-energy X-ray absorptiometry; OP, osteoporosis; VOIs, volumes of interest.
DXA

DXA remains the clinical gold standard for diagnosing OP, providing quantitative measurement of areal BMD at the lumbar spine and proximal femur. It facilitates classification into normal, osteopenic, or osteoporotic categories primarily based on T-scores. Recent advances in DL-based systems have enabled fully automated OP detection by utilizing DXA as the reference standard (147). For example, CNNs have been trained using DXA-derived BMD values serving as labels for learning. The training process incorporates techniques such as image augmentation (e.g., noise addition, translation, and rotation) and batch normalization to mitigate overfitting. The model achieved r values of 0.852 and 0.840, with corresponding AUROCs of 0.965 and 0.970 on the internal and external validation sets, respectively (148). Furthermore, CNNs have also been applied to detect VFs from lateral thoracolumbar spine images, achieving an AUROC of 0.94 and a sensitivity of 87.4% (149). In other studies, DXA images were used to predict fragility fractures; however, CNNs trained on these images attained an AUROC of only 0.63, showing only marginal improvement over baseline predictions based on lumbar spine (AUROC =0.62) or hip BMD T-scores (AUROC =0.62), as illustrated in Figure 6B (146).
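The augmentation techniques mentioned above (noise addition, translation, rotation) can be sketched in plain numpy. Note that real pipelines typically apply interpolated small-angle rotations (e.g., via an image-processing library) rather than the coarse 90° rotations used here for simplicity; all parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def augment(img: np.ndarray) -> np.ndarray:
    """Apply illustrative augmentations to a square 2D image patch."""
    out = img + rng.normal(0, 0.02, img.shape)       # additive Gaussian noise
    out = np.roll(out, rng.integers(-3, 4), axis=1)  # small lateral translation
    out = np.rot90(out, k=rng.integers(0, 4))        # random 90-degree rotation
    return out

img = rng.random((64, 64))  # stand-in for a DXA image patch
batch = np.stack([augment(img) for _ in range(8)])
print(batch.shape)
```

By presenting the network with many perturbed variants of each scan, augmentation effectively enlarges the training set and discourages overfitting to acquisition-specific details.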

QCT

DL algorithms integrated with QCT facilitate the precise assessment of volumetric BMD and trabecular microarchitecture, thereby enhancing fracture risk prediction. By providing high-resolution 3D imaging, QCT distinguishes between cortical and trabecular bone, enabling DL models to identify subtle structural alterations associated with early-stage OP. Despite these advantages, broader clinical adoption remains hindered by radiation exposure, high cost, and limited equipment availability. Consequently, recent efforts have focused on opportunistic screening using DL-based BMD quantification from routine CT scans. For example, a system combining a U-Net model and a 3D-CNN was developed to segment vertebral bodies and predict BMD from low-dose chest CTs, achieving a Dice similarity coefficient (DSC) of 86.6% and a strong correlation with reference QCT values (coefficient of determination, R2=0.964–0.968), alongside an AUROC of 0.927 for OP detection, as illustrated in Figure 6C (18). Furthermore, a fully automated deep convolutional neural network (DCNN) method for joint segmentation and BMD measurement has exhibited an excellent correlation with QCT references (r>0.98) (39). Beyond densitometry, hybrid architectures have advanced fracture assessment; for example, a ResNet34 backbone integrated with a long short-term memory (LSTM) sequence classifier achieved 89.2% accuracy in diagnosing VFs across multi-regional CT datasets (150).
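The DSC used to evaluate such segmentation systems measures voxel-wise overlap between the predicted and reference masks. A minimal sketch on toy binary masks:

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks:
    DSC = 2|A & B| / (|A| + |B|); defined as 1.0 for two empty masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# Toy vertebral masks: ground truth vs. a slightly shifted prediction.
gt = np.zeros((32, 32), dtype=bool)
gt[8:24, 8:24] = True          # 16x16 square, area 256
pred = np.roll(gt, 2, axis=1)  # prediction shifted by 2 pixels
print(round(dice(gt, pred), 3))  # -> 0.875
```

A DSC of 86.6%, as reported above, thus corresponds to predicted vertebral contours that overlap the reference masks to a degree comparable to this small-misalignment example.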

MRI

MRI allows high-resolution visualization of bone marrow composition and trabecular integrity without exposing patients to ionizing radiation. Although not yet a routine tool for the clinical diagnosis of systemic OP, MRI provides critical structural insights that complement BMD measurements, showing significant potential for enhancing fracture risk prediction in both research and clinical trials. Recent studies have further highlighted the potential of MRI combined with DL for OP detection. For example, one investigation applied CNN models to T1-weighted, STIR, and T2-weighted lumbar MRI sequences in conjunction with BMD measurements, reporting that T2-weighted images achieved the best diagnostic performance, with an accuracy of 88.5% (151). Another study introduced a DCNN-based approach for OP classification, attaining an accuracy of 96.57% via optimization with the squirrel search algorithm (152). Furthermore, in the context of differential diagnosis, a CNN ensemble utilizing transfer learning achieved an AUROC of 0.9762 when distinguishing between avascular necrosis and transient OP of the hip (153).

In summary, DL has transitioned OP imaging diagnosis from feature-dependent pipelines toward automated, representation-based frameworks across radiographs, DXA, CT/QCT, and MRI. A prominent trend is the integration of localization, segmentation, and downstream prediction, such as BMD estimation and fracture detection, within unified models to minimize manual intervention. Crucially, these approaches enable opportunistic screening from routine clinical examinations, effectively complementing traditional densitometry with high-dimensional image representations. Collectively, these advancements facilitate a more comprehensive and objective evaluation of bone health and fracture risk.

Summary

Both ML and DL have contributed to advances in OA and OP imaging-based assessment by optimizing feature extraction, elevating classification precision, and mitigating the subjectivity inherent in manual interpretation. While ML methodologies remain indispensable in scenarios characterized by restricted datasets and expert-defined features, DL architectures have demonstrated superior performance in large-scale analysis through their autonomous acquisition of hierarchical representations. Despite this progress, several challenges persist, most notably the dependency on high-quality annotated data, substantial computational overhead, and the challenge of maintaining diagnostic robustness across heterogeneous imaging environments. To circumvent these hurdles, current research is converging upon hybrid methodologies that synergize the interpretability of ML with the representational power of DL, alongside strategies such as transfer learning and data augmentation to address data scarcity and class imbalance. Furthermore, the strategic utilization of large-scale public repositories is becoming a cornerstone of robust model development.

Critically, a significant disconnect remains between algorithmic excellence and clinical translation; very few validated models have transitioned from experimental stages into routine radiological workflows. The deployment of AI-driven tools in everyday practice, particularly in resource-constrained regions with limited radiological expertise, remains a paramount objective. Future initiatives should therefore prioritize bridging the gap between algorithmic development and clinical deployment, ensuring that intelligent imaging systems not only augment diagnostic accuracy but also substantively improve the accessibility and global standard of patient care.


Discussion

Multimodal fusion in ML-assisted diagnosis of OA and OP

Multimodal medical image fusion

The evaluation of OA and OP often relies on the use of multiple imaging modalities, each offering complementary information: MRI is particularly effective in visualizing soft tissues, CT provides detailed assessment of bone density and microarchitecture, and X-ray imaging captures skeletal morphology. As no single modality comprehensively characterizes all pathological features, diagnostic accuracy can be enhanced through multimodal medical image fusion (MMIF), which integrates heterogeneous image data to improve sensitivity and reduce diagnostic uncertainty. One example of successful MMIF is the combination of dynamic contrast-enhanced MRI and high-resolution peripheral QCT to detect vessels within cortical bone (154). This approach integrates the high-resolution structural data provided by high-resolution peripheral QCT with the dynamic vascular information from dynamic contrast-enhanced MRI, enabling a more comprehensive analysis of bone porosity and the vascular features associated with it.

DL has accelerated progress in MMIF. CNNs are widely used for feature extraction, improving segmentation and classification performance (155,156). For example, feeding X-ray images, MRI, and clinical patient information into a single CNN framework to classify OA severity improved classification accuracy from 54% (MRI alone) and 70% (X-ray alone) to 76%, with an AUROC of 0.964 (157). To address the limited receptive fields of CNNs, Transformer-based models with self-attention mechanisms have been introduced. These models excel at capturing long-range and cross-modal dependencies, and hybrid CNN-Transformer frameworks have shown considerable promise in medical image fusion (158). Compared with natural images, multimodal medical images exhibit more explicit and clinically relevant long-range dependencies, making effective fusion strategies particularly crucial.

Despite these advancements, several challenges remain. Most current research focuses on brain imaging, resulting in a scarcity of multimodal datasets for musculoskeletal conditions such as OA and OP. Developing and expanding disease-specific multimodal datasets is essential for improving diagnostic accuracy. Furthermore, standardized evaluation criteria for assessing fusion effectiveness are still lacking, which hinders comparability across studies. Technical issues, including registration errors and fusion artifacts, also remain unresolved. Addressing these limitations will be critical for advancing multimodal fusion methods and their robust integration into clinical practice.

Multimodal data fusion

Most current AI applications in medicine rely on single-modal data, such as imaging alone, for the diagnosis of OA and OP. However, in clinical practice, physicians make diagnostic decisions based on integrated evidence, including medical imaging, laboratory results, and clinical records. For OP, combining hip X-ray images with clinical covariates has been shown to enhance the performance of CNNs, increasing accuracy from 0.8407 to 0.8850 and the AUROC from 0.9203 to 0.9374 (17).
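The image-plus-covariates design in the study above (17) can be illustrated with a minimal late-fusion sketch: features extracted from the image branch are concatenated with clinical covariates and scored by a single linear layer. Every feature value and weight below is hypothetical, chosen only to make the mechanics concrete.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fused_risk(image_features, clinical_covariates, weights, bias):
    """Late fusion: concatenate image-derived features with clinical
    covariates, then score the joint vector with one linear layer."""
    fused = image_features + clinical_covariates
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return sigmoid(z)

# Hypothetical inputs: 3 CNN-style image features + [age/100, BMI/40]
image_features = [0.62, -0.10, 0.33]
clinical = [0.71, 0.55]                 # e.g. 71 years, BMI 22
weights = [1.2, -0.4, 0.8, 1.5, -0.9]   # illustrative, not trained
prob = fused_risk(image_features, clinical, weights, bias=-1.0)
```

In the published model, the image features would come from a trained convolutional backbone and the weights from end-to-end training; only the fusion step is sketched here.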

A key advantage of multimodal learning is its ability to integrate heterogeneous data sources into a unified architecture without the need for separate processing streams. Ideally, a single model aligns diverse inputs, such as images, sensor signals, and clinical text, into a shared representational space, supporting flexible diagnostic inference (159). Research in this field has evolved from using CNNs for images and recurrent neural networks (RNNs) for text to Transformer-based architectures, which have demonstrated strong performance across multiple modalities (160). Traditional fusion approaches, such as early and late fusion, rely on separate feature extraction followed by simple combination. These methods are limited in capturing complex cross-modal correlations and often require cumbersome preprocessing of text. In contrast, Transformer-based frameworks integrate medical images, unstructured clinical complaints, and structured electronic health records data through bidirectional attention mechanisms, eliminating the need for labor-intensive preprocessing (161). This approach achieved an AUROC of 0.924 in pulmonary disease diagnosis, representing a 12% improvement over models using X-ray images alone. Similarly, an end-to-end multimodal Transformer model can be used to predict the progression of knee OA (162). Additionally, Transformer architectures can leverage large unlabeled datasets through self-supervised pretraining, a crucial advantage given the scarcity of annotated medical data. Developing cross-modal pretraining strategies followed by task-specific fine-tuning on limited labeled data presents a promising direction for building high-performance multimodal frameworks.
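The bidirectional attention mechanisms described above build on scaled dot-product attention, in which tokens from one modality weight and aggregate tokens from another. A pure-Python sketch with toy two-dimensional vectors (not a clinical model):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query (e.g. an image token)
    attends over keys/values from another modality (e.g. EHR tokens)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# One image token attending over two clinical-text tokens (toy vectors)
img_tokens = [[1.0, 0.0]]
txt_keys   = [[1.0, 0.0], [0.0, 1.0]]
txt_values = [[5.0, 0.0], [0.0, 5.0]]
fused = cross_attention(img_tokens, txt_keys, txt_values)
```

Because the image token is more similar to the first text key, its output is dominated by the first value vector; stacking such layers in both directions gives the bidirectional fusion described above.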

Overall, multimodal data fusion is a promising avenue for improving diagnostic accuracy and supporting clinical decision-making in OA and OP. Compared to single-modality approaches, multimodal frameworks offer complementary perspectives by integrating anatomical, functional, and clinical information, thereby reducing diagnostic uncertainty. However, several challenges remain, including limited data availability, a lack of standardized multimodal datasets, and the need for robust algorithms that can generalize across diverse institutions and patient populations. Future research should prioritize the establishment of large-scale, well-annotated multimodal cohorts, the development of standardized evaluation protocols, and the validation of multimodal frameworks in prospective clinical settings. Addressing these challenges will be essential to bridge the gap between research and clinical application, ultimately enabling multimodal AI systems to support routine diagnosis and personalized treatment strategies for OA and OP.

Predicting disease progression of OA and OP using AI

ML and DL algorithms are applied not only to classification tasks, such as diagnosing OA and OP and grading disease severity, but also to regression-based prediction of disease progression. Accurate forecasting of OA and OP development enables healthcare professionals to anticipate disease trajectories, implement personalized interventions, reduce healthcare costs, and improve patient selection in clinical trials, thereby potentially facilitating the development of new therapies.

For OA, ML and DL approaches have been explored for predicting disease incidence, the progression of joint degeneration, and the likelihood of future surgical intervention (163). For example, one ML-based approach utilized principal component analysis to reduce the dimensionality of MRI-derived cartilage injury indices before evaluating performance across multiple algorithms, including ANN, SVM, RF, and Naive Bayes. Among these, the RF algorithm demonstrated superior efficacy in predicting medial JSN, yielding an AUROC of 0.761 and an F-measure of 0.743 (164). Moving beyond traditional ML, DL models enable the direct integration of raw radiographic data, clinical examination results, and the patient's medical history to predict OA progression, achieving an AUROC of 0.79 (165). In a related direction, TransUNet-based radiographic segmentation enabled automated JSW measurement (DSC =0.889), and an XGBoost regressor integrating JSW and clinical variables achieved strong 72-month JSW prediction (mean absolute error =0.48) (166). Furthermore, CNN-based architectures have proven highly effective in predicting the 9-year risk of total knee arthroplasty, significantly outperforming baseline models with an AUROC of 0.87 (167).
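The dimensionality-reduction step in the first study above, PCA over MRI-derived indices, can be sketched for the two-feature case, where the 2x2 covariance matrix has a closed-form eigendecomposition. The data points below are synthetic, standing in for a pair of correlated cartilage measurements.

```python
import math

def principal_axis(points):
    """First principal component of 2D data via the closed-form
    eigendecomposition of the 2x2 covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Covariance matrix entries [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Largest eigenvalue of the symmetric 2x2 matrix
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Corresponding eigenvector, normalized to unit length
    v = (sxy, lam - sxx) if sxy != 0 else (1.0, 0.0)
    norm = math.hypot(*v)
    # Return the axis and the fraction of total variance it explains
    return (v[0] / norm, v[1] / norm), lam / (sxx + syy)

# Synthetic "cartilage indices": two strongly correlated measurements
pts = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1)]
axis, explained = principal_axis(pts)
```

Projecting each point onto `axis` collapses the two correlated features into one score while retaining almost all of the variance, which is the purpose PCA serves before the downstream classifiers in (164).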

In OP management, conventional risk assessment tools such as FRAX (168) estimate 10-year fracture probability based on clinical variables; however, their efficacy is often constrained by static modeling assumptions. In contrast, ML and DL frameworks offer superior predictive accuracy by synthesizing multi-dimensional datasets. For instance, XGBoost models integrating biochemical markers with BMD data have demonstrated high discriminative power with an AUROC of 0.848 (169). Beyond clinical markers, ANN and SVR architectures have been successfully applied to QCT-derived imaging to evaluate vertebral strength (170), while lifestyle-integrated ANN models have proven effective in assessing OP risk among postmenopausal women (171). Notably, LightGBM can accurately identify low BMD (AUROC =0.961) using only non-invasive variables, such as anthropometrics and laboratory results, thereby potentially bypassing the requirement for DXA (172).
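The XGBoost and LightGBM models cited above share one core idea: weak learners are fitted sequentially to the residuals of the current ensemble. A toy sketch of this gradient-boosting loop with decision stumps on a synthetic one-feature risk problem (all values invented, and none of the regularization that production libraries add):

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one feature (minimizes SSE)."""
    best = None
    for t in sorted(set(x)):
        left  = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # threshold, left prediction, right prediction

def boost(x, y, rounds=200, lr=0.3):
    """Toy gradient boosting for squared error: start from the mean,
    then repeatedly fit a stump to the residuals and add it, shrunk by lr."""
    pred = [sum(y) / len(y)] * len(x)
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        t, lv, rv = fit_stump(x, resid)
        pred = [p + lr * (lv if xi <= t else rv) for xi, p in zip(x, pred)]
    return pred

# Synthetic data: "age" vs. a risk score with a step around 60
x = [45, 50, 55, 60, 65, 70, 75, 80]
y = [0.1, 0.1, 0.2, 0.2, 0.8, 0.9, 0.9, 1.0]
pred = boost(x, y)
```

Real implementations boost over many features with regularized trees and gradient/Hessian statistics, but the residual-fitting loop is the mechanism that lets these models capture the non-linear interactions that static tools such as FRAX cannot.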

Notwithstanding these advancements, current prediction models exhibit several limitations. Most existing approaches fail to capture the temporal dynamics of disease progression, as the application of time-series architectures (e.g., RNNs) to longitudinal data remains sparse. Additionally, there is a disproportionate reliance on imaging features, while the integration of multi-modal inputs, including genomics, biomarkers, and treatment history, remains underexplored. Future research should prioritize the development of longitudinal, multi-modal predictive frameworks, enhance model explainability, and conduct cross-population validation to ensure clinical robustness and generalizability.

Barriers to clinical translation into routine practice

Despite the burgeoning potential of ML and DL in musculoskeletal imaging, their integration into routine clinical practice for OA and OP remains limited, highlighting a persistent disparity between experimental research and clinical implementation. A primary bottleneck stems from data-related constraints. Most existing models are trained on homogeneous datasets from controlled cohorts, such as the OAI or MOST. Consequently, these models often experience performance degradation when applied to heterogeneous real-world data due to variations in imaging hardware, acquisition protocols, and patient demographics, all of which compromise generalizability and introduce algorithmic bias. Furthermore, the prevalence of imbalanced datasets, characterized by the underrepresentation of early-stage disease and minority populations, hampers the development of robust, universally applicable diagnostic frameworks.

Beyond data limitations, technical and infrastructural hurdles significantly impede clinical adoption. The black-box nature of DL architectures, particularly CNNs, results in a lack of transparency that fosters clinician skepticism in high-stakes scenarios such as fracture risk assessment. Moreover, most current AI systems lack the interoperability required for seamless integration into existing clinical infrastructures, such as hospital information systems, picture archiving and communication systems, and electronic health records. Deployment often necessitates bespoke software pipelines, specialized personnel training, and extensive workflow reconfiguration, collectively escalating implementation overheads and disrupting established clinical protocols. These challenges are further compounded by concerns regarding computational scalability and the inability of complex models to provide real-time insights in high-throughput clinical settings.

At the regulatory and societal levels, the clinical translation of AI-assisted diagnostics faces equally significant challenges. Regulatory frameworks for AI-driven tools are still in a state of flux, with persistent uncertainties surrounding standardized benchmarks for safety, clinical efficacy, and liability attribution. Furthermore, patient privacy and data security remain critical ethical imperatives, particularly as model development and validation necessitate large-scale, multi-institutional data sharing. Financial viability also presents a significant barrier, as reimbursement paradigms for AI-integrated diagnostics remain largely undefined across most healthcare systems. This lack of clarity regarding cost-effectiveness and long-term sustainability, coupled with the absence of robust financial incentives, often deters healthcare institutions from investing in these emerging technologies.

Addressing these multifaceted barriers necessitates orchestrated efforts among researchers, clinicians, and policymakers. Expanding multicenter studies across diverse populations is imperative to bolster model generalizability and mitigate inherent biases. Concurrently, the development of interpretable and explainable AI is essential to foster clinical confidence and bridge the transparency gap between algorithmic outputs and medical decision-makers. The harmonization of imaging protocols, data formats, and evaluation metrics will further enhance reproducibility and cross-institutional interoperability. Finally, establishing collaborative frameworks that unite healthcare providers, regulatory bodies, and industry stakeholders is vital for defining clear regulatory pathways, ensuring stringent data privacy, and designing sustainable reimbursement models. Such integrated, multi-stakeholder strategies are essential for AI-driven imaging systems to transition from experimental paradigms into reliable, routinely integrated tools for the diagnosis and longitudinal management of OA and OP.


Conclusions

The rapid advancement of ML and DL has significantly enhanced imaging-based diagnosis and management of OA and OP. These technologies have shown great potential in automating feature extraction, improving diagnostic accuracy, and enabling earlier detection of disease progression. However, translating these research breakthroughs into routine clinical practice requires overcoming several persistent challenges. Based on this narrative review, four key future research directions are identified.

DL algorithms require large, high-quality annotated datasets to achieve robust performance. However, publicly available imaging datasets for OA and OP remain limited. Expanding dataset size and diversity through collaborative initiatives, global data sharing, and improved annotation strategies is crucial. Simultaneously, unsupervised and weakly supervised learning approaches offer promising alternatives to reduce reliance on manual labeling. Future research should also focus on multimodal image and data fusion, such as combining X-ray, MRI, CT, and clinical records, to enable a more comprehensive disease characterization and improve diagnostic precision.

Despite their high accuracy, many DL models function as black boxes, which limits their adoption in clinical settings where interpretability and accountability are vital. Advances in explainable AI, uncertainty quantification, and visualization techniques (e.g., saliency maps and attention mechanisms) will be critical for building trust among clinicians. Additionally, integrating ML and DL models into medical imaging devices and clinical workflows remains underexplored. Bridging this gap will facilitate the real-time application of algorithms for diagnosis, treatment planning, and clinical decision-making within actual healthcare environments.

Current models often exhibit limited generalizability due to dataset and domain shifts across different populations, imaging protocols, and healthcare systems. Strategies such as transfer learning, domain adaptation, and federated learning should be further developed to address this issue. These approaches can enhance model robustness while ensuring data privacy and security, both essential considerations in multicenter clinical collaborations.

The future of AI-assisted imaging for OA and OP extends beyond single-task applications toward comprehensive, end-to-end diagnostic systems. These systems could integrate lesion detection, disease grading, treatment decision support, and longitudinal monitoring across multiple anatomical sites and imaging modalities. By enabling continuous assessment and personalized treatment planning, these integrated systems have the potential to transform musculoskeletal care and extend their utility to broader medical applications.

In summary, ML and DL are poised to transform the imaging diagnosis of OA and OP by improving accuracy, efficiency, and accessibility. To fully realize this potential, future work must prioritize the development of diverse and representative datasets, enhance model interpretability, ensure generalizability across settings, and facilitate seamless integration into clinical workflows. With sustained progress, AI-powered imaging systems could evolve from research prototypes to essential tools in musculoskeletal medicine, ultimately improving patient outcomes worldwide.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-aw-2168/rc

Funding: This work was supported by the National Natural Science Foundation of Young Scholars of China (No. 82205147), the National Natural Science Foundation of China (Nos. 81970261 and 62472046), the Program of Natural Science Foundation of Guangdong Province, China (No. 2022A1515010385), the Guangdong Provincial Sports Bureau Scientific Research Project (No. GDSS2024N038), and the Research Project on Theory and Practice of Guangdong-Hong Kong-Macao Collaborative Development for the 15th National Games and Special Olympics (No. 2025GBA-524).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-aw-2168/coif). Y.Y. reports that this work was supported by the National Natural Science Foundation of Young Scholars of China (No. 82205147), the National Natural Science Foundation of China (Nos. 81970261 and 62472046), the Program of Natural Science Foundation of Guangdong Province, China (No. 2022A1515010385), the Guangdong Provincial Sports Bureau Scientific Research Project (No. GDSS2024N038), and the Research Project on Theory and Practice of Guangdong-Hong Kong-Macao Collaborative Development for the 15th National Games and Special Olympics (No. 2025GBA-524). The other authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Wan M, Gray-Gaillard EF, Elisseeff JH. Cellular senescence in musculoskeletal homeostasis, diseases, and regeneration. Bone Res 2021;9:41. [Crossref] [PubMed]
  2. Bijlsma JW, Berenbaum F, Lafeber FP. Osteoarthritis: an update with relevance for clinical practice. Lancet 2011;377:2115-26. [Crossref] [PubMed]
  3. Rachner TD, Khosla S, Hofbauer LC. Osteoporosis: now and the future. Lancet 2011;377:1276-87. [Crossref] [PubMed]
  4. Long H, Liu Q, Yin H, Wang K, Diao N, Zhang Y, Lin J, Guo A. Prevalence Trends of Site-Specific Osteoarthritis From 1990 to 2019: Findings From the Global Burden of Disease Study 2019. Arthritis Rheumatol 2022;74:1172-83. [Crossref] [PubMed]
  5. Cieza A, Causey K, Kamenov K, Hanson SW, Chatterji S, Vos T. Global estimates of the need for rehabilitation based on the Global Burden of Disease study 2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet 2021;396:2006-17. [Crossref] [PubMed]
  6. Bussières AE, Peterson C, Taylor JA. Diagnostic imaging practice guidelines for musculoskeletal complaints in adults--an evidence-based approach: introduction. J Manipulative Physiol Ther 2007;30:617-83. [Crossref] [PubMed]
  7. Lawford BJ, Bennell KL, Ewald D, Li P, De Silva A, Pardo J, Capewell B, Hall M, Haber T, Egerton T, Filbay S, Dobson F, Hinman RS. Effects of X-ray-based diagnosis and explanation of knee osteoarthritis on patient beliefs about osteoarthritis management: A randomised clinical trial. PLoS Med 2025;22:e1004537. [Crossref] [PubMed]
  8. Wang X, Ji C, Li S, Wang K, He M, Yu Z, Weng Y, Jiang W, Tang X, Guo D, Qin Y. Quantitative computed tomography provides improved accuracy for diagnosis of lumbar osteoporosis in patients with facet joint osteoarthritis: a cross-sectional study. Osteoporos Int 2025;36:1671-80. [Crossref] [PubMed]
  9. Sollmann N, Löffler MT, Kronthaler S, Böhm C, Dieckmeyer M, Ruschke S, Kirschke JS, Carballido-Gamio J, Karampinos DC, Krug R, Baum T. MRI-Based Quantitative Osteoporosis Imaging at the Spine and Femur. J Magn Reson Imaging 2021;54:12-35. [Crossref] [PubMed]
  10. Kohn MD, Sassoon AA, Fernando ND. Classifications in Brief: Kellgren-Lawrence Classification of Osteoarthritis. Clin Orthop Relat Res 2016;474:1886-93. [Crossref] [PubMed]
  11. Pham T, van der Heijde D, Altman RD, Anderson JJ, Bellamy N, Hochberg M, Simon L, Strand V, Woodworth T, Dougados M. OMERACT-OARSI initiative: Osteoarthritis Research Society International set of responder criteria for osteoarthritis clinical trials revisited. Osteoarthritis Cartilage 2004;12:389-99. [Crossref] [PubMed]
  12. Guglielmi G, Muscarella S, Bazzocchi A. Integrated imaging approach to osteoporosis: state-of-the-art review and update. Radiographics 2011;31:1343-64. [Crossref] [PubMed]
  13. Fleming RM, Fleming MR, Dooley WC, Chaudhuri TK. The importance of differentiating between qualitative, semi-quantitative, and quantitative imaging-close only counts in horseshoes. Eur J Nucl Med Mol Imaging 2020;47:753-5. [Crossref] [PubMed]
  14. Calivà F, Namiri NK, Dubreuil M, Pedoia V, Ozhinsky E, Majumdar S. Studying osteoarthritis with artificial intelligence applied to magnetic resonance imaging. Nat Rev Rheumatol 2022;18:112-21. [Crossref] [PubMed]
  15. Ma F, Sun T, Liu L, Jing H. Detection and diagnosis of chronic kidney disease using deep learning-based heterogeneous modified artificial neural network. Future Gener Comput Syst 2020;111:17-26.
  16. Rana M, Bhushan M. Machine learning and deep learning approach for medical image analysis: diagnosis to detection. Multimed Tools Appl 2022. [Epub ahead of print]. doi: 10.1007/s11042-022-14305-w.
  17. Yamamoto N, Sukegawa S, Kitamura A, Goto R, Noda T, Nakano K, Takabatake K, Kawai H, Nagatsuka H, Kawasaki K, Furuki Y, Ozaki T. Deep Learning for Osteoporosis Classification Using Hip Radiographs and Patient Clinical Covariates. Biomolecules 2020;10:1534. [Crossref] [PubMed]
  18. Pan Y, Shi D, Wang H, Chen T, Cui D, Cheng X, Lu Y. Automatic opportunistic osteoporosis screening using low-dose chest computed tomography scans obtained for lung cancer screening. Eur Radiol 2020;30:4107-16. [Crossref] [PubMed]
  19. Guida C, Zhang M, Shan J. Knee Osteoarthritis Classification Using 3D CNN and MRI. Appl Sci 2021;11:5196.
  20. Nishiyama D, Iwasaki H, Taniguchi T, Fukui D, Yamanaka M, Harada T, Yamada H. Deep generative models for automated muscle segmentation in computed tomography scanning. PLoS One 2021;16:e0257371. [Crossref] [PubMed]
  21. Hirvasniemi J, Klein S, Bierma-Zeinstra S, Vernooij MW, Schiphof D, Oei EHG. A machine learning approach to distinguish between knees without and with osteoarthritis using MRI-based radiomic features from tibial bone. Eur Radiol 2021;31:8513-21. [Crossref] [PubMed]
  22. Dimai HP. New Horizons: Artificial Intelligence Tools for Managing Osteoporosis. J Clin Endocrinol Metab 2023;108:775-83. [Crossref] [PubMed]
  23. Neubauer M, Moser L, Neugebauer J, Raudner M, Wondrasch B, Führer M, Emprechtinger R, Dammerer D, Ljuhar R, Salzlechner C, Nehrer S. Artificial-Intelligence-Aided Radiographic Diagnostic of Knee Osteoarthritis Leads to a Higher Association of Clinical Findings with Diagnostic Ratings. J Clin Med 2023;12:744. [Crossref] [PubMed]
  24. Kim YJ, Lee SR, Choi JY, Kim KG. Using Convolutional Neural Network with Taguchi Parametric Optimization for Knee Segmentation from X-Ray Images. Biomed Res Int 2021;2021:5521009. [Crossref] [PubMed]
  25. Maqsood S, Maqsood N, Shahid S, Subhan FE, Sarwar MA, Yousufi M, Qurthobi A, Zafar A, Khan MA, Damaševičius R, Maskeliūnas R. Knee osteoarthritis network: A hybrid transformer-based approach for enhanced detection and grading of knee osteoarthritis. Eng Appl Artif Intell 2025;159:111751.
  26. Brahim A, Jennane R, Riad R, Janvier T, Khedher L, Toumi H, Lespessailles E. A decision support tool for early detection of knee OsteoArthritis using X-ray imaging and machine learning: Data from the OsteoArthritis Initiative. Comput Med Imaging Graph 2019;73:11-8. [Crossref] [PubMed]
  27. Halilaj E, Le Y, Hicks JL, Hastie TJ, Delp SL. Modeling and predicting osteoarthritis progression: data from the osteoarthritis initiative. Osteoarthritis Cartilage 2018;26:1643-50. [Crossref] [PubMed]
  28. Ambellan F, Tack A, Ehlke M, Zachow S. Automated segmentation of knee bone and cartilage combining statistical shape knowledge and convolutional neural networks: Data from the Osteoarthritis Initiative. Med Image Anal 2019;52:109-18. [Crossref] [PubMed]
  29. Pedoia V, Lee J, Norman B, Link TM, Majumdar S. Diagnosing osteoarthritis from T(2) maps using deep learning: an analysis of the entire Osteoarthritis Initiative baseline cohort. Osteoarthritis Cartilage 2019;27:1002-10. [Crossref] [PubMed]
  30. Wang T, Liu H, Zhao W, Cao P, Li J, Chen T, Ruan G, Zhang Y, Wang X, Dang Q, Zhang M, Tack A, Hunter D, Ding C, Li S. Predicting knee osteoarthritis progression using neural network with longitudinal MRI radiomics, and biochemical biomarkers: A modeling study. PLoS Med 2025;22:e1004665. [Crossref] [PubMed]
  31. Zhao J, Jiang T, Lin Y, Chan LC, Chan PK, Wen C, Chen H. Adaptive Fusion of Deep Learning With Statistical Anatomical Knowledge for Robust Patella Segmentation From CT Images. IEEE J Biomed Health Inform 2024;28:2842-53. [Crossref] [PubMed]
  32. Marsilio L, Marzorati D, Rossi M, Moglia A, Mainardi L, Manzotti A, Cerveri P. Cascade learning in multi-task encoder-decoder networks for concurrent bone segmentation and glenohumeral joint clinical assessment in shoulder CT scans. Artif Intell Med 2025;165:103131. [Crossref] [PubMed]
  33. Masuda M, Soufi M, Otake Y, Uemura K, Kono S, Takashima K, Hamada H, Gu Y, Takao M, Okada S, Sugano N, Sato Y. Automatic hip osteoarthritis grading with uncertainty estimation from computed tomography using digitally-reconstructed radiographs. Int J Comput Assist Radiol Surg 2024;19:903-15. [Crossref] [PubMed]
  34. Kim KC, Cho HC, Jang TJ, Choi JM, Seo JK. Automatic detection and segmentation of lumbar vertebrae from X-ray images for compression fracture evaluation. Comput Methods Programs Biomed 2021;200:105833. [Crossref] [PubMed]
  35. Zhang B, Yu K, Ning Z, Wang K, Dong Y, Liu X, et al. Deep learning of lumbar spine X-ray for osteopenia and osteoporosis screening: A multicenter retrospective cohort study. Bone 2020;140:115561. [Crossref] [PubMed]
  36. Deniz CM, Xiang S, Hallyburton RS, Welbeck A, Babb JS, Honig S, Cho K, Chang G. Segmentation of the Proximal Femur from MR Images using Deep Convolutional Neural Networks. Sci Rep 2018;8:16485. [Crossref] [PubMed]
  37. Yabu A, Hoshino M, Tabuchi H, Takahashi S, Masumoto H, Akada M, et al. Using artificial intelligence to diagnose fresh osteoporotic vertebral fractures on magnetic resonance images. Spine J 2021;21:1652-8. [Crossref] [PubMed]
  38. Ferizi U, Besser H, Hysi P, Jacobs J, Rajapakse CS, Chen C, Saha PK, Honig S, Chang G. Artificial Intelligence Applied to Osteoporosis: A Performance Comparison of Machine Learning Algorithms in Predicting Fragility Fractures From MRI Data. J Magn Reson Imaging 2019;49:1029-38. [Crossref] [PubMed]
  39. Fang Y, Li W, Chen X, Chen K, Kang H, Yu P, Zhang R, Liao J, Hong G, Li S. Opportunistic osteoporosis screening in multi-detector CT images using deep convolutional neural networks. Eur Radiol 2021;31:1831-42. [Crossref] [PubMed]
  40. Muehlematter UJ, Mannil M, Becker AS, Vokinger KN, Finkenstaedt T, Osterhoff G, Fischer MA, Guggenberger R. Vertebral body insufficiency fractures: detection of vertebrae at risk on standard CT images using texture analysis and machine learning. Eur Radiol 2019;29:2207-17. [Crossref] [PubMed]
  41. Zhou SK, Greenspan H, Davatzikos C, Duncan JS, van Ginneken B, Madabhushi A, Prince JL, Rueckert D, Summers RM. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proc IEEE Inst Electr Electron Eng 2021;109:820-38.
  42. Faber BG, Ebsim R, Saunders FR, Frysz M, Lindner C, Gregory JS, Aspden RM, Harvey NC, Davey Smith G, Cootes T, Tobias JH. A novel semi-automated classifier of hip osteoarthritis on DXA images shows expected relationships with clinical outcomes in UK Biobank. Rheumatology (Oxford) 2022;61:3586-95. [Crossref] [PubMed]
  43. Bousson V, Lowitz T, Laouisset L, Engelke K, Laredo JD. CT imaging for the investigation of subchondral bone in knee osteoarthritis. Osteoporos Int 2012;23:S861-5. [Crossref] [PubMed]
  44. Yokota S, Ishizu H, Miyazaki T, Takahashi D, Iwasaki N, Shimizu T. Osteoporosis, Osteoarthritis, and Subchondral Insufficiency Fracture: Recent Insights. Biomedicines 2024;12:843. [Crossref] [PubMed]
  45. Ota S, Sasaki E, Sasaki S, Chiba D, Kimura Y, Yamamoto Y, Kumagai M, Ando M, Tsuda E, Ishibashi Y. Relationship between abnormalities detected by magnetic resonance imaging and knee symptoms in early knee osteoarthritis. Sci Rep 2021;11:15179. [Crossref] [PubMed]
  46. Lento PH, Primack S. Advances and utility of diagnostic ultrasound in musculoskeletal medicine. Curr Rev Musculoskelet Med 2008;1:24-31. [Crossref] [PubMed]
  47. Shin Y, Yang J, Lee YH, Kim S. Artificial intelligence in musculoskeletal ultrasound imaging. Ultrasonography 2021;40:30-44. [Crossref] [PubMed]
  48. Fatima M, Pasha M. Survey of machine learning algorithms for disease diagnostic. Journal of Intelligent Learning Systems and Applications 2017;9:1-16.
  49. Nichols JA, Herbert Chan HW, Baker MAB. Machine learning: applications of artificial intelligence to imaging and diagnosis. Biophys Rev 2019;11:111-8. [Crossref] [PubMed]
  50. Hesamian MH, Jia W, He X, Kennedy P. Deep Learning Techniques for Medical Image Segmentation: Achievements and Challenges. J Digit Imaging 2019;32:582-96. [Crossref] [PubMed]
  51. Thomson J, O’Neill T, Felson D, Cootes T. Automated Shape and Texture Analysis for Detection of Osteoarthritis from Radiographs of the Knee. In: Navab N, Hornegger J, Wells WM, Frangi A, editors. Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015. Cham: Springer; 2015. p. 127-34. (Lecture Notes in Computer Science; vol. 9350).
  52. Gaj S, Yang M, Nakamura K, Li X. Automated cartilage and meniscus segmentation of knee MRI with conditional generative adversarial networks. Magn Reson Med 2020;84:437-49. [Crossref] [PubMed]
  53. Tiulpin A, Thevenot J, Rahtu E, Lehenkari P, Saarakkala S. Automatic Knee Osteoarthritis Diagnosis from Plain Radiographs: A Deep Learning-Based Approach. Sci Rep 2018;8:1727. [Crossref] [PubMed]
  54. Tajbakhsh N, Shin JY, Gurudu SR, Hurst RT, Kendall CB, Gotway MB, Liang J. Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning? IEEE Trans Med Imaging 2016;35:1299-312. [Crossref] [PubMed]
  55. Eckstein F, Wirth W, Nevitt MC. Recent advances in osteoarthritis imaging--the osteoarthritis initiative. Nat Rev Rheumatol 2012;8:622-30. [Crossref] [PubMed]
  56. Segal NA, Nevitt MC, Gross KD, Hietpas J, Glass NA, Lewis CE, Torner JC. The Multicenter Osteoarthritis Study: opportunities for rehabilitation research. PM R 2013;5:647-54. Erratum in: PM R 2013;5:987. [Crossref] [PubMed]
  57. Jansen MP, Welsing PMJ, Vincken KL, Mastbergen SC. Performance of knee image digital analysis of radiographs of patients with end-stage knee osteoarthritis. Osteoarthritis Cartilage 2021;29:1530-9. [Crossref] [PubMed]
  58. Gornale SS, Patravali PU, Manza RR. Detection of osteoarthritis using knee x-ray image analyses: a machine vision based approach. Int J Comput Appl 2016;145:20-6.
  59. Khamparia A, Pandey B, Al‐Turjman F, Podder P. An intelligent IOMT enabled feature extraction method for early detection of knee arthritis. Expert Syst 2023;40:e12784.
  60. Gornale SS, Patravali PU, Hiremath PS. Identification of region of interest for assessment of knee osteoarthritis in radiographic images. Int J Med Eng Inform 2021;13:64.
  61. Chan S, Dittakan K, Garcia-Constantino M. Image texture analysis for medical image mining: a comparative study direct to osteoarthritis classification using knee X-ray image. Int J Adv Sci Eng Inf Technol 2020;10:2189-99.
  62. Raza A, Phan TL, Li HC, Hieu NV, Nghia TT, Ching CTS. A comparative study of machine learning classifiers for enhancing knee osteoarthritis diagnosis. Information 2024;15:183.
  63. Subramoniam M, Rajini V. A non-invasive computer aided diagnosis of osteoarthritis from digital X-ray images. Biomed Res 2015;26:721-9.
  64. Fatema K, Rony MAH, Azam S, Mukta MSH, Karim A, Hasan MZ, Jonkman M. Development of an automated optimal distance feature-based decision system for diagnosing knee osteoarthritis using segmented X-ray images. Heliyon 2023;9:e21703. [Crossref] [PubMed]
  65. Nagarajan MB, Coan P, Huber MB, Diemoz PC, Wismüller A. Integrating dimension reduction and out-of-sample extension in automated classification of ex vivo human patellar cartilage on phase contrast X-ray computed tomography. PLoS One 2015;10:e0117157. [Crossref] [PubMed]
  66. Pan J, Wu Y, Tang Z, Sun K, Li M, Sun J, Liu J, Tian J, Shen B. Automatic knee osteoarthritis severity grading based on X-ray images using a hierarchical classification method. Arthritis Res Ther 2024;26:203. [Crossref] [PubMed]
  67. Prasetyo SY, Nabiilah GZ. Integrating VGG Re-trained Feature Extraction with Machine Learning for Knee Osteoarthritis Severity Levels Detection Using X-ray Images. ITEGAM-JETIA. 2025;11:36-42.
  68. Islam MS, Rony MAT. CDK: A novel high-performance transfer feature technique for early detection of osteoarthritis. J Pathol Inform 2024;15:100382. [Crossref] [PubMed]
  69. Bose ASC, Srinivasan C, Joy SI. Optimized feature selection for enhanced accuracy in knee osteoarthritis detection and severity classification with machine learning. Biomed Signal Process Control 2024;97:106670.
  70. Sakellariou G, Conaghan PG, Zhang W, Bijlsma JWJ, Boyesen P, D'Agostino MA, Doherty M, Fodor D, Kloppenburg M, Miese F, Naredo E, Porcheret M, Iagnocco A. EULAR recommendations for the use of imaging in the clinical management of peripheral joint osteoarthritis. Ann Rheum Dis 2017;76:1484-94. [Crossref] [PubMed]
  71. Nagarajan MB, Coan P, Huber MB, Diemoz PC, Glaser C, Wismüller A. Characterizing healthy and osteoarthritic knee cartilage on phase contrast CT with geometric texture features. Proc SPIE Int Soc Opt Eng 2013;8672:86721J. [Crossref] [PubMed]
  72. Nagarajan MB, Coan P, Huber MB, Diemoz PC, Wismüller A. Volumetric quantitative characterization of human patellar cartilage with topological and geometrical features on phase-contrast X-ray computed tomography. Med Biol Eng Comput 2015;53:1211-20. [Crossref] [PubMed]
  73. Lukas VA, Fishbein KW, Lin PC, Schär M, Schneider E, Neu CP, Spencer RG, Reiter DA. Classification of histologically scored human knee osteochondral plugs by quantitative analysis of magnetic resonance images at 3T. J Orthop Res 2015;33:640-50. [Crossref] [PubMed]
  74. Urish KL, Keffalas MG, Durkin JR, Miller DJ, Chu CR, Mosher TJ. T2 texture index of cartilage can predict early symptomatic OA progression: data from the osteoarthritis initiative. Osteoarthritis Cartilage 2013;21:1550-7. [Crossref] [PubMed]
  75. Ashinsky BG, Coletta CE, Bouhrara M, Lukas VA, Boyle JM, Reiter DA, Neu CP, Goldberg IG, Spencer RG. Machine learning classification of OARSI-scored human articular cartilage using magnetic resonance imaging. Osteoarthritis Cartilage 2015;23:1704-12. [Crossref] [PubMed]
  76. Peuna A, Thevenot J, Saarakkala S, Nieminen MT, Lammentausta E. Machine learning classification on texture analyzed T2 maps of osteoarthritic cartilage: oulu knee osteoarthritis study. Osteoarthritis Cartilage 2021;29:859-69. [Crossref] [PubMed]
  77. Xue Z, Wang L, Sun Q, Xu J, Liu Y, Ai S, Zhang L, Liu C. Radiomics analysis using MR imaging of subchondral bone for identification of knee osteoarthritis. J Orthop Surg Res 2022;17:414. [Crossref] [PubMed]
  78. Cui T, Liu R, Jing Y, Fu J, Chen J. Development of machine learning models aiming at knee osteoarthritis diagnosing: an MRI radiomics analysis. J Orthop Surg Res 2023;18:375. [Crossref] [PubMed]
  79. Nagawa K, Hara Y, Kakemoto S, Shiratori T, Kaizu A, Koyama M, Tsuchihashi S, Shimizu H, Inoue K, Sugita N, Kozawa E. Using magnetic resonance imaging-based subregional texture analysis models to classify knee osteoarthritis severity by compartment. Sci Rep 2025;15:36173. [Crossref] [PubMed]
  80. Li X, Chen W, Liu D, Chen P, Li P, Li F, Yuan W, Wang S, Chen C, Chen Q, Li F, Guo S, Hu Z. Radiomics analysis using magnetic resonance imaging of bone marrow edema for diagnosing knee osteoarthritis. Front Bioeng Biotechnol 2024;12:1368188. [Crossref] [PubMed]
  81. Lyu L, Ren J, Lu W, Zhong J, Song Y, Li Y, Yao W. A machine learning-based radiomics approach for differentiating patellofemoral osteoarthritis from non-patellofemoral osteoarthritis using Q-Dixon MRI. Front Sports Act Living 2025;7:1535519. [Crossref] [PubMed]
  82. Deokar DD, Patil CG. Effective feature extraction based automatic knee osteoarthritis detection and classification using neural network. Int J Eng Tech 2015;1:134-9.
  83. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542:115-8. [Crossref] [PubMed]
  84. Cheng JZ, Ni D, Chou YH, Qin J, Tiu CM, Chang YC, Huang CS, Shen D, Chen CM. Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans. Sci Rep 2016;6:24454. [Crossref] [PubMed]
  85. He W, Liu T, Han Y, Ming W, Du J, Liu Y, Yang Y, Wang L, Jiang Z, Wang Y, Yuan J, Cao C. A review: The detection of cancer cells in histopathology based on machine vision. Comput Biol Med 2022;146:105636. [Crossref] [PubMed]
  86. Antony J, McGuinness K, O’Connor NE, Moran K. Quantifying radiographic knee osteoarthritis severity using deep convolutional neural networks. In: 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE; 2016. p. 1195-200.
  87. Chen P, Gao L, Shi X, Allen K, Yang L. Fully automatic knee osteoarthritis severity grading using deep neural networks with a novel ordinal loss. Comput Med Imaging Graph 2019;75:84-92. [Crossref] [PubMed]
  88. Tiulpin A, Saarakkala S. Automatic Grading of Individual Knee Osteoarthritis Features in Plain Radiographs Using Deep Convolutional Neural Networks. Diagnostics (Basel) 2020;10:932. [Crossref] [PubMed]
  89. Liu B, Luo J, Huang H. Toward automatic quantification of knee osteoarthritis severity using improved Faster R-CNN. Int J Comput Assist Radiol Surg 2020;15:457-66. [Crossref] [PubMed]
  90. Norman B, Pedoia V, Noworolski A, Link TM, Majumdar S. Applying Densely Connected Convolutional Neural Networks for Staging Osteoarthritis Severity from Plain Radiographs. J Digit Imaging 2019;32:471-7. [Crossref] [PubMed]
  91. Sohail M, Azad MM, Kim HS. Knee osteoarthritis severity detection using deep inception transfer learning. Comput Biol Med 2025;186:109641. [Crossref] [PubMed]
  92. Abdullah SS, Rajasekaran MP, Hossen MJ, Wong WK, Ng PK. Deep learning based classification of tibio-femoral knee osteoarthritis from lateral view knee joint X-ray images. Sci Rep 2025;15:21305. [Crossref] [PubMed]
  93. Gebre RK, Hirvasniemi J, van der Heijden RA, Lantto I, Saarakkala S, Leppilahti J, Jämsä T. Detecting hip osteoarthritis on clinical CT: a deep learning application based on 2-D summation images derived from CT. Osteoporos Int 2022;33:355-65. [Crossref] [PubMed]
  94. Üreten K, Arslan T, Gültekin KE, Demir AND, Özer HF, Bilgili Y. Detection of hip osteoarthritis by using plain pelvic radiographs with deep learning methods. Skeletal Radiol 2020;49:1369-74. [Crossref] [PubMed]
  95. von Schacky CE, Sohn JH, Liu F, Ozhinsky E, Jungmann PM, Nardo L, Posadzy M, Foreman SC, Nevitt MC, Link TM, Pedoia V. Development and Validation of a Multitask Deep Learning Model for Severity Grading of Hip Osteoarthritis Features on Radiographs. Radiology 2020;295:136-45. [Crossref] [PubMed]
  96. Xue Y, Zhang R, Deng Y, Chen K, Jiang T. A preliminary examination of the diagnostic value of deep learning in hip osteoarthritis. PLoS One 2017;12:e0178992. [Crossref] [PubMed]
  97. Muttaqin F, Rahardjo P, Chilmi MZ, Bayuseno AP, Winarni TI, Isnanto RR. A Combination Method of ROI, CLAHE, and DenseNet-169 for Hip Osteoarthritis Detection. Eng Technol Appl Sci Res 2025;15:22690-7.
  98. Ren X, Hou L, Liu S, Wu P, Liang S, Fu H, Li C, Li T, Cheng Y. OA-MEN: a fusion deep learning approach for enhanced accuracy in knee osteoarthritis detection and classification using X-ray imaging. Front Bioeng Biotechnol 2024;12:1437188. [Crossref] [PubMed]
  99. Turmezei TD, Fotiadou A, Lomas DJ, Hopper MA, Poole KE. A new CT grading system for hip osteoarthritis. Osteoarthritis Cartilage 2014;22:1360-6. [Crossref] [PubMed]
  100. Gielis WP, Weinans H, Nap FJ, Roemer FW, Foppen W. Scoring Osteoarthritis Reliably in Large Joints and the Spine Using Whole-Body CT: OsteoArthritis Computed Tomography-Score (OACT-Score). J Pers Med 2020;11:5. [Crossref] [PubMed]
  101. Abidin AZ, Deng B, DSouza AM, Nagarajan MB, Coan P, Wismüller A. Deep transfer learning for characterizing chondrocyte patterns in phase contrast X-ray computed tomography images of the human patellar cartilage. Comput Biol Med 2018;95:24-33. [Crossref] [PubMed]
  102. Stroebel J, Horng A, Armbruster M, Mittone A, Reiser M, Bravin A, Coan P. Convolutional neuronal networks combined with X-ray phase-contrast imaging for a fast and observer-independent discrimination of cartilage and liver diseases stages. Sci Rep 2020;10:20007. [Crossref] [PubMed]
  103. Dorraki M, Muratovic D, Fouladzadeh A, Verjans JW, Allison A, Findlay DM, Abbott D. Hip osteoarthritis: A novel network analysis of subchondral trabecular bone structures. PNAS Nexus 2022;1:pgac258. [Crossref] [PubMed]
  104. Mourad L, Aboelsaad N, Talaat WM, Fahmy NMH, Abdelrahman HH, El-Mahallawy Y. Automatic detection of temporomandibular joint osteoarthritis radiographic features using deep learning artificial intelligence: a diagnostic accuracy study. J Stomatol Oral Maxillofac Surg 2025;126:102124. [Crossref] [PubMed]
  105. Talaat WM, Shetty S, Al Bayatti S, Talaat S, Mourad L, Shetty S, Kaboudan A. An artificial intelligence model for the radiographic diagnosis of osteoarthritis of the temporomandibular joint. Sci Rep 2023;13:15972. [Crossref] [PubMed]
  106. van Tulder G, de Bruijne M. Learning Cross-Modality Representations From Multi-Modal Images. IEEE Trans Med Imaging 2019;38:638-48. [Crossref] [PubMed]
  107. Pedoia V, Norman B, Mehany SN, Bucknor MD, Link TM, Majumdar S. 3D convolutional neural networks for detection and severity staging of meniscus and PFJ cartilage morphological degenerative changes in osteoarthritis and anterior cruciate ligament subjects. J Magn Reson Imaging 2019;49:400-10. [Crossref] [PubMed]
  108. Roy C, Roshan M, Goyal N, Rana P, Ghonge NP, Jena A, Vaishya R, Ghosh S. MRI detection and grading of knee osteoarthritis - a pilot study using an AI technique with a novel imaging-based scoring system. Biomater Sci 2025;13:5475-94. [Crossref] [PubMed]
  109. Wang X, Liu S, Zhou CC. Detection Algorithm of Knee Osteoarthritis Based on Magnetic Resonance Images. Intell Autom Soft Comput 2023;37:221-34.
  110. Dong Y, Che X, Fu Y, Liu H, Zhang Y, Tu Y. Classification of knee osteoarthritis based on quantum-to-classical transfer learning. Front Phys 2023;11:1212373.
  111. Simran S, Mehta S, Sharma R, Kukreja V, Dogra A. Synergistic Integration of 3D CNN and Vision Transformers for Enhanced Bio-Medical for Knee Cartilage Pathology Detection. Biomed Pharmacol J 2025;18:1647-67.
  112. Tibrewala R, Ozhinsky E, Shah R, Flament I, Crossley K, Srinivasan R, Souza R, Link TM, Pedoia V, Majumdar S. Computer-Aided Detection AI Reduces Interreader Variability in Grading Hip Abnormalities With MRI. J Magn Reson Imaging 2020;52:1163-72. [Crossref] [PubMed]
  113. Panwar P, Chaurasia S, Gangrade J, Bilandi A. Early diagnosis of knee osteoarthritis severity using vision transformer. BMC Musculoskelet Disord 2025;26:884. [Crossref] [PubMed]
  114. Prezja F, Paloneva J, Pölönen I, Niinimäki E, Äyrämö S. DeepFake knee osteoarthritis X-rays from generative adversarial neural networks deceive medical experts and offer augmentation potential to automatic classification. Sci Rep 2022;12:18573. [Crossref] [PubMed]
  115. Prezja F, Annala L, Kiiskinen S, Ojala T. Exploring the efficacy of base data augmentation methods in deep learning-based radiograph classification of knee joint osteoarthritis. Algorithms 2023;17:8.
  116. Teh XY, Yeoh PSQ, Wang T, Wu X, Hasikin K, Lai KW. Knee Osteoarthritis Diagnosis With Unimodal and Multi-modal Neural Networks: Data from the Osteoarthritis Initiative. IEEE Access 2024;12:146698-717.
  117. Burian E, Subburaj K, Mookiah MRK, Rohrmeier A, Hedderich DM, Dieckmeyer M, Diefenbach MN, Ruschke S, Rummeny EJ, Zimmer C, Kirschke JS, Karampinos DC, Baum T. Texture analysis of vertebral bone marrow using chemical shift encoding-based water-fat MRI: a feasibility study. Osteoporos Int 2019;30:1265-74. [Crossref] [PubMed]
  118. Zhang T, Liu P, Zhang Y, Wang W, Lu Y, Xi M, Duan S, Guan F. Combining information from multiple bone turnover markers as diagnostic indices for osteoporosis using support vector machines. Biomarkers 2019;24:120-6. [Crossref] [PubMed]
  119. Wang J, Yan D, Zhao A, Hou X, Zheng X, Chen P, Bao Y, Jia W, Hu C, Zhang ZL, Jia W. Discovery of potential biomarkers for osteoporosis using LC-MS/MS metabolomic methods. Osteoporos Int 2019;30:1491-9. [Crossref] [PubMed]
  120. Sapthagirivasan V, Anburajan M. Diagnosis of osteoporosis by extraction of trabecular features from hip radiographs using support vector machine: an investigation panorama with DXA. Comput Biol Med 2013;43:1910-9. [Crossref] [PubMed]
  121. Singh A, Dutta MK, Jennane R, Lespessailles E. Classification of the trabecular bone structure of osteoporotic patients using machine vision. Comput Biol Med 2017;91:148-58. [Crossref] [PubMed]
  122. Widyaningrum R, Sela EI, Pulungan R, Septiarini A. Automatic Segmentation of Periapical Radiograph Using Color Histogram and Machine Learning for Osteoporosis Detection. Int J Dent 2023;2023:6662911. [Crossref] [PubMed]
  123. Huang CB, Hu JS, Tan K, Zhang W, Xu TH, Yang L. Application of machine learning model to predict osteoporosis based on abdominal computed tomography images of the psoas muscle: a retrospective study. BMC Geriatr 2022;22:796. [Crossref] [PubMed]
  124. Najafi M, Yousefi Rezaii T, Danishvar S, Razavi SN. Qualitative Classification of Proximal Femoral Bone Using Geometric Features and Texture Analysis in Collected MRI Images for Bone Density Evaluation. Sensors (Basel) 2023;23:7612. [Crossref] [PubMed]
  125. An FP, Liu JE. Medical image segmentation algorithm based on multilayer boundary perception-self attention deep learning model. Multimed Tools Appl 2021;80:15017-39.
  126. Shi Q, Yin S, Wang K, Teng L, Li H. Multichannel convolutional neural network-based fuzzy active contour model for medical image segmentation. Evol Syst 2022;13:535-49.
  127. Du J, Wang J, Gai X, Sui Y, Liu K, Yang D. Application of intelligent X-ray image analysis in risk assessment of osteoporotic fracture of femoral neck in the elderly. Math Biosci Eng 2023;20:879-93. [Crossref] [PubMed]
  128. Liu L, Si M, Ma H, Cong M, Xu Q, Sun Q, Wu W, Wang C, Fagan MJ, Mur LAJ, Yang Q, Ji B. A hierarchical opportunistic screening model for osteoporosis using machine learning applied to clinical data and CT images. BMC Bioinformatics 2022;23:63. [Crossref] [PubMed]
  129. Lim HK, Ha HI, Park SY, Lee K. Comparison of the diagnostic performance of CT Hounsfield unit histogram analysis and dual-energy X-ray absorptiometry in predicting osteoporosis of the femur. Eur Radiol 2019;29:1831-40. [Crossref] [PubMed]
  130. Alacreu E, Moratal D, Arana E. Opportunistic screening for osteoporosis by routine CT in Southern Europe. Osteoporos Int 2017;28:983-90. [Crossref] [PubMed]
  131. Buckens CF, Dijkhuis G, de Keizer B, Verhaar HJ, de Jong PA. Opportunistic screening for osteoporosis on routine computed tomography? An external validation study. Eur Radiol 2015;25:2074-9. [Crossref] [PubMed]
  132. Park MS, Ha HI, Lim HK, Han J, Pak S. Femoral osteoporosis prediction model using autosegmentation and machine learning analysis with PyRadiomics on abdomen-pelvic computed tomography (CT). Quant Imaging Med Surg 2024;14:3959-69. [Crossref] [PubMed]
  133. Choplin RH, Lenchik L, Wuertzer S. A Practical Approach to Interpretation of Dual-Energy X-ray Absorptiometry (DXA) for Assessment of Bone Density. Curr Radiol Rep 2014;2:48.
  134. Valentinitsch A, Trebeschi S, Kaesmacher J, Lorenz C, Löffler MT, Zimmer C, Baum T, Kirschke JS. Opportunistic osteoporosis screening in multi-detector CT images via local classification of textures. Osteoporos Int 2019;30:1275-85. [Crossref] [PubMed]
  135. He L, Liu Z, Liu C, Gao Z, Ren Q, Lei L, Ren J. Radiomics Based on Lumbar Spine Magnetic Resonance Imaging to Detect Osteoporosis. Acad Radiol 2021;28:e165-71. [Crossref] [PubMed]
  136. Poullain F, Champsaur P, Pauly V, Knoepflin P, Le Corroller T, Creze M, Pithioux M, Bendahan D, Guenoun D. Vertebral trabecular bone texture analysis in opportunistic MRI and CT scan can distinguish patients with and without osteoporotic vertebral fracture: A preliminary study. Eur J Radiol 2023;158:110642. [Crossref] [PubMed]
  137. Galbusera F, Cina A, O'Riordan D, Vitale JA, Loibl M, Fekete TF, Kleinstück F, Haschtmann D, Mannion AF. Estimating lumbar bone mineral density from conventional MRI and radiographs with deep learning in spine patients. Eur Spine J 2024;33:4092-103. [Crossref] [PubMed]
  138. Lee S, Choe EK, Kang HY, Yoon JW, Kim HS. The exploration of feature extraction and machine learning for predicting bone density from simple spine X-ray images in a Korean population. Skeletal Radiol 2020;49:613-8. [Crossref] [PubMed]
  139. Urakawa T, Tanaka Y, Goto S, Matsuzawa H, Watanabe K, Endo N. Detecting intertrochanteric hip fractures with orthopedist-level accuracy using a deep convolutional neural network. Skeletal Radiol 2019;48:239-44. [Crossref] [PubMed]
  140. Yu JS, Yu SM, Erdal BS, Demirer M, Gupta V, Bigelow M, Salvador A, Rink T, Lenobel SS, Prevedello LM, White RD. Detection and localisation of hip fractures on anteroposterior radiographs with artificial intelligence: proof of concept. Clin Radiol 2020;75:237.e1-9. [Crossref] [PubMed]
  141. Cheng CT, Ho TY, Lee TY, Chang CC, Chou CC, Chen CC, Chung IF, Liao CH. Application of a deep learning algorithm for detection and visualization of hip fractures on plain pelvic radiographs. Eur Radiol 2019;29:5469-77. [Crossref] [PubMed]
  142. Jang R, Choi JH, Kim N, Chang JS, Yoon PW, Kim CH. Prediction of osteoporosis from simple hip radiography using deep learning algorithm. Sci Rep 2021;11:19997. [Crossref] [PubMed]
  143. Tecle N, Teitel J, Morris MR, Sani N, Mitten D, Hammert WC. Convolutional Neural Network for Second Metacarpal Radiographic Osteoporosis Screening. J Hand Surg Am 2020;45:175-81. [Crossref] [PubMed]
  144. Wani IM, Arora S. Computer-aided diagnosis systems for osteoporosis detection: a comprehensive survey. Med Biol Eng Comput 2020;58:1873-917. [Crossref] [PubMed]
  145. Wani IM, Arora S. Osteoporosis diagnosis in knee X-rays by transfer learning based on convolution neural network. Multimed Tools Appl 2023;82:14193-217. [Crossref] [PubMed]
  146. Nissinen T, Suoranta S, Saavalainen T, Sund R, Hurskainen O, Rikkonen T, Kröger H, Lähivaara T, Väänänen SP. Detecting pathological features and predicting fracture risk from dual-energy X-ray absorptiometry images using deep learning. Bone Rep 2021;14:101070. [Crossref] [PubMed]
  147. Ho CS, Chen YP, Fan TY, Kuo CF, Yen TY, Liu YC, Pei YC. Application of deep learning neural network in predicting bone mineral density from plain X-ray radiography. Arch Osteoporos 2021;16:153. [Crossref] [PubMed]
  148. Yasaka K, Akai H, Kunimatsu A, Kiryu S, Abe O. Prediction of bone mineral density from computed tomography: application of deep learning with a convolutional neural network. Eur Radiol 2020;30:3549-57. [Crossref] [PubMed]
  149. Derkatch S, Kirby C, Kimelman D, Jozani MJ, Davidson JM, Leslie WD. Identification of Vertebral Fractures by Convolutional Neural Networks to Predict Nonvertebral and Hip Fractures: A Registry-based Cohort Study of Dual X-ray Absorptiometry. Radiology 2019;293:405-11. [Crossref] [PubMed]
  150. Tomita N, Cheung YY, Hassanpour S. Deep neural networks for automatic detection of osteoporotic vertebral fractures on CT scans. Comput Biol Med 2018;98:8-15. [Crossref] [PubMed]
  151. Mousavinasab SM, Hedyehzadeh M, Mousavinasab ST. Deep Learning for Osteoporosis Diagnosis Using Magnetic Resonance Images of Lumbar Vertebrae. J Imaging Inform Med 2025; Epub ahead of print. [Crossref]
  152. Liu L. Implemented classification techniques for osteoporosis using deep learning from the perspective of healthcare analytics. Technol Health Care 2024;32:1947-65. [Crossref] [PubMed]
  153. Klontzas ME, Stathis I, Spanakis K, Zibis AH, Marias K, Karantanas AH. Deep Learning for the Differential Diagnosis between Transient Osteoporosis and Avascular Necrosis of the Hip. Diagnostics (Basel) 2022;12:1870. [Crossref] [PubMed]
  154. Wu PH, Gibbons M, Foreman SC, Carballido-Gamio J, Han M, Krug R, Liu J, Link TM, Kazakia GJ. Cortical bone vessel identification and quantification on contrast-enhanced MR images. Quant Imaging Med Surg 2019;9:928-41. [Crossref] [PubMed]
  155. Dhanagopal R, Menaka R, Suresh Kumar R, Vasanth Raj PT, Debrah EL, Pradeep K. Channel-Boosted and Transfer Learning Convolutional Neural Network-Based Osteoporosis Detection from CT Scan, Dual X-Ray, and X-Ray Images. J Healthc Eng 2024;2024:3733705. [Crossref] [PubMed]
  156. Zhu D, Zhang Z, Li W. Accuracy and Reliability of Multimodal Imaging in Diagnosing Knee Sports Injuries. Curr Med Imaging 2025;21:e15734056360665. [Crossref] [PubMed]
  157. Guida C, Zhang M, Shan J. Improving knee osteoarthritis classification using multimodal intermediate fusion of X-ray, MRI, and clinical information. Neural Comput Appl 2023;35:9763-72.
  158. Zhao Z, Bai H, Zhang J, Zhang Y, Xu S, Lin Z, Timofte R, Van Gool L. CDDFuse: Correlation-driven dual-branch feature decomposition for multi-modality image fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2023. p. 5906-16.
  159. Acosta JN, Falcone GJ, Rajpurkar P, Topol EJ. Multimodal biomedical AI. Nat Med 2022;28:1773-84. [Crossref] [PubMed]
  160. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In: NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems; 2017. p. 6000-10.
  161. Zhou HY, Yu Y, Wang C, Zhang S, Gao Y, Pan J, Shao J, Lu G, Zhang K, Li W. A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics. Nat Biomed Eng 2023;7:743-55. [Crossref] [PubMed]
  162. Panfilov E, Saarakkala S, Nieminen MT, Tiulpin A. End-to-End Prediction of Knee Osteoarthritis Progression With Multimodal Transformers. IEEE J Biomed Health Inform 2025;29:6276-86. [Crossref] [PubMed]
  163. Gao G, Zhang Y, Shi L, Wang L, Wang F, Xue Q. Radiographic prediction model based on X-rays predicting anterior cruciate ligament function in patients with knee osteoarthritis. Vis Comput Ind Biomed Art 2025;8:14. [Crossref] [PubMed]
  164. Du Y, Almajalid R, Shan J, Zhang M. A Novel Method to Predict Knee Osteoarthritis Progression on MRI Using Machine Learning Methods. IEEE Trans Nanobioscience 2018;17:228-36. [Crossref] [PubMed]
  165. Tiulpin A, Klein S, Bierma-Zeinstra SMA, Thevenot J, Rahtu E, Meurs JV, Oei EHG, Saarakkala S. Multimodal Machine Learning-based Knee Osteoarthritis Progression Prediction from Plain Radiographs and Clinical Data. Sci Rep 2019;9:20038. [Crossref] [PubMed]
  166. Guo J, Yan P, Luo H, Ma Y, Jiang Y, Ju C, Chen W, Liu M, Lv S, Qin Y. Predicting joint space changes in knee osteoarthritis over 6 years: a combined model of TransUNet and XGBoost. Quant Imaging Med Surg 2025;15:1396-410. [Crossref] [PubMed]
  167. Leung K, Zhang B, Tan J, Shen Y, Geras KJ, Babb JS, Cho K, Chang G, Deniz CM. Prediction of Total Knee Replacement and Diagnosis of Osteoarthritis by Using Deep Learning on Knee Radiographs: Data from the Osteoarthritis Initiative. Radiology 2020;296:584-93. [Crossref] [PubMed]
  168. Kanis JA, Hans D, Cooper C, Baim S, Bilezikian JP, Binkley N, et al. Interpretation and use of FRAX in clinical practice. Osteoporos Int 2011;22:2395-411. [Crossref] [PubMed]
  169. Wang P, Yin Q, Ding K, Zhong H, Jia Q, Xiao Z, Xiong H. Comparing machine learning models for osteoporosis prediction in Tibetan middle aged and elderly women. Sci Rep 2025;15:10960. [Crossref] [PubMed]
  170. Zhang M, Gong H, Zhang K, Zhang M. Prediction of lumbar vertebral strength of elderly men based on quantitative computed tomography images using machine learning. Osteoporos Int 2019;30:2271-82. [Crossref] [PubMed]
  171. Shim JG, Kim DW, Ryu KH, Cho EA, Ahn JH, Kim JI, Lee SH. Application of machine learning approaches for osteoporosis risk prediction in postmenopausal women. Arch Osteoporos 2020;15:169. [Crossref] [PubMed]
  172. Inui A, Nishimoto H, Mifune Y, Yoshikawa T, Shinohara I, Furukawa T, Kato T, Tanaka S, Kusunose M, Kuroda R. Screening for Osteoporosis from Blood Test Data in Elderly Women Using a Machine Learning Approach. Bioengineering (Basel) 2023;10:277. [Crossref] [PubMed]
Cite this article as: Ming W, Liu T, Hu R, He W, Yang Y. Progress in machine learning-assisted medical imaging for osteoarthritis and osteoporosis diagnosis: a narrative review. Quant Imaging Med Surg 2026;16(4):320. doi: 10.21037/qims-2025-aw-2168