Trauma

Deep learning for automated hip fracture detection and classification: achieving superior accuracy




Abstract

Aims

The aim of this study was to develop and evaluate a deep learning-based model for classification of hip fractures to enhance diagnostic accuracy.

Methods

This retrospective study used 5,168 anteroposterior hip radiographs: 4,493 from two institutes (internal dataset) for training, and 675 from another institute for external validation. A convolutional neural network (CNN)-based classification model was trained on four types of hip fracture (Displaced, Valgus-impacted, Stable, and Unstable), using DAMO-YOLO for data processing and augmentation. The model’s accuracy, sensitivity, specificity, Intersection over Union (IoU), and Dice coefficient were evaluated. Orthopaedic surgeons’ diagnoses served as the reference standard, and their diagnostic performance was compared before and after artificial intelligence assistance.
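
For readers less familiar with the five metrics named above, the following is a minimal sketch of how they are conventionally defined. It is not the authors' evaluation code: the function names, the one-vs-rest confusion counts, and the use of flat binary masks for IoU and Dice are all assumptions made for illustration, since the abstract does not state whether overlap was measured on bounding boxes or on masks.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from one-vs-rest confusion counts.

    tp, fp, tn, fn are hypothetical counts for a single fracture class
    treated as 'positive' against all other classes.
    """
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true-positive rate
        "specificity": _ratio(tn, tn + fp),   # true-negative rate
    }


def overlap_metrics(pred, truth):
    """IoU and Dice for two binary masks given as equal-length 0/1 sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    # Two empty masks are treated as a perfect match (an assumed convention).
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / (pred_area + truth_area) if union else 1.0
    return iou, dice
```

For any pair of masks the two overlap scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.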

Results

The accuracy, sensitivity, specificity, IoU, and Dice coefficient of the model for the four fracture categories in the internal dataset were, respectively: Displaced (1.0, 0.79, 1.0, 0.70, 0.82), Valgus-impacted (1.0, 0.80, 1.0, 0.70, 0.82), Stable (0.99, 0.95, 0.99, 0.83, 0.89), and Unstable (1.0, 0.98, 0.99, 0.86, 0.92). For the external validation dataset, the sensitivity and specificity were, respectively: Displaced (0.83, 0.94), Valgus-impacted (0.89, 0.90), Stable (0.88, 0.95), and Unstable (0.85, 0.99). The overall means for the external dataset were 0.83 (SD 0.05) and 0.96 (SD 0.01) for the micro average (Micro AVG), and 0.69 (SD 0.02) and 0.95 (SD 0.02) for the macro average (Macro AVG).
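
The difference between a micro and a macro average can be sketched as follows, using sensitivity as the example metric. This is an illustration under assumed conventions, not the authors' computation: the function name and the per-class counts in the usage note are hypothetical and are not taken from the study.

```python
def micro_macro_sensitivity(counts):
    """Return (micro, macro) averaged sensitivity.

    counts maps a class name to a (true_positives, false_negatives) pair.
    Every class is assumed to have at least one positive case.
    """
    # Macro average: compute sensitivity per class, then take the plain mean,
    # so every class contributes equally regardless of its size.
    per_class = [tp / (tp + fn) for tp, fn in counts.values()]
    macro = sum(per_class) / len(per_class)

    # Micro average: pool the counts across classes first, then compute once,
    # so larger classes carry more weight.
    total_tp = sum(tp for tp, _ in counts.values())
    total_fn = sum(fn for _, fn in counts.values())
    micro = total_tp / (total_tp + total_fn)
    return micro, macro
```

With hypothetical counts such as `{"A": (90, 10), "B": (1, 9)}`, the pooled (micro) value is dominated by the larger class A, while the macro value weights the small class B equally. The two averages can therefore diverge when class sizes are unbalanced.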

Conclusion

Our study demonstrates that, compared with human diagnosis alone, the developed model significantly improves the accuracy of detecting and classifying hip fractures. The model shows great potential for assisting clinicians in the accurate diagnosis and classification of hip fractures.

Cite this article: Bone Joint J 2025;107-B(2):213–220.


Correspondence should be sent to Dr. Du Hyun Ro. E-mail:

J-W. Park and D. H. Ro are joint senior authors.

