The Bone & Joint Journal
Vol. 102-B, Issue 12 | Pages 1752 - 1759
1 Dec 2020
Tsuda Y Tsoi K Stevenson JD Laitinen M Ferguson PC Wunder JS Griffin AM van de Sande MAJ van Praag V Leithner A Fujiwara T Yasunaga H Matsui H Parry MC Jeys LM

Aims. Our aim was to develop and validate nomograms that would predict the cumulative incidence of sarcoma-specific death (CISSD) and disease progression (CIDP) in patients with localized high-grade primary central and dedifferentiated chondrosarcoma. Methods. The study population consisted of 391 patients from two international sarcoma centres (development cohort) who had undergone definitive surgery for a localized high-grade (histological grade II or III) conventional primary central chondrosarcoma or dedifferentiated chondrosarcoma. Disease progression captured the first event of either metastasis or local recurrence. An independent cohort of 221 patients from three additional hospitals was used for external validation. Two nomograms were internally and externally validated for discrimination (c-index) and calibration plot. Results. In the development cohort, the CISSD at ten years was 32.9% (95% confidence interval (CI) 19.8% to 38.4%). Age at diagnosis, grade, and surgical margin were found to have significant effects on CISSD and CIDP in multivariate analyses. Maximum tumour diameter was also significantly associated with CISSD. In the development cohort, the c-indices for CISSD and CIDP at five years were 0.743 (95% CI 0.700 to 0.819) and 0.761 (95% CI 0.713 to 0.800), respectively. When applied to the validation cohort, the c-indices for CISSD and CIDP at five years were 0.839 (95% CI 0.763 to 0.916) and 0.749 (95% CI 0.672 to 0.825), respectively. The calibration plots for these two nomograms demonstrated good fit. Conclusion. Our nomograms performed well on internal and external validation and can be used to predict CISSD and CIDP after resection of localized high-grade conventional primary central and dedifferentiated chondrosarcomas. They provide a new tool with which clinicians can assess and advise individual patients about their prognosis. Cite this article: Bone Joint J 2020;102-B(12):1752–1759


Bone & Joint Open
Vol. 2, Issue 10 | Pages 879 - 885
20 Oct 2021
Oliveira e Carmo L van den Merkhof A Olczak J Gordon M Jutte PC Jaarsma RL IJpma FFA Doornberg JN Prijs J

Aims. The number of convolutional neural networks (CNNs) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid?; which methods are applied for external validation (EV)?; and, what are reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs? Methods. The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS). Results. Out of 1,349 studies, 36 reported development of a CNN for fracture detection and/or classification. Of these, only four (11%) reported a form of EV. One study used temporal EV, one conducted both temporal and geographical EV, and two used geographical EV. When comparing the CNN’s performance on the IV set versus the EV set, the following were found: AUCs of 0.967 (IV) versus 0.975 (EV), 0.976 (IV) versus 0.985 to 0.992 (EV), 0.93 to 0.96 (IV) versus 0.80 to 0.89 (EV), and F1-scores of 0.856 to 0.863 (IV) versus 0.757 to 0.840 (EV). Conclusion. The number of externally validated CNNs in orthopaedic trauma for fracture recognition is still scarce. This greatly limits the potential for transfer of these CNNs from the developing institute to another hospital to achieve similar diagnostic performance.
We recommend the use of geographical EV and statements such as the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis–Machine Learning (TRIPOD-ML) to critically appraise performance of CNNs and improve methodological rigor, quality of future models, and facilitate eventual implementation in clinical practice. Cite this article: Bone Jt Open 2021;2(10):879–885


The Bone & Joint Journal
Vol. 103-B, Issue 3 | Pages 469 - 478
1 Mar 2021
Garland A Bülow E Lenguerrand E Blom A Wilkinson M Sayers A Rolfson O Hailer NP

Aims. To develop and externally validate a parsimonious statistical prediction model of 90-day mortality after elective total hip arthroplasty (THA), and to provide a web calculator for clinical usage. Methods. We included 53,099 patients with cemented THA due to osteoarthritis from the Swedish Hip Arthroplasty Registry for model derivation and internal validation, as well as 125,428 patients from England and Wales recorded in the National Joint Register for England, Wales, Northern Ireland, the Isle of Man, and the States of Guernsey (NJR) for external model validation. A model was developed using a bootstrap ranking procedure with a least absolute shrinkage and selection operator (LASSO) logistic regression model combined with piecewise linear regression. Discriminative ability was evaluated by the area under the receiver operating characteristic curve (AUC). Calibration belt plots were used to assess model calibration. Results. A main effects model combining age, sex, American Society of Anesthesiologists (ASA) class, the presence of cancer, diseases of the central nervous system, kidney disease, and diagnosed obesity had good discrimination, both internally (AUC = 0.78, 95% confidence interval (CI) 0.75 to 0.81) and externally (AUC = 0.75, 95% CI 0.73 to 0.76). This model was superior to traditional models based on the Charlson (AUC = 0.66, 95% CI 0.62 to 0.70) and Elixhauser (AUC = 0.64, 95% CI 0.59 to 0.68) comorbidity indices. The model was well calibrated for predicted probabilities up to 5%. Conclusion. We developed a parsimonious model that may facilitate individualized risk assessment prior to one of the most common surgical interventions. We have published a web calculator to aid clinical decision-making. Cite this article: Bone Joint J 2021;103-B(3):469–478


The Bone & Joint Journal
Vol. 106-B, Issue 11 | Pages 1348 - 1360
1 Nov 2024
Spek RWA Smith WJ Sverdlov M Broos S Zhao Y Liao Z Verjans JW Prijs J To M Åberg H Chiri W IJpma FFA Jadav B White J Bain GI Jutte PC van den Bekerom MPJ Jaarsma RL Doornberg JN

Aims

The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs.

Methods

The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans to serve as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced; two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%).


The Bone & Joint Journal
Vol. 106-B, Issue 7 | Pages 688 - 695
1 Jul 2024
Farrow L Zhong M Anderson L

Aims. To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports. Methods. Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and external validation. Results. For THA, there were 5,558 patient radiology reports included, of which 4,137 were used for model training and testing, and 1,421 for external validation. Following training, model performance demonstrated average (mean across three folds) accuracy, F1 score, and area under the receiver operating characteristic curve (AUROC) values of 0.850 (95% confidence interval (CI) 0.833 to 0.867), 0.813 (95% CI 0.785 to 0.841), and 0.847 (95% CI 0.822 to 0.872), respectively. For TKA, 7,457 patient radiology reports were included, with 3,478 used for model training and testing, and 3,152 for external validation. Performance metrics included accuracy, F1 score, and AUROC values of 0.757 (95% CI 0.702 to 0.811), 0.543 (95% CI 0.479 to 0.607), and 0.717 (95% CI 0.657 to 0.778), respectively. There was a notable deterioration in performance on external validation in both cohorts. Conclusion.
The use of routinely available preoperative radiology reports provides promising potential to help screen suitable candidates for THA, but not for TKA. The external validation results demonstrate the importance of further model testing and training when confronted with new clinical cohorts. Cite this article: Bone Joint J 2024;106-B(7):688–695


Bone & Joint Open
Vol. 5, Issue 1 | Pages 9 - 19
16 Jan 2024
Dijkstra H van de Kuit A de Groot TM Canta O Groot OQ Oosterhoff JH Doornberg JN

Aims. Machine-learning (ML) prediction models in orthopaedic trauma hold great promise in assisting clinicians in various tasks, such as personalized risk stratification. However, an overview of current applications and critical appraisal to peer-reviewed guidelines is lacking. The objectives of this study are to 1) provide an overview of current ML prediction models in orthopaedic trauma; 2) evaluate the completeness of reporting following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement; and 3) assess the risk of bias following the Prediction model Risk Of Bias Assessment Tool (PROBAST) tool. Methods. A systematic search screening 3,252 studies identified 45 ML-based prediction models in orthopaedic trauma up to January 2023. The TRIPOD statement assessed transparent reporting and the PROBAST tool the risk of bias. Results. A total of 40 studies reported on training and internal validation; four studies performed both development and external validation, and one study performed only external validation. The most commonly reported outcomes were mortality (33%, 15/45) and length of hospital stay (9%, 4/45), and the majority of prediction models were developed in the hip fracture population (60%, 27/45). The overall median completeness for the TRIPOD statement was 62% (interquartile range 30 to 81%). The overall risk of bias in the PROBAST tool was low in 24% (11/45), high in 69% (31/45), and unclear in 7% (3/45) of the studies. High risk of bias was mainly due to analysis domain concerns including small datasets with low number of outcomes, complete-case analysis in case of missing data, and no reporting of performance measures. Conclusion. The results of this study showed that despite a myriad of potential clinically useful applications, a substantial part of ML studies in orthopaedic trauma lack transparent reporting, and are at high risk of bias. 
These problems must be resolved by following established guidelines to instil confidence in ML models among patients and clinicians. Otherwise, there will remain a sizeable gap between the development of ML prediction models and their clinical application in our day-to-day orthopaedic trauma practice. Cite this article: Bone Jt Open 2024;5(1):9–19


To examine whether Natural Language Processing (NLP) using a state-of-the-art clinically based Large Language Model (LLM) could predict patient selection for Total Hip Arthroplasty (THA), across a range of routinely available clinical text sources. Data pre-processing and analyses were conducted according to the Ai to Revolutionise the patient Care pathway in Hip and Knee arthroplasty (ARCHERY) project protocol (https://www.researchprotocols.org/2022/5/e37092/). Three types of de-identified Scottish regional clinical free-text data were assessed: referral letters, radiology reports, and clinic letters. NLP algorithms were based on the GatorTron model, a Bidirectional Encoder Representations from Transformers (BERT) based LLM trained on 82 billion words of de-identified clinical text. Three specific inference tasks were performed: assessment of the base GatorTron model, assessment after model fine-tuning, and external validation. There were 3911, 1621 and 1503 patient text documents included from the sources of referral letters, radiology reports and clinic letters respectively. All letter sources displayed significant class imbalance, with only 15.8%, 24.9%, and 5.9% of patients linked to the respective text source documentation having undergone surgery. Untrained model performance was poor, with F1 scores (harmonic mean of precision and recall) of 0.02, 0.38 and 0.09 respectively. This did, however, improve with model training, with mean scores (range) of 0.39 (0.31–0.47), 0.57 (0.48–0.63) and 0.32 (0.28–0.39) across the 5 folds of cross-validation. Performance deteriorated on external validation across all three groups but remained highest for the radiology report cohort. Even with further training on a large cohort of routinely collected free-text data, a clinical LLM fails to adequately perform clinical inference in NLP tasks regarding identification of those selected to undergo THA.
This likely relates to the complexity and heterogeneity of free-text information and the way that patients are determined to be surgical candidates.
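
The F1 score cited in this abstract is the harmonic mean of precision and recall. A minimal Python sketch follows; the confusion counts are hypothetical (they are not taken from the study) and were chosen only so that about 5.9% of documents are positive, mirroring the class imbalance reported for clinic letters. It illustrates why accuracy can look high under class imbalance while F1 stays low.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all documents that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical counts for 1,000 documents, of which 59 (5.9%) are true positives.
TP, FP, FN, TN = 20, 40, 39, 901

print(f"accuracy = {accuracy(TP, FP, FN, TN):.3f}")
print(f"F1       = {f1_score(TP, FP, FN):.3f}")
```

Because true negatives dominate an imbalanced cohort, accuracy rewards predicting "no surgery" for almost everyone, whereas F1 ignores true negatives entirely.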


The Bone & Joint Journal
Vol. 105-B, Issue 6 | Pages 702 - 710
1 Jun 2023
Yeramosu T Ahmad W Bashir A Wait J Bassett J Domson G

Aims. The aim of this study was to identify factors associated with five-year cancer-related mortality in patients with limb and trunk soft-tissue sarcoma (STS) and develop and validate machine learning algorithms in order to predict five-year cancer-related mortality in these patients. Methods. Demographic, clinicopathological, and treatment variables of limb and trunk STS patients in the Surveillance, Epidemiology, and End Results Program (SEER) database from 2004 to 2017 were analyzed. Multivariable logistic regression was used to determine factors significantly associated with five-year cancer-related mortality. Various machine learning models were developed and compared using area under the curve (AUC), calibration, and decision curve analysis. The model that performed best on the SEER testing data was further assessed to determine the variables most important in its predictive capacity. This model was externally validated using our institutional dataset. Results. A total of 13,646 patients with STS from the SEER database were included, of whom 35.9% experienced five-year cancer-related mortality. The random forest model performed the best overall and identified tumour size as the most important variable when predicting mortality in patients with STS, followed by M stage, histological subtype, age, and surgical excision. Each variable was significant in logistic regression. External validation yielded an AUC of 0.752. Conclusion. This study identified clinically important variables associated with five-year cancer-related mortality in patients with limb and trunk STS, and developed a predictive model that demonstrated good accuracy and predictability. Orthopaedic oncologists may use these findings to further risk-stratify their patients and recommend an optimal course of treatment. Cite this article: Bone Joint J 2023;105-B(6):702–710


Bone & Joint Research
Vol. 13, Issue 2 | Pages 66 - 82
5 Feb 2024
Zhao D Zeng L Liang G Luo M Pan J Dou Y Lin F Huang H Yang W Liu J

Aims. This study aimed to explore the biological and clinical importance of dysregulated key genes in osteoarthritis (OA) patients at the cartilage level to find potential biomarkers and targets for diagnosing and treating OA. Methods. Six sets of gene expression profiles were obtained from the Gene Expression Omnibus database. Differential expression analysis, weighted gene coexpression network analysis (WGCNA), and multiple machine-learning algorithms were used to screen crucial genes in osteoarthritic cartilage, and genome enrichment and functional annotation analyses were used to decipher the related categories of gene function. Single-sample gene set enrichment analysis was performed to analyze immune cell infiltration. Correlation analysis was used to explore the relationship among the hub genes and immune cells, as well as markers related to articular cartilage degradation and bone mineralization. Results. A total of 46 genes were obtained from the intersection of significantly upregulated genes in osteoarthritic cartilage and the key module genes screened by WGCNA. Functional annotation analysis revealed that these genes were closely related to pathological responses associated with OA, such as inflammation and immunity. Four key dysregulated genes (cartilage acidic protein 1 (CRTAC1), iodothyronine deiodinase 2 (DIO2), angiopoietin-related protein 2 (ANGPTL2), and MAGE family member D1 (MAGED1)) were identified after using machine-learning algorithms. These genes had high diagnostic value in both the training cohort and external validation cohort (receiver operating characteristic > 0.8). The upregulated expression of these hub genes in osteoarthritic cartilage signified higher levels of immune infiltration as well as the expression of metalloproteinases and mineralization markers, suggesting harmful biological alterations and indicating that these hub genes play an important role in the pathogenesis of OA. 
A competing endogenous RNA network was constructed to reveal the underlying post-transcriptional regulatory mechanisms. Conclusion. The current study explores and validates a dysregulated key gene set in osteoarthritic cartilage that is capable of accurately diagnosing OA and characterizing the biological alterations in osteoarthritic cartilage; this may become a promising indicator in clinical decision-making. This study indicates that dysregulated key genes play an important role in the development and progression of OA, and may be potential therapeutic targets. Cite this article: Bone Joint Res 2024;13(2):66–82


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 60 - 60
1 Dec 2022
Martin RK Wastvedt S Pareek A Persson A Visnes H Fenstad AM Moatshe G Wolfson J Lind M Engebretsen L

External validation of machine learning predictive models is achieved through evaluation of model performance on different groups of patients than were used for algorithm development. This important step is uncommonly performed, inhibiting clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source of data included nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR). The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 total predictor variables in the NKLR were included, and the models eliminated variables that did not significantly improve predictive ability, without sacrificing accuracy. The result was a well-calibrated algorithm developed using the Cox Lasso model that only required five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. The model performance was assessed using the same metrics as the NKLR study: concordance and calibration.
In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years post primary surgery but similar to the NKLR at two years. The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient-specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown.
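
Concordance, the discrimination metric reported in this abstract, can be illustrated with a short Python sketch of Harrell's C for right-censored data. This is one common definition, used here as an assumption: the abstract does not state which concordance estimator was applied, and the function name `harrell_c` and the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[int], risks: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored time-to-event data.

    A pair (i, j) is comparable when patient i had the event (e.g. revision)
    strictly before patient j's last observed time. The pair is concordant
    when the model gave patient i the higher predicted risk; tied risks
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): years to revision or last follow-up,
# event flag (1 = revised, 0 = censored), and predicted risk.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.5, 0.1]
print(harrell_c(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a concordance of 0.68 is described as moderate.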


The Bone & Joint Journal
Vol. 106-B, Issue 11 | Pages 1216 - 1222
1 Nov 2024
Castagno S Gompels B Strangmark E Robertson-Waters E Birch M van der Schaar M McCaskie AW

Aims. Machine learning (ML), a branch of artificial intelligence that uses algorithms to learn from data and make predictions, offers a pathway towards more personalized and tailored surgical treatments. This approach is particularly relevant to prevalent joint diseases such as osteoarthritis (OA). In contrast to end-stage disease, where joint arthroplasty provides excellent results, early stages of OA currently lack effective therapies to halt or reverse progression. Accurate prediction of OA progression is crucial if timely interventions are to be developed, to enhance patient care and optimize the design of clinical trials. Methods. A systematic review was conducted in accordance with PRISMA guidelines. We searched MEDLINE and Embase on 5 May 2024 for studies utilizing ML to predict OA progression. Titles and abstracts were independently screened, followed by full-text reviews for studies that met the eligibility criteria. Key information was extracted and synthesized for analysis, including types of data (such as clinical, radiological, or biochemical), definitions of OA progression, ML algorithms, validation methods, and outcome measures. Results. Out of 1,160 studies initially identified, 39 were included. Most studies (85%) were published between 2020 and 2024, with 82% using publicly available datasets, primarily the Osteoarthritis Initiative. ML methods were predominantly supervised, with significant variability in the definitions of OA progression: most studies focused on structural changes (59%), while fewer addressed pain progression or both. Deep learning was used in 44% of studies, while automated ML was used in 5%. There was a lack of standardization in evaluation metrics and limited external validation. Interpretability was explored in 54% of studies, primarily using SHapley Additive exPlanations. Conclusion. 
Our systematic review demonstrates the feasibility of ML models in predicting OA progression, but also uncovers critical limitations that currently restrict their clinical applicability. Future priorities should include diversifying data sources, standardizing outcome measures, enforcing rigorous validation, and integrating more sophisticated algorithms. This paradigm shift from predictive modelling to actionable clinical tools has the potential to transform patient care and disease management in orthopaedic practice. Cite this article: Bone Joint J 2024;106-B(11):1216–1222


Bone & Joint Open
Vol. 5, Issue 8 | Pages 671 - 680
14 Aug 2024
Fontalis A Zhao B Putzeys P Mancino F Zhang S Vanspauwen T Glod F Plastow R Mazomenos E Haddad FS

Aims. Precise implant positioning, tailored to individual spinopelvic biomechanics and phenotype, is paramount for stability in total hip arthroplasty (THA). Despite a few studies on instability prediction, there is a notable gap in research utilizing artificial intelligence (AI). The objective of our pilot study was to evaluate the feasibility of developing an AI algorithm tailored to individual spinopelvic mechanics and patient phenotype for predicting impingement. Methods. This international, multicentre prospective cohort study across two centres encompassed 157 adults undergoing primary robotic arm-assisted THA. Impingement during specific flexion and extension stances was identified using the virtual range of motion (ROM) tool of the robotic software. The primary AI model, the Light Gradient-Boosting Machine (LGBM), used tabular data to predict impingement presence, direction (flexion or extension), and type. A secondary model integrating tabular data with plain anteroposterior pelvis radiographs was evaluated to assess for any potential enhancement in prediction accuracy. Results. We identified nine predictors from an analysis of baseline spinopelvic characteristics and surgical planning parameters. Using fivefold cross-validation, the LGBM achieved 70.2% impingement prediction accuracy. With impingement data, the LGBM estimated direction with 85% accuracy, while the support vector machine (SVM) determined impingement type with 72.9% accuracy. After integrating imaging data with a multilayer perceptron (tabular) and a convolutional neural network (radiograph), the LGBM’s prediction was 68.1%. Both combined and LGBM-only had similar impingement direction prediction rates (around 84.5%). Conclusion. This study is a pioneering effort in leveraging AI for impingement prediction in THA, utilizing a comprehensive, real-world clinical dataset. Our machine-learning algorithm demonstrated promising accuracy in predicting impingement, its type, and direction. 
While the addition of imaging data to our deep-learning algorithm did not boost accuracy, the potential for refined annotations, such as landmark markings, offers avenues for future enhancement. Prior to clinical integration, external validation and larger-scale testing of this algorithm are essential. Cite this article: Bone Jt Open 2024;5(8):671–680


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_8 | Pages 4 - 4
10 May 2024
Hoffman T Knudsen J Jesani S Clark H

Introduction. Debridement, antibiotics, irrigation, and implant retention (DAIR) is a common management strategy for hip and knee prosthetic joint infections (PJI). However, failure rates remain high, which has led to the development of predictive tools to help determine success. These tools include KLIC and CRIME80 for acute-postoperative (AP) and acute haematogenous (AH) PJI respectively. We investigated whether these tools were applicable to a Waikato cohort. Method. We performed a retrospective cohort study that evaluated patients who underwent DAIR between January 2010 and June 2020 at Waikato Hospital. Pre-operative KLIC and CRIME80 scores were calculated and compared to success of operation. Failure was defined as: (i) need for further surgery, (ii) need for suppressive antibiotics, (iii) death due to the infection. Logistic regression models were used to calculate the area under the curve (AUC). Results. 117 eligible patients underwent DAIR, 53 in the AP cohort and 64 in the AH cohort. Failure rate at 2 years post-op was 43% in the AP cohort and 59% in the AH cohort. In the AP cohort a KLIC score of <4 had a DAIR failure rate of 28.6%, while those who scored ≥4 had a failure rate of 72.2% (p=0.002). In the AH cohort a CRIME80 score of <3 had a DAIR failure rate of 48% while those who scored ≥3 had a 100% failure rate (p<0.001). Discussion. This study represents the first external validation of the KLIC and CRIME80 scores for predicting DAIR failure in an Australasian population. The results indicate that both KLIC and CRIME80 scoring tools are valuable aids for the clinician seeking to determine the optimal management strategy in patients with AP or AH PJI.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_14 | Pages 8 - 8
10 Oct 2023
Leow J Oliver W Bell K Molyneux S Clement N Duckworth A

To develop a reliable and effective radiological score to assess the healing of isolated ulnar shaft fractures (IUSF), the Radiographic Union Score for Ulna fractures (RUSU). Initially, 20 patients with radiographs six weeks following a non-operatively managed ulnar shaft fracture were selected and scored by three blinded observers. After intraclass correlation (ICC) analysis, a second group of 54 patients with radiographs six weeks after injury (18 who developed a nonunion and 36 who united) were scored by the same observers. In the initial study, interobserver and intraobserver ICC were 0.89 and 0.93, respectively. In the validation study the interobserver ICC was 0.85. The median score for patients who united was significantly higher than those who developed a nonunion (11 vs 7, p<0.001). A ROC curve demonstrated that a RUSU ≤8 had a sensitivity of 88.9% and specificity of 86.1% in identifying patients at risk of nonunion. Patients with a RUSU ≤8 (n = 21) were more likely to develop a nonunion (n = 16/21) than those with a RUSU ≥9 (n = 2/33; OR 49.6, 95% CI 8.6–284.7). Based on a PPV of 76%, if all patients with a RUSU ≤8 underwent fixation at six weeks, the number of procedures needed to avoid one nonunion would be 1.3. The RUSU shows good interobserver and intraobserver reliability and is effective in identifying patients at risk of nonunion six weeks after fracture. This tool requires external validation but may enhance the management of patients with isolated ulnar shaft fractures.
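
The diagnostic figures in this abstract all follow from one 2×2 table that can be rebuilt from the reported counts (16 of 21 patients with a RUSU ≤8 and 2 of 33 with a RUSU ≥9 developed a nonunion). The Python sketch below reproduces that arithmetic; the class and method names are illustrative, and the "number of procedures needed" is assumed here to be the reciprocal of the PPV, which matches the reported value of 1.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # RUSU <= 8, developed nonunion
    fp: int  # RUSU <= 8, united
    fn: int  # RUSU >= 9, developed nonunion
    tn: int  # RUSU >= 9, united

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def odds_ratio(self) -> float:
        return (self.tp * self.tn) / (self.fp * self.fn)

    def procedures_per_nonunion_avoided(self) -> float:
        # Assumption: every fixation in a patient who would otherwise have
        # progressed to nonunion prevents that nonunion, so the figure is 1 / PPV.
        return 1 / self.ppv()


# Counts derived from the abstract: 16/21 and 2/33 nonunions, 18 nonunions and 36 unions overall.
rusu = TwoByTwo(tp=16, fp=21 - 16, fn=2, tn=33 - 2)

print(f"sensitivity = {rusu.sensitivity():.1%}")
print(f"specificity = {rusu.specificity():.1%}")
print(f"PPV         = {rusu.ppv():.0%}")
print(f"odds ratio  = {rusu.odds_ratio():.1f}")
print(f"procedures per nonunion avoided = {rusu.procedures_per_nonunion_avoided():.1f}")
```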


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_3 | Pages 5 - 5
23 Feb 2023
Jadresic MC Baker J
Full Access

Numerous prediction tools are available for estimating postoperative risk following spine surgery. External validation studies have shown mixed results. We present the development, validation, and comparative evaluation of a novel tool (NZSpine) for modelling risk of complications within 30 days of spine surgery. Data were gathered retrospectively from medical records of patients who underwent spine surgery at Waikato Hospital between January 2019 and December 2020 (n = 488). Variables were selected a priori based on previous evidence and clinical judgement. Postoperative adverse events were classified objectively using the Comprehensive Complication Index. Models were constructed for the occurrence of any complication and significant complications (based on CCI >26). Performance and clinical utility of the novel model were compared against SpineSage (https://depts.washington.edu/spinersk/), an extant online tool which we have shown in unpublished work to be valid in our local population. Overall complication rate was 34%. In the multivariate model, higher age, increased surgical invasiveness and the presence of preoperative anemia were most strongly predictive of any postoperative complication (OR = 1.03, 1.09, 2.1 respectively, p <0.001), whereas the occurrence of a major postoperative complication (CCI >26) was most strongly associated with the presence of respiratory disease (OR = 2.82, p <0.001). Internal validation using the bootstrapped models showed the model was robust, with an AUC of 0.73. Using sensitivity analysis, 80% of the model's predictions were correct. By comparison SpineSage had an AUC of 0.71, and in decision curve analysis the novel model showed greater expected benefit at all thresholds of risk. NZSpine is a novel risk assessment tool for patients undergoing acute and elective spine surgery and may help inform clinicians and patients of their prognosis.
Use of an objective tool may help to provide uniformity between DHBs when completing the “clinician assessment of risk” section of the national prioritization tool


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_16 | Pages 23 - 23
17 Nov 2023
Castagno S Birch M van der Schaar M McCaskie A
Full Access

Abstract. Introduction. Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA), a degenerative joint disease affecting over 300 million people worldwide. Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. Objectives. This study aims to create a trustworthy and interpretable precision health tool that predicts rapid knee OA progression based on baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. Methods. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising patients into non-progressors, pain-only progressors, radiographic-only progressors, and both pain and radiographic progressors) and binary predictions (categorising patients into non-progressors and progressors). Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances. Model performance was assessed using multiple evaluation metrics, such as AUC-ROC, AUC-PRC, F1-score, precision, and recall. Additionally, interpretability analyses were carried out to identify important predictors of rapid disease progression. Results. 
Our final models yielded high accuracy scores for both multi-class predictions (AUC-ROC: 0.858, 95% CI: 0.856–0.860; AUC-PRC: 0.675, 95% CI: 0.671–0.679; F1-score: 0.560, 95% CI: 0.554–0.566) and binary predictions (AUC-ROC: 0.717, 95% CI: 0.712–0.722; AUC-PRC: 0.620, 95% CI: 0.616–0.624; F1-score: 0.676, 95% CI: 0.673–0.679). Important predictors of rapid disease progression included the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) scores and MRI features. Our models were further successfully validated using a hold-out dataset, which was previously omitted from model development and training (AUC-ROC: 0.877 for multi-class predictions; AUC-ROC: 0.746 for binary predictions). Additionally, accurate ML models were developed for predicting OA progression in a subgroup of patients aged 65 or younger (AUC-ROC: 0.862, 95% CI: 0.861–0.863 for multi-class predictions; AUC-ROC: 0.736, 95% CI: 0.734–0.738 for binary predictions). Conclusions. This study presents a reliable and interpretable precision health tool for predicting rapid knee OA progression using “Autoprognosis 2.0”. Our models provide accurate predictions and offer insights into important predictors of rapid disease progression. Furthermore, the transparency and interpretability of our methods may facilitate their acceptance by clinicians and patients, enabling effective utilisation in clinical practice. Future work should focus on refining these models by increasing the sample size, integrating additional features, and using independent datasets for external validation. Declaration of Interest. I declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.


Orthopaedic Proceedings
Vol. 101-B, Issue SUPP_12 | Pages 49 - 49
1 Oct 2019
Schwabe M Graesser E Rhea L Pascual-Garrido C Nepple J Clohisy JC
Full Access

Topic. Utilizing radiographic, physical exam and history findings, we developed a novel clinical score to aid in the surgical decision making process for borderline/transitional dysplastic hips. Background. Treatment of borderline acetabular dysplasia (BD) is controversial with some patients having primarily instability-based symptoms while others have impingement-based symptoms. The purpose of this study was to identify the most important patient characteristics influencing the diagnosis of instability vs. non-instability, develop a clinical score (Borderline Hip Instability Score, BHIS) to collectively characterize these factors and to externally validate BHIS in a multicenter cohort of BD patients. Methods. First, a retrospective cohort of 186 hips undergoing surgical treatment of BD (LCEA 20°-25°) from a single surgeon experienced in arthroscopic and open techniques was used. Multivariate analysis determined characteristics associated with presence of instability (PAO+/−hip arthroscopy) or absence of instability (isolated hip arthroscopy) based on clinical diagnosis. During the study period, 39.8% of the cohort underwent PAO. Multivariate analysis with bootstrapping was performed and results were transformed into a BHIS nomogram (higher score representing more instability). Then, BHIS was externally validated in 114 BD patients enrolled in a multicenter prospective cohort study across 10 surgeons (with varied treatment approaches from arthroscopy to open procedures). Results. The most parsimonious, best fit model included 4 variables associated with the diagnosis of instability: acetabular inclination (AI), anterior center edge angle (ACEA), maximum alpha angle, and internal rotation in 90 degrees of flexion (IRF). Sex and LCEA were not significant predictors. Mean BHIS in the population was 50.0 (instability 57.7±7.9; non-instability 44.8±7.3, p<0.001). BHIS demonstrated excellent predictive (discriminatory) ability with c-statistic=0.89.
In Part 2, BHIS maintained excellent c-statistic=0.92 in external validation. Mean BHIS in the external cohort was 53.9 (instability 66.5±11.5; non-instability 43.0±10.8, p<0.001). Discussion. In patients with BD, key factors in diagnosing significant instability treated with PAO were AI, ACEA, maximum alpha-angle, and IRF. The BHIS score allowed for differentiation of patients with and without instability in the development and external validation cohort. For any tables or figures, please contact the authors directly


The Bone & Joint Journal
Vol. 101-B, Issue 10 | Pages 1300 - 1306
1 Oct 2019
Oliver WM Smith TJ Nicholson JA Molyneux SG White TO Clement ND Duckworth AD

Aims. The primary aim of this study was to develop a reliable, effective radiological score to assess the healing of humeral shaft fractures, the Radiographic Union Score for HUmeral fractures (RUSHU). The secondary aim was to assess whether the six-week RUSHU was predictive of nonunion at six months after the injury. Patients and Methods. Initially, 20 patients with radiographs six weeks following a humeral shaft fracture were selected at random from a trauma database and scored by three observers, based on the Radiographic Union Scale for Tibial fractures system. After refinement of the RUSHU criteria, a second group of 60 patients with radiographs six weeks after injury, 40 with fractures that united and 20 with fractures that developed nonunion, were scored by two blinded observers. Results. After refinement, the interobserver intraclass correlation coefficient (ICC) was 0.79 (95% confidence interval (CI) 0.67 to 0.87), indicating substantial agreement. At six weeks after injury, patients whose fractures united had a significantly higher median score than those who developed nonunion (10 vs 7; p < 0.001). A receiver operating characteristic curve determined that a RUSHU cut-off of < 8 was predictive of nonunion (area under the curve = 0.84, 95% CI 0.74 to 0.94). The sensitivity was 75% and specificity 80% with a positive predictive value (PPV) of 65% and a negative predictive value of 86%. Patients with a RUSHU < 8 (n = 23) were more likely to develop nonunion than those with a RUSHU ≥ 8 (n = 37, odds ratio 12.0, 95% CI 3.4 to 42.9). Based on a PPV of 65%, if all patients with a RUSHU < 8 underwent fixation, the number of procedures needed to avoid one nonunion would be 1.5. Conclusion. The RUSHU is reliable and effective in identifying patients at risk of nonunion of a humeral shaft fracture at six weeks after injury. This tool requires external validation but could potentially reduce the morbidity associated with delayed treatment of an established nonunion. 
Cite this article: Bone Joint J 2019;101-B:1300–1306
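The RUSHU abstract above reports sensitivity, specificity and group sizes rather than a full 2x2 table, but the table can be rebuilt from those figures. The sketch below is an illustrative reconstruction, not the authors' calculation: it assumes the reported 75% and 80% apply exactly to the 20 nonunions and 40 united fractures, uses hypothetical variable names, and reads "procedures needed to avoid one nonunion" as 1/PPV.

```python
# Figures taken from the abstract (second group of 60 patients)
n_nonunion = 20
n_united = 40
sensitivity = 0.75   # share of nonunions with RUSHU < 8
specificity = 0.80   # share of united fractures with RUSHU >= 8

# Rebuild the 2x2 table (assumes the percentages are exact for these groups)
tp = round(sensitivity * n_nonunion)  # RUSHU < 8, nonunion
fn = n_nonunion - tp                  # RUSHU >= 8, nonunion
tn = round(specificity * n_united)    # RUSHU >= 8, united
fp = n_united - tn                    # RUSHU < 8, united

ppv = tp / (tp + fp)                  # probability of nonunion given RUSHU < 8
npv = tn / (tn + fn)                  # probability of union given RUSHU >= 8
odds_ratio = (tp * tn) / (fp * fn)

# Assumed reading: fixations per nonunion avoided = 1 / PPV
procedures_per_nonunion_avoided = 1 / ppv

print(f"RUSHU < 8: n = {tp + fp}; RUSHU >= 8: n = {tn + fn}")
print(f"PPV {ppv:.0%}, NPV {npv:.0%}, OR {odds_ratio:.1f}")
print(f"procedures per nonunion avoided {procedures_per_nonunion_avoided:.1f}")
```

The reconstructed group sizes (23 and 37) match those given in the abstract, which suggests the rebuilt table is consistent with the reported PPV, NPV and odds ratio.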


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_8 | Pages 12 - 12
1 Aug 2020
Melo L White S Chaudhry H Stavrakis A Wolfstadt J Ward S Atrey A Khoshbin A Nowak L
Full Access

Over 300,000 total hip arthroplasties (THA) are performed annually in the USA. Surgical Site Infections (SSI) are one of the most common complications and are associated with increased morbidity, mortality and cost. Risk factors for SSI include obesity, diabetes and smoking, but few studies have reported on the predictive value of pre-operative blood markers for SSI. The purpose of this study was to create a clinical prediction model for acute SSI (classified as superficial, deep or overall) within 30 days of THA based on commonly ordered pre-operative lab markers and using data from the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database. All adult patients undergoing an elective unilateral THA for osteoarthritis from 2011–2016 were identified from the NSQIP database using Current Procedural Terminology (CPT) codes. Patients with active or chronic, local or systemic infection/sepsis or disseminated cancer were excluded. Multivariate logistic regression was used to determine coefficients, with manual stepwise reduction. Receiver Operating Characteristic (ROC) curves were also graphed. The SSI prediction model included the following covariates: body mass index (BMI) and sex; comorbidities such as congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), smoking and current/previous steroid use; and the pre-operative blood markers albumin, alkaline phosphatase, blood urea nitrogen (BUN), creatinine, hematocrit, international normalized ratio (INR), platelets, prothrombin time (PT), sodium and white blood cell (WBC) levels. Since the data met logistic assumption requirements, bootstrap estimation was used to measure internal validity. The area under the ROC curve for final derivations along with McFadden's R-squared were utilized to compare prediction models.
A total of 130,619 patients were included. The median age at time of THA was 67 years (mean = 66.6 ± 11.6 years), and 44.8% (n=58,757) were male. A total of 1,561 (1.20%) patients had a superficial or deep SSI (overall SSI). Of all SSI, 45.1% (n=704) had a deep SSI and 55.4% (n=865) had a superficial SSI. The annual incidence of SSI decreased from 1.44% in 2011 to 1.16% in 2016. Area under the ROC curve for the SSI prediction model was 0.79 and 0.78 for deep and superficial SSI, respectively, and 0.71 for overall SSI. CHF had the largest effect size (Odds Ratio (OR)=2.88, 95% Confidence Interval (95% CI): 1.56 – 5.32) for overall SSI risk. Albumin (OR=0.44, 95% CI: 0.37 – 0.52, OR=0.31, 95% CI: 0.25 – 0.39, OR=0.48, 95% CI: 0.41 – 0.58) and sodium (OR=0.95, 95% CI: 0.93 – 0.97, OR=0.94, 95% CI: 0.91 – 0.97, OR=0.95, 95% CI: 0.93 – 0.98) levels were consistently significant in all clinical prediction models for superficial, deep and overall SSI, respectively. In terms of pre-operative blood markers, hypoalbuminemia and hyponatremia are both significant risk factors for superficial, deep and overall SSI. In this large NSQIP database study, we were able to create an SSI prediction model and identify risk factors for predicting acute superficial, deep and overall SSI after THA. To our knowledge, this is the first clinical model in which pre-operative hyponatremia (in addition to hypoalbuminemia) has been predictive of SSI after THA. Although the model remains without external validation, it is a vital starting point for developing a risk prediction model for SSI and can help physicians mitigate risk factors for acute SSI post THA.


Orthopaedic Proceedings
Vol. 91-B, Issue SUPP_I | Pages 28 - 28
1 Mar 2009
Tannast M Mistry S Steppacher S Langlotz F Zheng G Siebenrock K
Full Access

Introduction: It could be shown that an ample number of classical hip parameters for radiographic quantification of hip morphology on anteroposterior (AP) pelvic radiographs vary significantly with individual pelvic tilt and rotation. This could be proven not only for classical hip parameters (e.g. the lateral centre edge angle) but also for more recently described radiographic features such as acetabular retroversion. The resulting misdiagnosis and misinterpretation can potentially impair a correct therapy for the patient. We developed fast and easy-to-use computer software to perform three-dimensional (3D) analysis of the individual hip joint morphology using two-dimensional (2D) AP pelvic radiographs. Landmarks extracted from the radiograph were combined with a cone beam x-ray projection model and a strong lateral pelvic radiograph to reconstruct 3D hip joints. Twenty-five parameters including quantification of femoral head coverage can be calculated for a neutral orientation. The aim of the study was to evaluate the validity of this method for tilt and rotation correction of the acetabular rim and associated radiographic parameters. Methods: The validation comprised three steps: external validation; internal validation; and intra-/interobserver analysis. A series of x-rays of 30 cadaver pelves mounted on a flexible holding device were available for steps 1 and 2. External validation comprised the comparison of radiographical parameters of the cadaver hips when determined with our software in comparison with CT-based measurements or actual radiographs in a neutral pelvic orientation as gold standard. Internal validation evaluated the consistency of the parameters when each single pelvis was calculated back from different random orientations to the same neutral pelvic position.
The intra-/interobserver analysis investigated the reliability and reproducibility of all parameters with the help of 100 randomized, blinded AP pelvic radiographs of a consecutive patient series. Results: All but one parameter (acetabular index) showed a substantial to almost perfect correlation with the CT-measurements. Internal validity was substantial to almost perfect for all parameters. There was a substantial to almost perfect reliability and reproducibility of all parameters except the acetabular index. Conclusion: The software could be shown to be an accurate, reliable and reproducible method for correction of AP pelvic radiographs. This computer-assisted method allows standardized evaluation of all relevant radiographic parameters for detection of anatomic morphologic differences. It will be used to study the influence of pelvic malorientation on the radiographic appearance of each individual parameter. In addition, it allows evaluating the clinical significance of standardizing pelvic parameters.