Aims. Our aim was to develop and validate nomograms that would predict the cumulative incidence of sarcoma-specific death (CISSD) and disease progression (CIDP) in patients with localized high-grade primary central and dedifferentiated chondrosarcoma. Methods. The study population consisted of 391 patients from two international sarcoma centres (development cohort) who had undergone definitive surgery for a localized high-grade (histological grade II or III) conventional primary central chondrosarcoma or dedifferentiated chondrosarcoma. Disease progression captured the first event of either metastasis or local recurrence. An independent cohort of 221 patients from three additional hospitals was used for
Aims. The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing.
Aims. To develop and externally validate a parsimonious statistical prediction model of 90-day mortality after elective total hip arthroplasty (THA), and to provide a web calculator for clinical usage. Methods. We included 53,099 patients with cemented THA due to osteoarthritis from the Swedish Hip Arthroplasty Registry for model derivation and internal validation, as well as 125,428 patients from England and Wales recorded in the National Joint Register for England, Wales, Northern Ireland, the Isle of Man, and the States of Guernsey (NJR) for
Aims. The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs. Methods. The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans to serve as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced; two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%).
Aims. To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports. Methods. Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and
Aims. Machine-learning (ML) prediction models in orthopaedic trauma hold great promise in assisting clinicians in various tasks, such as personalized risk stratification. However, an overview of current applications and a critical appraisal against peer-reviewed guidelines are lacking. The objectives of this study are to 1) provide an overview of current ML prediction models in orthopaedic trauma; 2) evaluate the completeness of reporting following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement; and 3) assess the risk of bias following the Prediction model Risk Of Bias Assessment Tool (PROBAST). Methods. A systematic search screening 3,252 studies identified 45 ML-based prediction models in orthopaedic trauma up to January 2023. The TRIPOD statement was used to assess transparent reporting and the PROBAST tool to assess the risk of bias. Results. A total of 40 studies reported on training and internal validation; four studies performed both development and
To examine whether Natural Language Processing (NLP) using a state-of-the-art clinically based Large Language Model (LLM) could predict patient selection for Total Hip Arthroplasty (THA), across a range of routinely available clinical text sources. Data pre-processing and analyses were conducted according to the Ai to Revolutionise the patient Care pathway in Hip and Knee arthroplasty (ARCHERY) project protocol (https://www.researchprotocols.org/2022/5/e37092/). Three types of de-identified Scottish regional clinical free-text data were assessed: referral letters, radiology reports, and clinic letters. NLP algorithms were based on the GatorTron model, a Bidirectional Encoder Representations from Transformers (BERT) based LLM trained on 82 billion words of de-identified clinical text. Three specific inference tasks were performed: assessment of the base GatorTron model, assessment after model fine-tuning, and
Aims. The aim of this study was to identify factors associated with five-year cancer-related mortality in patients with limb and trunk soft-tissue sarcoma (STS) and develop and validate machine learning algorithms in order to predict five-year cancer-related mortality in these patients. Methods. Demographic, clinicopathological, and treatment variables of limb and trunk STS patients in the Surveillance, Epidemiology, and End Results Program (SEER) database from 2004 to 2017 were analyzed. Multivariable logistic regression was used to determine factors significantly associated with five-year cancer-related mortality. Various machine learning models were developed and compared using area under the curve (AUC), calibration, and decision curve analysis. The model that performed best on the SEER testing data was further assessed to determine the variables most important in its predictive capacity. This model was externally validated using our institutional dataset. Results. A total of 13,646 patients with STS from the SEER database were included, of whom 35.9% experienced five-year cancer-related mortality. The random forest model performed the best overall and identified tumour size as the most important variable when predicting mortality in patients with STS, followed by M stage, histological subtype, age, and surgical excision. Each variable was significant in logistic regression.
Aims. This study aimed to explore the biological and clinical importance of dysregulated key genes in osteoarthritis (OA) patients at the cartilage level to find potential biomarkers and targets for diagnosing and treating OA. Methods. Six sets of gene expression profiles were obtained from the Gene Expression Omnibus database. Differential expression analysis, weighted gene coexpression network analysis (WGCNA), and multiple machine-learning algorithms were used to screen crucial genes in osteoarthritic cartilage, and genome enrichment and functional annotation analyses were used to decipher the related categories of gene function. Single-sample gene set enrichment analysis was performed to analyze immune cell infiltration. Correlation analysis was used to explore the relationship among the hub genes and immune cells, as well as markers related to articular cartilage degradation and bone mineralization. Results. A total of 46 genes were obtained from the intersection of significantly upregulated genes in osteoarthritic cartilage and the key module genes screened by WGCNA. Functional annotation analysis revealed that these genes were closely related to pathological responses associated with OA, such as inflammation and immunity. Four key dysregulated genes (cartilage acidic protein 1 (CRTAC1), iodothyronine deiodinase 2 (DIO2), angiopoietin-related protein 2 (ANGPTL2), and MAGE family member D1 (MAGED1)) were identified after using machine-learning algorithms. These genes had high diagnostic value in both the training cohort and
Aims. Machine learning (ML), a branch of artificial intelligence that uses algorithms to learn from data and make predictions, offers a pathway towards more personalized and tailored surgical treatments. This approach is particularly relevant to prevalent joint diseases such as osteoarthritis (OA). In contrast to end-stage disease, where joint arthroplasty provides excellent results, early stages of OA currently lack effective therapies to halt or reverse progression. Accurate prediction of OA progression is crucial if timely interventions are to be developed, to enhance patient care and optimize the design of clinical trials. Methods. A systematic review was conducted in accordance with PRISMA guidelines. We searched MEDLINE and Embase on 5 May 2024 for studies utilizing ML to predict OA progression. Titles and abstracts were independently screened, followed by full-text reviews for studies that met the eligibility criteria. Key information was extracted and synthesized for analysis, including types of data (such as clinical, radiological, or biochemical), definitions of OA progression, ML algorithms, validation methods, and outcome measures. Results. Out of 1,160 studies initially identified, 39 were included. Most studies (85%) were published between 2020 and 2024, with 82% using publicly available datasets, primarily the Osteoarthritis Initiative. ML methods were predominantly supervised, with significant variability in the definitions of OA progression: most studies focused on structural changes (59%), while fewer addressed pain progression or both. Deep learning was used in 44% of studies, while automated ML was used in 5%. There was a lack of standardization in evaluation metrics and limited
Aims. Precise implant positioning, tailored to individual spinopelvic biomechanics and phenotype, is paramount for stability in total hip arthroplasty (THA). Despite a few studies on instability prediction, there is a notable gap in research utilizing artificial intelligence (AI). The objective of our pilot study was to evaluate the feasibility of developing an AI algorithm tailored to individual spinopelvic mechanics and patient phenotype for predicting impingement. Methods. This international, multicentre prospective cohort study across two centres encompassed 157 adults undergoing primary robotic arm-assisted THA. Impingement during specific flexion and extension stances was identified using the virtual range of motion (ROM) tool of the robotic software. The primary AI model, the Light Gradient-Boosting Machine (LGBM), used tabular data to predict impingement presence, direction (flexion or extension), and type. A secondary model integrating tabular data with plain anteroposterior pelvis radiographs was evaluated to assess for any potential enhancement in prediction accuracy. Results. We identified nine predictors from an analysis of baseline spinopelvic characteristics and surgical planning parameters. Using fivefold cross-validation, the LGBM achieved 70.2% impingement prediction accuracy. With impingement data, the LGBM estimated direction with 85% accuracy, while the support vector machine (SVM) determined impingement type with 72.9% accuracy. After integrating imaging data with a multilayer perceptron (tabular) and a convolutional neural network (radiograph), the LGBM’s prediction was 68.1%. Both combined and LGBM-only had similar impingement direction prediction rates (around 84.5%). Conclusion. This study is a pioneering effort in leveraging AI for impingement prediction in THA, utilizing a comprehensive, real-world clinical dataset. Our machine-learning algorithm demonstrated promising accuracy in predicting impingement, its type, and direction. 
While the addition of imaging data to our deep-learning algorithm did not boost accuracy, the potential for refined annotations, such as landmark markings, offers avenues for future enhancement. Prior to clinical integration,
Introduction. Debridement, antibiotics, irrigation, and implant retention (DAIR) is a common management strategy for hip and knee prosthetic joint infections (PJI). However, failure rates remain high, which has led to the development of predictive tools to help determine success. These tools include KLIC and CRIME80 for acute-postoperative (AP) and acute haematogenous (AH) PJI respectively. We investigated whether these tools were applicable to a Waikato cohort. Method. We performed a retrospective cohort study that evaluated patients who underwent DAIR between January 2010 and June 2020 at Waikato Hospital. Pre-operative KLIC and CRIME80 scores were calculated and compared to success of operation. Failure was defined as: (i) need for further surgery, (ii) need for suppressive antibiotics, (iii) death due to the infection. Logistic regression models were used to calculate the area under the curve (AUC). Results. 117 eligible patients underwent DAIR, 53 in the AP cohort and 64 in the AH cohort. Failure rate at 2 years post-op was 43% in the AP cohort and 59% in the AH cohort. In the AP cohort a KLIC score of <4 had a DAIR failure rate of 28.6%, while those who scored ≥4 had a failure rate of 72.2% (p=0.002). In the AH cohort a CRIME80 score of <3 had a DAIR failure rate of 48% while those who scored ≥3 had a 100% failure rate (p<0.001). Discussion. This study represents the first
To develop a reliable and effective radiological score to assess the healing of isolated ulnar shaft fractures (IUSF), the Radiographic Union Score for Ulna fractures (RUSU). Initially, 20 patients with radiographs six weeks following a non-operatively managed ulnar shaft fracture were selected and scored by three blinded observers. After intraclass correlation (ICC) analysis, a second group of 54 patients with radiographs six weeks after injury (18 who developed a nonunion and 36 who united) were scored by the same observers. In the initial study, interobserver and intraobserver ICC were 0.89 and 0.93, respectively. In the validation study the interobserver ICC was 0.85. The median score for patients who united was significantly higher than for those who developed a nonunion (11 vs 7, p<0.001). A ROC curve demonstrated that a RUSU ≤8 had a sensitivity of 88.9% and specificity of 86.1% in identifying patients at risk of nonunion. Patients with a RUSU ≤8 (n = 21) were more likely to develop a nonunion (n = 16/21) than those with a RUSU ≥9 (n = 2/33; OR 49.6, 95% CI 8.6–284.7). Based on a PPV of 76%, if all patients with a RUSU ≤8 underwent fixation at six weeks, the number of procedures needed to avoid one nonunion would be 1.3. The RUSU shows good interobserver and intraobserver reliability and is effective in identifying patients at risk of nonunion six weeks after fracture. This tool requires
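As a worked check of the RUSU figures above, the reported sensitivity, specificity, and PPV can be recomputed from the 2×2 counts implied by the abstract (16 of 21 patients with RUSU ≤8 and 2 of 33 with RUSU ≥9 developed nonunion). This is a minimal sketch, not the authors' analysis: the class name `TwoByTwo` is hypothetical, and reading "procedures needed to avoid one nonunion" as 1/PPV is an assumption that treats every fixed true positive as one nonunion avoided.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for a binary test (RUSU <= 8) against a binary outcome (nonunion)."""
    tp: int  # test positive, nonunion
    fp: int  # test positive, united
    fn: int  # test negative, nonunion
    tn: int  # test negative, united

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def procedures_per_nonunion_avoided(self) -> float:
        # Assumption: each true positive that is fixed counts as one nonunion avoided,
        # so the number of procedures per nonunion avoided is 1 / PPV.
        return 1 / self.ppv()


# Counts derived from the abstract: 21 patients with RUSU <= 8 (16 nonunion),
# 33 patients with RUSU >= 9 (2 nonunion).
rusu = TwoByTwo(tp=16, fp=5, fn=2, tn=31)

print(f"sensitivity = {rusu.sensitivity():.1%}")
print(f"specificity = {rusu.specificity():.1%}")
print(f"PPV         = {rusu.ppv():.1%}")
print(f"procedures per nonunion avoided = {rusu.procedures_per_nonunion_avoided():.1f}")
```

Under these assumptions the recomputed values agree with the figures quoted in the abstract after rounding.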
Numerous prediction tools are available for estimating postoperative risk following spine surgery.
Abstract. Introduction. Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA), a degenerative joint disease affecting over 300 million people worldwide. Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. Objectives. This study aims to create a trustworthy and interpretable precision health tool that predicts rapid knee OA progression based on baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. Methods. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising patients into non-progressors, pain-only progressors, radiographic-only progressors, and both pain and radiographic progressors) and binary predictions (categorising patients into non-progressors and progressors). Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances. Model performance was assessed using multiple evaluation metrics, such as AUC-ROC, AUC-PRC, F1-score, precision, and recall. Additionally, interpretability analyses were carried out to identify important predictors of rapid disease progression. Results. 
Our final models yielded high accuracy scores for both multi-class predictions (AUC-ROC: 0.858, 95% CI: 0.856–0.860; AUC-PRC: 0.675, 95% CI: 0.671–0.679; F1-score: 0.560, 95% CI: 0.554–0.566) and binary predictions (AUC-ROC: 0.717, 95% CI: 0.712–0.722; AUC-PRC: 0.620, 95% CI: 0.616–0.624; F1-score: 0.676, 95% CI: 0.673–0.679). Important predictors of rapid disease progression included the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) scores and MRI features. Our models were further successfully validated using a hold-out dataset, which was previously omitted from model development and training (AUC-ROC: 0.877 for multi-class predictions; AUC-ROC: 0.746 for binary predictions). Additionally, accurate ML models were developed for predicting OA progression in a subgroup of patients aged 65 or younger (AUC-ROC: 0.862, 95% CI: 0.861–0.863 for multi-class predictions; AUC-ROC: 0.736, 95% CI: 0.734–0.738 for binary predictions). Conclusions. This study presents a reliable and interpretable precision health tool for predicting rapid knee OA progression using “Autoprognosis 2.0”. Our models provide accurate predictions and offer insights into important predictors of rapid disease progression. Furthermore, the transparency and interpretability of our methods may facilitate their acceptance by clinicians and patients, enabling effective utilisation in clinical practice. Future work should focus on refining these models by increasing the sample size, integrating additional features, and using independent datasets for
Topic. Utilizing radiographic, physical exam and history findings, we developed a novel clinical score to aid in the surgical decision-making process for borderline/transitional dysplastic hips. Background. Treatment of borderline acetabular dysplasia (BD) is controversial, with some patients having primarily instability-based symptoms while others have impingement-based symptoms. The purpose of this study was to identify the most important patient characteristics influencing the diagnosis of instability vs. non-instability, to develop a clinical score (Borderline Hip Instability Score, BHIS) to collectively characterize these factors, and to externally validate BHIS in a multicenter cohort of BD patients. Methods. First, a retrospective cohort of 186 hips undergoing surgical treatment of BD (LCEA 20°-25°) from a single surgeon experienced in arthroscopic and open techniques was used. Multivariate analysis determined characteristics associated with presence of instability (PAO+/−hip arthroscopy) or absence of instability (isolated hip arthroscopy) based on clinical diagnosis. During the study period, 39.8% of the cohort underwent PAO. Multivariate analysis with bootstrapping was performed and results were transformed into a BHIS nomogram (higher score representing more instability). Then, BHIS was externally validated in 114 BD patients enrolled in a multicenter prospective cohort study across 10 surgeons (with varied treatment approaches from arthroscopy to open procedures). Results. The most parsimonious, best-fit model included 4 variables associated with the diagnosis of instability: acetabular inclination (AI), anterior center edge angle (ACEA), maximum alpha angle, and internal rotation in 90 degrees of flexion (IRF). Sex and LCEA were not significant predictors. Mean BHIS in the population was 50.0 (instability 57.7±7.9; non-instability 44.8±7.3, p<0.001). BHIS demonstrated excellent predictive (discriminatory) ability with c-statistic=0.89.
In Part 2, BHIS maintained excellent c-statistic=0.92 in
Aims. The primary aim of this study was to develop a reliable, effective radiological score to assess the healing of humeral shaft fractures, the Radiographic Union Score for HUmeral fractures (RUSHU). The secondary aim was to assess whether the six-week RUSHU was predictive of nonunion at six months after the injury. Patients and Methods. Initially, 20 patients with radiographs six weeks following a humeral shaft fracture were selected at random from a trauma database and scored by three observers, based on the Radiographic Union Scale for Tibial fractures system. After refinement of the RUSHU criteria, a second group of 60 patients with radiographs six weeks after injury, 40 with fractures that united and 20 with fractures that developed nonunion, were scored by two blinded observers. Results. After refinement, the interobserver intraclass correlation coefficient (ICC) was 0.79 (95% confidence interval (CI) 0.67 to 0.87), indicating substantial agreement. At six weeks after injury, patients whose fractures united had a significantly higher median score than those who developed nonunion (10 vs 7; p < 0.001). A receiver operating characteristic curve determined that a RUSHU cut-off of < 8 was predictive of nonunion (area under the curve = 0.84, 95% CI 0.74 to 0.94). The sensitivity was 75% and specificity 80% with a positive predictive value (PPV) of 65% and a negative predictive value of 86%. Patients with a RUSHU < 8 (n = 23) were more likely to develop nonunion than those with a RUSHU ≥ 8 (n = 37, odds ratio 12.0, 95% CI 3.4 to 42.9). Based on a PPV of 65%, if all patients with a RUSHU < 8 underwent fixation, the number of procedures needed to avoid one nonunion would be 1.5. Conclusion. The RUSHU is reliable and effective in identifying patients at risk of nonunion of a humeral shaft fracture at six weeks after injury. This tool requires
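The odds ratio and its interval quoted for the RUSHU cut-off can be illustrated with a short calculation. The sketch below is an assumption-laden check rather than the authors' method: the 2×2 counts (15 nonunions and 8 unions with RUSHU < 8; 5 nonunions and 32 unions with RUSHU ≥ 8) are reconstructed here from the reported sensitivity, specificity, and group sizes, the function name `odds_ratio_wald_ci` is hypothetical, and a Wald interval on the log odds ratio is assumed because the abstract does not state which interval was used.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    a: test positive with event, b: test positive without event,
    c: test negative with event, d: test negative without event.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Reconstructed counts (assumption): RUSHU < 8 -> 15 nonunion, 8 united;
# RUSHU >= 8 -> 5 nonunion, 32 united.
or_, lo, hi = odds_ratio_wald_ci(15, 8, 5, 32)
print(f"OR = {or_:.1f}, 95% CI {lo:.1f} to {hi:.1f}")
```

With these reconstructed counts the result rounds to the odds ratio and interval given in the abstract, which suggests, but does not confirm, that a comparable method was used.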
Over 300,000 total hip arthroplasties (THA) are performed annually in the USA. Surgical Site Infections (SSI) are one of the most common complications and are associated with increased morbidity, mortality and cost. Risk factors for SSI include obesity, diabetes and smoking, but few studies have reported on the predictive value of pre-operative blood markers for SSI. The purpose of this study was to create a clinical prediction model for acute SSI (classified as superficial, deep, or overall) within 30 days of THA based on commonly ordered pre-operative lab markers and using data from the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database. All adult patients undergoing an elective unilateral THA for osteoarthritis from 2011–2016 were identified from the NSQIP database using Current Procedural Terminology (CPT) codes. Patients with active or chronic, local or systemic infection/sepsis or disseminated cancer were excluded. Multivariate logistic regression was used to determine coefficients, with manual stepwise reduction. Receiver Operating Characteristic (ROC) curves were also graphed. The SSI prediction model included the following covariates: body mass index (BMI) and sex; comorbidities such as congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), smoking, and current/previous steroid use; and the pre-operative blood markers albumin, alkaline phosphatase, blood urea nitrogen (BUN), creatinine, hematocrit, international normalized ratio (INR), platelets, prothrombin time (PT), sodium and white blood cell (WBC) levels. Since the data met logistic assumption requirements, bootstrap estimation was used to measure internal validity. The area under the ROC curve for final derivations along with McFadden's R-squared were utilized to compare prediction models.
A total of 130,619 patients were included; the median age at time of THA was 67 years (mean = 66.6 ± 11.6 years), and 44.8% (n=58,757) were male. A total of 1,561 (1.20%) patients had a superficial or deep SSI (overall SSI). Of all SSI, 45.1% (n=704) had a deep SSI and 55.4% (n=865) had a superficial SSI. The annual incidence of SSI decreased from 1.44% in 2011 to 1.16% in 2016. Area under the ROC curve for the SSI prediction model was 0.79 and 0.78 for deep and superficial SSI, respectively, and 0.71 for overall SSI. CHF had the largest effect size (Odds Ratio (OR)=2.88, 95% Confidence Interval (95% CI): 1.56 – 5.32) for overall SSI risk. Albumin (OR=0.44, 95% CI: 0.37 – 0.52, OR=0.31, 95% CI: 0.25 – 0.39, OR=0.48, 95% CI: 0.41 – 0.58) and sodium (OR=0.95, 95% CI: 0.93 – 0.97, OR=0.94, 95% CI: 0.91 – 0.97, OR=0.95, 95% CI: 0.93 – 0.98) levels were consistently significant in all clinical prediction models for superficial, deep and overall SSI, respectively. In terms of pre-operative blood markers, hypoalbuminemia and hyponatremia are both significant risk factors for superficial, deep and overall SSI. In this large NSQIP database study, we were able to create an SSI prediction model and identify risk factors for predicting acute superficial, deep and overall SSI after THA. To our knowledge, this is the first clinical model in which pre-operative hyponatremia (in addition to hypoalbuminemia) has been predictive of SSI after THA. Although the model remains without
Introduction: It has been shown that many classical hip parameters for radiographic quantification of hip morphology on anteroposterior (AP) pelvic radiographs vary significantly with individual pelvic tilt and rotation. This has been shown not only for classical hip parameters (e.g. the lateral centre edge angle) but also for more recently described radiographic features such as acetabular retroversion. The resulting misdiagnosis and misinterpretation can potentially impair correct therapy for the patient. We developed fast and easy-to-use computer software to perform three-dimensional (3D) analysis of the individual hip joint morphology using two-dimensional (2D) AP pelvic radiographs. Landmarks extracted from the radiograph were combined with a cone beam x-ray projection model and a strong lateral pelvic radiograph to reconstruct 3D hip joints. Twenty-five parameters including quantification of femoral head coverage can be calculated for a neutral orientation. The aim of the study was to evaluate the validity of this method for tilt and rotation correction of the acetabular rim and associated radiographic parameters. Methods: The validation comprised three steps: