The Bone & Joint Journal
Vol. 102-B, Issue 8 | Pages 1041 - 1047
1 Aug 2020
Hamoodi Z Singh J Elvey MH Watts AC

Aims. The Wrightington classification system of fracture-dislocations of the elbow divides these injuries into six subtypes depending on the involvement of the coronoid and the radial head. The aim of this study was to assess the reliability and reproducibility of this classification system. Methods. This was a blinded study using radiographs and CT scans of 48 consecutive patients managed according to the Wrightington classification system between 2010 and 2018. Four trauma and orthopaedic consultants, two post-CCT fellows, and one speciality registrar based in the UK classified the injuries. The seven observers reviewed preoperative radiographs and CT scans twice, with a minimum four-week interval. Radiographs and CT scans were reviewed separately. Inter- and intraobserver reliability were calculated using Fleiss and Cohen kappa coefficients. The Landis and Koch criteria were used to interpret the strength of the kappa values. Validity was assessed by calculating the percentage agreement against intraoperative findings. Results. Of the 48 patients, three (6%) had type A injury, 11 (23%) type B, 16 (33%) type B+, 16 (33%) type C, two (4%) type D+, and none had a type D injury. All 48 patients had anteroposterior (AP) and lateral radiographs, 44 had 2D CT scans, and 39 had 3D reconstructions. The interobserver reliability kappa value was 0.52 for radiographs, 0.71 for 2D CT scans, and 0.73 for a combination of 2D and 3D reconstruction CT scans. The median intraobserver reliability was 0.75 (interquartile range (IQR) 0.62 to 0.79) for radiographs, 0.77 (IQR 0.73 to 0.94) for 2D CT scans, and 0.89 (IQR 0.77 to 0.93) for the combination of 2D and 3D reconstruction. Validity analysis showed that accuracy significantly improved when using CT scans (p = 0.018 and p = 0.028 respectively). Conclusion. The Wrightington classification system is a reliable and valid method of classifying fracture-dislocations of the elbow. CT scans are significantly more accurate than radiographs when identifying the pattern of injury, with good intra- and interobserver reproducibility. Cite this article: Bone Joint J 2020;102-B(8):1041–1047
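
As a rough illustration of two of the statistics named in the abstract above, the following Python sketch computes Cohen's kappa for a pair of ratings and labels the result with the Landis and Koch bands. The function names and the example ratings are hypothetical and are not data from the study; the study's own software and settings are not stated in the abstract.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of cases rated identically.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rating set's marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both sets used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch strength-of-agreement label."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical example: one observer classifying six injuries on two occasions.
first_read = ["A", "B", "B", "C", "C", "B+"]
second_read = ["A", "B", "C", "C", "C", "B+"]
kappa = cohen_kappa(first_read, second_read)
print(round(kappa, 3), landis_koch(kappa))
```

Under this labelling, the interobserver values reported above (0.52 for radiographs, 0.71 and 0.73 for CT) fall in the moderate and substantial bands respectively.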


The Journal of Bone & Joint Surgery British Volume
Vol. 94-B, Issue 1 | Pages 32 - 36
1 Jan 2012
Nho J Lee Y Kim HJ Ha Y Suh Y Koo K

A variety of radiological methods of measuring version of the acetabular component after total hip replacement (THR) have been described. The aim of this study was to evaluate the reliability and validity of six methods (those of Lewinnek; Widmer; Hassan et al; Ackland, Bourne and Uhthoff; Liaw et al; and Woo and Morrey) that are currently in use. In 36 consecutive patients who underwent THR, version of the acetabular component was measured by three independent examiners on plain radiographs using these six methods and compared with measurements using CT scans. The intra- and interobserver reliabilities of each measurement were estimated. All measurements on both radiographs and CT scans had excellent intra- and interobserver reliability and the results from each of the six methods correlated well with the CT measurements. However, measurements made using the methods of Widmer and of Ackland, Bourne and Uhthoff were significantly different from the CT measurements (both p < 0.001), whereas measurements made using the remaining four methods were similar to the CT measurements. With regard to reliability and convergent validity, we recommend the use of the methods described by Lewinnek, Hassan et al, Liaw et al and Woo and Morrey for measurement of version of the acetabular component


The Bone & Joint Journal
Vol. 102-B, Issue 4 | Pages 478 - 484
1 Apr 2020
Daniels AM Wyers CE Janzing HMJ Sassen S Loeffen D Kaarsemaker S van Rietbergen B Hannemann PFW Poeze M van den Bergh JP

Aims

Besides conventional radiographs, the use of MRI, CT, and bone scintigraphy is frequent in the diagnosis of a fracture of the scaphoid. However, which techniques give the best results remains unknown. The investigation of a new imaging technique initially requires an analysis of its precision. The primary aim of this study was to investigate the interobserver agreement of high-resolution peripheral quantitative CT (HR-pQCT) in the diagnosis of a scaphoid fracture. A secondary aim was to investigate the interobserver agreement for the presence of other fractures and for the classification of scaphoid fracture.

Methods

Two radiologists and two orthopaedic trauma surgeons evaluated HR-pQCT scans of 31 patients with a clinically-suspected scaphoid fracture. The observers were asked to determine the presence of a scaphoid or other fracture and to classify the scaphoid fracture based on the Herbert classification system. Fleiss kappa statistics were used to calculate the interobserver agreement for the diagnosis of a fracture. Intraclass correlation coefficients (ICCs) were used to assess the agreement for the classification of scaphoid fracture.


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1339 - 1344
1 Aug 2021
Jain S Mohrir G Townsend O Lamb JN Palan J Aderinto J Pandit H

Aims. The aim of this study was to assess the reliability and validity of the Unified Classification System (UCS) for postoperative periprosthetic femoral fractures (PFFs) around cemented polished taper-slip (PTS) stems. Methods. Radiographs of 71 patients with a PFF admitted consecutively at two centres between 25 February 2012 and 19 May 2020 were collated by an independent investigator. Six observers (three hip consultants and three trainees) were familiarized with the UCS. Each PFF was classified on two separate occasions, with a mean time between assessments of 22.7 days (16 to 29). Interobserver reliability for more than two observers was assessed using percentage agreement and Fleiss’ kappa statistic. Intraobserver reliability between two observers was calculated with the Cohen kappa statistic. Validity was tested on surgically managed UCS type B PFFs where stem stability was documented in operation notes (n = 50). Validity was assessed using percentage agreement and the Cohen kappa statistic between radiological assessment and intraoperative findings. Kappa statistics were interpreted using Landis and Koch criteria. All six observers were blinded to operation notes and postoperative radiographs. Results. Interobserver reliability percentage agreement was 58.5% and the overall kappa value was 0.442 (moderate agreement). Lowest kappa values were seen for type B fractures (0.095 to 0.360). The mean intraobserver reliability kappa value was 0.672 (0.447 to 0.867), indicating substantial agreement. Validity percentage agreement was 65.7% and the mean kappa value was 0.300 (0.160 to 0.440), indicating only fair agreement. Conclusion. This study demonstrates that the UCS is unsatisfactory for the classification of PFFs around PTS stems, and that it has considerably lower reliability and validity than previously described for other stem types. Radiological PTS stem loosening in the presence of PFF is poorly defined and formal intraoperative testing of stem stability is recommended. Cite this article: Bone Joint J 2021;103-B(8):1339–1344
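
For readers unfamiliar with the multi-observer statistic mentioned in the abstract above, here is a minimal Python sketch of Fleiss' kappa. It assumes every case is rated by the same number of observers and takes a table of per-case category counts; the function name and the toy counts are hypothetical and unrelated to the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of observers who assigned case i to
    category j. Every row must sum to the same number of observers.
    """
    if not counts:
        raise ValueError("counts must contain at least one case")
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two observers are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_category = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Agreement within each case: proportion of agreeing observer pairs.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        return 1.0  # all observers used a single category for every case
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: four cases, three observers, two categories.
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2], [2, 1], [1, 2]]
print(fleiss_kappa(perfect))
print(round(fleiss_kappa(split), 3))
```

In the second toy table the observers agree less often than chance would predict, so the statistic is negative.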


The Journal of Bone & Joint Surgery British Volume
Vol. 92-B, Issue 4 | Pages 571 - 575
1 Apr 2010
Clint SA Morris TP Shaw OM Oddy MJ Rudge B Barry M

The databases of the Picture Archiving and Communication Systems of two hospitals were searched and all children who had a lateral radiograph of the ankle during their attendance at the emergency department were identified. In 227 radiographs, Bohler’s and Gissane’s angles were measured on two separate occasions and by two separate authors to allow calculation of inter- and intra-observer variation. Intraclass correlation coefficients were used to assess the reliability of the measurements. For Bohler’s angle, the overall inter-observer intraclass correlation coefficient was 0.90 and the intra-observer coefficient was 0.95, indicating excellent agreement. This reliability was maintained across the age groups. For Gissane’s angle, inter- and intra-observer reliability was only fair or poor across most age groups. Further analysis of the Bohler’s angle showed a significant variation in the mean angle with age. Contrary to published opinion, the angle is not uniformly lower than that of adults but varies with age, peaking towards the end of the first decade before attaining adult values. The age-related radiologic changes presented here may help in the interpretation of injuries to the hindfoot in children.


The Journal of Bone & Joint Surgery British Volume
Vol. 88-B, Issue 8 | Pages 1048 - 1052
1 Aug 2006
Jerosch-Herold C Rosén B Shepstone L

Locognosia, the ability to localise touch, is one aspect of tactile spatial discrimination which relies on the integrity of peripheral end-organs as well as the somatosensory representation of the surface of the body in the brain. The test presented here is a standardised assessment which uses a protocol for testing locognosia in the zones of the hand supplied by the median and/or ulnar nerves. The test-retest reliability and discriminant validity were investigated in 39 patients with injuries to the median or ulnar nerve. Intraclass correlation coefficients were used to calculate the test-retest reliability. Discriminant validity was assessed by comparing the injured with the unaffected hand. Excellent test-retest reliability was demonstrated for the injuries to the median (intraclass correlation coefficient 0.924, 95% confidence interval 0.848 to 1.00) and the ulnar nerves (intraclass correlation coefficient 0.859, 95% confidence interval 0.693 to 1.00). The magnitude of the difference in scores between affected and unaffected hands showed good discriminant validity. For injuries to the median nerve the mean difference was 11.1 points (1 to 33; SD 7.4), which was statistically significant (p < 0.0001, paired t-test) and for those of the ulnar nerve it was 4.75 points (1 to 13.5; SD 3.16), which was also statistically significant (paired t-test, p < 0.0001). The locognosia test has excellent test-retest reliability, is a valid test of tactile spatial discrimination and should be included in the evaluation of outcome after injury to peripheral nerves.


The Journal of Bone & Joint Surgery British Volume
Vol. 91-B, Issue 7 | Pages 903 - 906
1 Jul 2009
Trickett RW Hodgson P Forster MC Robertson A

We aimed to determine the reliability, accuracy and clinical role of digital templating in the pre-operative work-up for total knee replacement. Initially a sample of ten pre-operative digital radiographs was templated by four independent observers to determine the inter- and intra-observer reliability of the process. Digital templating was then performed on the radiographs of 40 consecutive patients undergoing total knee replacement by a consultant surgeon not involved with the operation, who was blinded to the size of the implant inserted. The Press Fit Condylar Sigma Knee system was used in all the patients. The size of the implant as judged by templating was then compared with the size used. Good inter- and intra-observer agreement was demonstrated for both femoral and tibial templating. However, the correct size of the implant was predicted in only 48% of the femoral and 55% of the tibial components. Albeit reproducible, digital templating does not currently predict the correct size of component often enough to be of clinical benefit.


The Journal of Bone & Joint Surgery British Volume
Vol. 80-B, Issue 4 | Pages 670 - 672
1 Jul 1998
Flinkkilä T Nikkola-Sihto A Kaarela O Päakkö E Raatikainen T

Interobserver reliability of the AO system of classification of fractures of the distal radius was assessed using plain radiographs and CT. Five observers classified 30 Colles’-type fractures using only plain radiographs; two months later they were reclassified using CT in addition. Interobserver reliability was poor in both series when detailed classification was used. By reducing the categories to five, interobserver reliability was slightly improved, but was still poor. When only two AO types were used, the reliability was moderate using plain radiographs and good to excellent with the addition of CT. The use of CT as well as plain radiographs brings interobserver reliability to a good level in assessment of the presence or absence of articular involvement, but is otherwise of minor value in improving the interobserver reliability of the AO system of classification of fractures of the distal radius


The Bone & Joint Journal
Vol. 96-B, Issue 12 | Pages 1669 - 1673
1 Dec 2014
Van der Merwe JM Haddad FS Duncan CP

The Unified Classification System (UCS) was introduced because of a growing need to have a standardised universal classification system of periprosthetic fractures. It combines and simplifies many existing classification systems, and can be applied to any fracture around any partial or total joint replacement occurring during or after operation. Our goal was to assess the inter- and intra-observer reliability of the UCS in association with knee replacement when classifying fractures affecting one or more of the femur, tibia or patella. We used an international panel of ten orthopaedic surgeons with subspecialty fellowship training and expertise in adult hip and knee reconstruction (‘experts’) and ten residents of orthopaedic surgery in the last two years of training (‘pre-experts’). They each received 15 radiographs for evaluation. After six weeks they evaluated the same radiographs again but in a different order. The reliability was assessed using the Kappa and weighted Kappa values. The Kappa values for inter-observer reliability for the experts and the pre-experts were 0.741 (95% confidence interval (CI) 0.707 to 0.774) and 0.765 (95% CI 0.733 to 0.797), respectively. The weighted Kappa values for intra-observer reliability for the experts and pre-experts were 0.898 (95% CI 0.846 to 0.950) and 0.878 (95% CI 0.815 to 0.942) respectively. The UCS has substantial inter-observer reliability and ‘near perfect’ intra-observer reliability when used for periprosthetic fractures in association with knee replacement in the hands of experienced and inexperienced users. Cite this article: Bone Joint J 2014;96-B:1669–73


The Bone & Joint Journal
Vol. 98-B, Issue 2 | Pages 166 - 172
1 Feb 2016
Langlois J Hamadouche M

Previous standards for assessing the reliability of a measurement tool have lacked consistency. We reviewed the most current American Society for Testing and Materials and International Organisation for Standardisation (ISO) recommendations, and propose an algorithm for orthopaedic surgeons. When assessing a measurement tool, conditions of the experimental set-up and clear formulae used to compile the results should be strictly reported. According to these recent guidelines, accuracy is a confusing word with an overly broad meaning and should therefore be abandoned. Depending on the experimental conditions, one should be referring to bias (when the study protocol involves accepted reference values), and repeatability (sr, r) or reproducibility (SR, R). In the absence of accepted reference values, only repeatability (sr, r) or reproducibility (SR, R) should be provided. Take home message: Assessing the reliability of a measurement tool involves reporting bias, repeatability and/or reproducibility depending on the defined conditions, instead of precision or accuracy. Cite this article: Bone Joint J 2016;98-B2:166–72


The Bone & Joint Journal
Vol. 97-B, Issue 5 | Pages 611 - 616
1 May 2015
Shin WC Lee SM Lee KW Cho HJ Lee JS Suh KT

There is no single standardised method of measuring the orientation of the acetabular component on plain radiographs after total hip arthroplasty. We assessed the reliability and accuracy of three methods of assessing anteversion of the acetabular component for 551 THAs using the PolyWare software and the methods of Liaw et al, and of Woo and Morrey. All measurements of the three methods had excellent intra- and inter-observer reliability. The values of the PolyWare software, which determines version of the acetabular component by edge detection were regarded as the reference standard. Although the PolyWare software and the method of Liaw et al were similarly precise, the method of Woo and Morrey was significantly less accurate (p < 0.001). The method of Liaw et al seemed to be more accurate than that of Woo and Morrey when compared with the measurements using the PolyWare software. If the qualified lateral radiograph was selected, anteversion measured using the method of Woo and Morrey was considered to be relatively reliable. Cite this article: Bone Joint J 2015; 97-B:611–16


The Journal of Bone & Joint Surgery British Volume
Vol. 88-B, Issue 9 | Pages 1204 - 1206
1 Sep 2006
Malek IA Machani B Mevcha AM Hyder NH

Our aim was to assess the reproducibility and the reliability of the Weber classification system for fractures of the ankle based on anteroposterior and lateral radiographs. Five observers with varying clinical experience reviewed 50 sets of blinded radiographs. The same observers reviewed the same radiographs again after an interval of four weeks. Inter- and intra-observer agreement was assessed based on the proportion of agreement and the values of the kappa coefficient. For inter-observer agreement, the mean kappa value was 0.61 (0.59 to 0.63) and the proportion of agreement was 78% (76% to 79%) and for intra-observer agreement the mean kappa value was 0.74 (0.39 to 0.86) with an 85% (60% to 93%) observed agreement. These results show that the Weber classification of fractures of the ankle based on two radiological views has substantial inter-observer reliability and intra-observer reproducibility


The Journal of Bone & Joint Surgery British Volume
Vol. 91-B, Issue 6 | Pages 766 - 771
1 Jun 2009
Brunner A Honigmann P Treumann T Babst R

We evaluated the impact of stereo-visualisation of three-dimensional volume-rendering CT datasets on the inter- and intraobserver reliability assessed by kappa values on the AO/OTA and Neer classifications in the assessment of proximal humeral fractures. Four independent observers classified 40 fractures according to the AO/OTA and Neer classifications using plain radiographs, two-dimensional CT scans and with stereo-visualised three-dimensional volume-rendering reconstructions. Both classification systems showed moderate interobserver reliability with plain radiographs and two-dimensional CT scans. Three-dimensional volume-rendered CT scans improved the interobserver reliability of both systems to good. Intraobserver reliability was moderate for both classifications when assessed by plain radiographs. Stereo visualisation of three-dimensional volume rendering improved intraobserver reliability to good for the AO/OTA method and to excellent for the Neer classification. These data support our opinion that stereo visualisation of three-dimensional volume-rendering datasets is of value when analysing and classifying complex fractures of the proximal humerus


The Journal of Bone & Joint Surgery British Volume
Vol. 84-B, Issue 1 | Pages 42 - 47
1 Jan 2002
Brismar BH Wredmark T Movin T Leandersson J Svensson O

We studied 19 videotaped knee arthroscopies in 19 patients with mild to moderate osteoarthritis (OA) of the knee in order to compare the intraobserver and interobserver reliability and the patterns of disagreement between four orthopaedic surgeons. The classifications of OA of Collins, Outerbridge and the French Society of Arthroscopy were used. Intraobserver and interobserver agreements using kappa measures were 0.42 to 0.66 and 0.43 to 0.49, respectively. Only 6% to 8% of paired intraobserver classifications differed by more than one category. Observer-specific disagreement was evident both within and between observers. A small, but significant, occasional variation was also seen. Although reliability may improve by an analysis of disagreement, it appears that the arthroscopic grading of early osteoarthritic lesions is inexact


The Bone & Joint Journal
Vol. 100-B, Issue 2 | Pages 242 - 246
1 Feb 2018
Ghoshal A Enninghorst N Sisak K Balogh ZJ

Aims. To evaluate interobserver reliability of the Orthopaedic Trauma Association’s open fracture classification system (OTA-OFC). Patients and Methods. Patients of any age with a first presentation of an open long bone fracture were included. Standard radiographs, wound photographs, and a short clinical description were given to eight orthopaedic surgeons, who independently evaluated the injury using both the Gustilo and Anderson (GA) and OTA-OFC classifications. The responses were compared for variability using Cohen’s kappa. Results. The overall interobserver agreement was κ = 0.44 for the GA classification and κ = 0.49 for OTA-OFC, which reflects moderate agreement (0.41 to 0.60) for both classifications. The agreement in the five categories of OTA-OFC was: for skin, κ = 0.55 (moderate); for muscle, κ = 0.44 (moderate); for arterial injury, κ = 0.74 (substantial); for contamination, κ = 0.35 (fair); and for bone loss, κ = 0.41 (moderate). Conclusion. Although the OTA-OFC, with similar interobserver agreement to GA, offers a more detailed description of open fractures, further development may be needed to make it a reliable and robust tool. Cite this article: Bone Joint J 2018;100-B:242–6


The Bone & Joint Journal
Vol. 96-B, Issue 5 | Pages 597 - 603
1 May 2014
Nomura T Naito M Nakamura Y Ida T Kuroda D Kobayashi T Sakamoto T Seo H

Several radiological methods of measuring anteversion of the acetabular component after total hip replacement (THR) have been described. These studies used different definitions and reference planes to compare methods, allowing for misinterpretation of the results. We compared the reliability and accuracy of five current methods using plain radiographs (those of Lewinnek, Widmer, Liaw, Pradhan, and Woo and Morrey) with CT measurements, using the same definition and reference plane. We retrospectively studied the plain radiographs and CT scans in 84 hips of 84 patients who underwent primary THR. Intra- and inter-observer reliability were high for the measurement of inclination and anteversion with all methods on plain radiographs and CT scans. The measurements of inclination on plain radiographs were similar to the measurements using CT (p = 0.043). The mean difference between CT measurements was 0.6° (-5.9° to 6.8°). Measurements using Widmer’s method were the most similar to those using CT (p = 0.088), with a mean difference between CT measurements of -0.9° (-10.4° to 9.1°), whereas the other four methods differed significantly from those using CT (p < 0.001). This study has shown that Widmer’s method is the best for evaluating the anteversion of the acetabular component on plain radiographs. Cite this article: Bone Joint J 2014; 96-B:597–603


The Journal of Bone & Joint Surgery British Volume
Vol. 75-B, Issue 3 | Pages 479 - 482
1 May 1993
Dias J Thomas I Lamont A Mody B Thompson

Ultrasound scans were made of the hips of 209 neonates born consecutively over a two-week period. Of the 418 scans, 62 images were selected at random and 25 of these were duplicated to give a total of 87 scans. These static images were then presented to five experienced observers who each made nine different assessments and measurements. Interobserver and intraobserver agreement was calculated and expressed as kappa values. Our results showed poor reliability on both counts.


The Journal of Bone & Joint Surgery British Volume
Vol. 79-B, Issue 4 | Pages 570 - 575
1 Jul 1997
Boniforti FG Fujii G Angliss RD Benson MKD

We have evaluated the reliability of the measurement of radiological indicators in developmental dysplasia of the hip. Three observers each independently assessed 60 pelvic radiographs from infants aged from 3 to 36 months. Errors from the true value of a single measurement made by a single observer (E1), of the average of two measurements by a single observer (E2), and of the average of two single measurements by two different observers (E3) were established for the acetabular index of Hilgenreiner, for the assessment of superior and lateral femoral displacement and for indicators of pelvic alignment. The errors for the assessment of the acetabular index were E1 ± 5°, E2 ± 5°, and E3 ± 3.5°. There was a significant correlation between the presence of an acetabular notch on the radiograph and an increased error in measurement (p = 0.01). Yamamuro’s measurement of lateral femoral displacement was more reliable than the Hilgenreiner distance. The errors of indicators of pelvic alignment showed a correlation with the age of the infant; the quotient of pelvic rotation was more reliable after seven months of age (p < 0.0001). The errors of the measurement of the symphysis os-ischium angle tended to increase with age and those of the measurement of the index of pelvic tilt decreased with skeletal maturation (p = 0.002)


The Journal of Bone & Joint Surgery British Volume
Vol. 68-B, Issue 4 | Pages 614 - 615
1 Aug 1986
Christensen F Soballe K Ejsted R Luxhoj T

The reliability of the Catterall grouping of Perthes' disease was examined by determining the agreement between pairs of observers using weighted kappa statistics. Anteroposterior and lateral radiographs of 100 hip joints were grouped independently by four experienced observers. There was a low, and in our opinion, unacceptable degree of inter-observer agreement even when Groups 2 and 3 were combined


The Journal of Bone & Joint Surgery British Volume
Vol. 74-B, Issue 2 | Pages 287 - 291
1 Mar 1992
Wright J Feinstein A


The Journal of Bone & Joint Surgery British Volume
Vol. 72-B, Issue 5 | Pages 924 - 924
1 Sep 1990
Asirvatham R Watts H Ware B Rooney R


The Journal of Bone & Joint Surgery British Volume
Vol. 83-B, Issue 5 | Pages 775 - 777
1 Jul 2001
Rushton N


The Journal of Bone & Joint Surgery British Volume
Vol. 85-B, Issue 3 | Pages 463 - 464
1 Apr 2003
MENCHE DS


The Journal of Bone & Joint Surgery British Volume
Vol. 71-B, Issue 1 | Pages 6 - 8
1 Jan 1989
Broughton N Brougham D Cole W Menelaus M

We investigated the reproducibility of the various radiological methods of assessment of hip dysplasia by making 474 assessments of hips and quantifying the inter-observer and intra-observer variation. There was a wide range of variability between the readings made by different observers and by one observer on two occasions. A measurement of acetabular index has to be given a range of +/- 6 degrees in order to be 95% confident of including the true measurement. We found the most helpful measurements to be the acetabular index, up to the age of eight years; the centre-edge angle, over the age of five years; and Smith's c/b ratio and neck-shaft angle. We feel, however, that the change in value over a series of radiographs in the same child is much more valuable. Single readings of all the radiological measurements investigated in this study were unreliable.


The Bone & Joint Journal
Vol. 97-B, Issue 8 | Pages 1139 - 1143
1 Aug 2015
Hutt JRB Ortega-Briones A Daurka JS Bircher MD Rickman MS

The most widely used classification system for acetabular fractures was developed by Judet, Judet and Letournel over 50 years ago primarily to aid surgical planning. As population demographics and injury mechanisms have altered over time, the fracture patterns also appear to be changing. We conducted a retrospective review of the imaging of 100 patients with a mean age of 54.9 years (19 to 94) and a male to female ratio of 69:31 seen between 2010 and 2013 with acetabular fractures in order to determine whether the current spectrum of injury patterns can be reliably classified using the original system.

Three consultant pelvic and acetabular surgeons and one senior fellow analysed anonymous imaging. Inter-observer agreement for the classification of fractures that fitted into defined categories was substantial, (κ = 0.65, 95% confidence interval (CI) 0.51 to 0.76) with improvement to near perfect on inclusion of CT imaging (κ = 0.80, 95% CI 0.69 to 0.91). However, a high proportion of injuries (46%) were felt to be unclassifiable by more than one surgeon; there was moderate agreement on which these were (κ = 0.42 95% CI 0.31 to 0.54).

Further review of the unclassifiable fractures in this cohort of 100 patients showed that they tended to occur in an older population (mean age 59.1 years; 22 to 94 vs 47.2 years; 19 to 94; p = 0.003) and within this group, there was a recurring pattern of anterior column and quadrilateral plate involvement, with or without an incomplete posterior element injury.

Cite this article: Bone Joint J 2015;97-B:1139–43.


The Journal of Bone & Joint Surgery British Volume
Vol. 94-B, Issue 11 | Pages 1522 - 1528
1 Nov 2012
Wallander H Saebö M Jonsson K Bjönness T Hansson G

We investigated 60 patients (89 feet) with a mean age of 64 years (61 to 67) treated for congenital clubfoot deformity, using standardised weight-bearing radiographs of both feet and ankles together with a functional evaluation. Talocalcaneal and talonavicular relationships were measured and the degree of osteo-arthritic change in the ankle and talonavicular joints was assessed. The functional results were evaluated using a modified Laaveg-Ponseti score. The talocalcaneal (TC) angles in the clubfeet were significantly lower in both anteroposterior (AP) and lateral projections than in the unaffected feet (p < 0.001 for both views). There was significant medial subluxation of the navicular in the clubfeet compared with the unaffected feet (p < 0.001). Severe osteoarthritis in the ankle joint was seen in seven feet (8%) and in the talonavicular joint in 11 feet (12%). The functional result was excellent or good (≥ 80 points) in 29 patients (48%), and fair or poor (< 80 points) in 31 patients (52%). Patients who had undergone few (0 to 1) surgical procedures had better functional outcomes than those who had undergone two or more procedures (p < 0.001). There was a significant correlation between the functional result and the degree of medial subluxation of the navicular (p < 0.001, r2 = 0.164), the talocalcaneal angle on AP projection (p < 0.02, r2 = 0.025) and extent of osteoarthritis in the ankle joint (p < 0.001).

We conclude that poor functional outcome in patients with congenital clubfoot occurs more frequently in those with medial displacement of the navicular, osteoarthritis of the talonavicular and ankle joints, and a low talocalcaneal angle on the AP projection, and in patients who have undergone two or more surgical procedures. However, the ankle joint in these patients appeared relatively resistant to the development of osteoarthritis.


The Bone & Joint Journal
Vol. 105-B, Issue 1 | Pages 56 - 63
1 Jan 2023
de Klerk HH Oosterhoff JHF Schoolmeesters B Nieboer P Eygendaal D Jaarsma RL IJpma FFA van den Bekerom MPJ Doornberg JN

Aims. This study aimed to answer the following questions: do 3D-printed models lead to a more accurate recognition of the pattern of complex fractures of the elbow?; do 3D-printed models lead to a more reliable recognition of the pattern of these injuries?; and do junior surgeons benefit more from 3D-printed models than senior surgeons? Methods. A total of 15 orthopaedic trauma surgeons (seven juniors, eight seniors) evaluated 20 complex elbow fractures for their overall pattern (i.e. varus posterior medial rotational injury, terrible triad injury, radial head fracture with posterolateral dislocation, anterior (trans-)olecranon fracture-dislocation, posterior (trans-)olecranon fracture-dislocation) and their specific characteristics. First, fractures were assessed based on radiographs and 2D and 3D CT scans; and in a subsequent round, one month later, with additional 3D-printed models. Diagnostic accuracy (acc) and inter-surgeon reliability (κ) were determined for each assessment. Results. Accuracy significantly improved with 3D-printed models for the whole group on pattern recognition (acc_2D/3D = 0.62 vs acc_3Dprint = 0.69; Δacc = 0.07 (95% confidence interval (CI) 0.00 to 0.14); p = 0.025). A significant improvement was also seen in reliability for pattern recognition with the additional 3D-printed models (κ_2D/3D = 0.41 (moderate) vs κ_3Dprint = 0.59 (moderate); Δκ = 0.18 (95% CI 0.14 to 0.22); p ≤ 0.001). Accuracy was comparable between junior and senior surgeons with the 3D-printed model (acc_junior = 0.70 vs acc_senior = 0.68; Δacc = -0.02 (95% CI -0.17 to 0.13); p = 0.904). Reliability was also comparable between junior and senior surgeons without the 3D-printed model (κ_junior = 0.39 (fair) vs κ_senior = 0.43 (moderate); Δκ = 0.03 (95% CI -0.03 to 0.10); p = 0.318). However, junior surgeons showed greater improvement regarding reliability than seniors with 3D-printed models (κ_junior = 0.65 (substantial) vs κ_senior = 0.54 (moderate); Δκ = 0.11 (95% CI 0.04 to 0.18); p = 0.002). Conclusion. The use of 3D-printed models significantly improved the accuracy and reliability of recognizing the pattern of complex fractures of the elbow. However, the current long printing time and non-reusable materials could limit the usefulness of 3D-printed models in clinical practice. They could be suitable as a reusable tool for teaching residents. Cite this article: Bone Joint J 2023;105-B(1):56–63


The Bone & Joint Journal
Vol. 106-B, Issue 1 | Pages 19 - 27
1 Jan 2024
Tang H Guo S Ma Z Wang S Zhou Y

Aims. The aim of this study was to evaluate the reliability and validity of a patient-specific algorithm which we developed for predicting changes in sagittal pelvic tilt after total hip arthroplasty (THA). Methods. This retrospective study included 143 patients who underwent 171 THAs between April 2019 and October 2020 and had full-body lateral radiographs preoperatively and at one year postoperatively. We measured the pelvic incidence (PI), the sagittal vertical axis (SVA), pelvic tilt, sacral slope (SS), lumbar lordosis (LL), and thoracic kyphosis to classify patients into types A, B1, B2, B3, and C. The change of pelvic tilt was predicted according to the normal range of SVA (0 mm to 50 mm) for types A, B1, B2, and B3, and based on the absolute value of one-third of the PI-LL mismatch for type C patients. The reliability of the classification of the patients and the prediction of the change of pelvic tilt were assessed using kappa values and intraclass correlation coefficients (ICCs), respectively. Validity was assessed using the overall mean error and mean absolute error (MAE) for the prediction of the change of pelvic tilt. Results. The kappa values were 0.927 (95% confidence interval (CI) 0.861 to 0.992) and 0.945 (95% CI 0.903 to 0.988) for the inter- and intraobserver reliabilities, respectively, and the ICCs ranged from 0.919 to 0.997. The overall mean error and MAE for the prediction of the change of pelvic tilt were -0.3° (SD 3.6°) and 2.8° (SD 2.4°), respectively. The overall absolute change of pelvic tilt was 5.0° (SD 4.1°). Pre- and postoperative values and changes in pelvic tilt, SVA, SS, and LL varied significantly among the five types of patient. Conclusion. We found that the proposed algorithm was reliable and valid for predicting the standing pelvic tilt after THA. Cite this article: Bone Joint J 2024;106-B(1):19–27
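The two validity measures named in this abstract, the overall mean error and the mean absolute error (MAE), can be illustrated with a minimal sketch. The sign convention (predicted minus observed), the function name, and the sample values are assumptions made for illustration; they are not taken from the study.

```python
from statistics import mean, stdev


def prediction_errors(predicted, observed):
    """Return (mean error, SD of error, MAE, SD of absolute error) in degrees.

    Assumption: error is defined as predicted minus observed change in pelvic tilt.
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    errors = [p - o for p, o in zip(predicted, observed)]
    abs_errors = [abs(e) for e in errors]
    return mean(errors), stdev(errors), mean(abs_errors), stdev(abs_errors)


# Hypothetical predicted and observed changes in pelvic tilt (degrees).
predicted = [4.0, -2.0, 6.0, 1.0]
observed = [5.0, -1.0, 3.0, 1.0]
me, me_sd, mae, mae_sd = prediction_errors(predicted, observed)
print(f"mean error {me:.2f} (SD {me_sd:.2f}), MAE {mae:.2f} (SD {mae_sd:.2f})")
```

A mean error close to zero alongside a larger MAE, as reported above, indicates that over- and under-predictions roughly cancel even though individual predictions deviate by a few degrees.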


The Bone & Joint Journal
Vol. 106-B, Issue 5 | Pages 468 - 474
1 May 2024
d'Amato M Flevas DA Salari P Bornes TD Brenneis M Boettner F Sculco PK Baldini A

Aims. Obtaining solid implant fixation is crucial in revision total knee arthroplasty (rTKA) to avoid aseptic loosening, a major reason for re-revision. This study aims to validate a novel grading system that quantifies implant fixation across three anatomical zones (epiphysis, metaphysis, diaphysis). Methods. Based on pre-, intra-, and postoperative assessments, the novel grading system allocates a quantitative score (0, 0.5, or 1 point) for the quality of fixation achieved in each anatomical zone. The criteria used by the algorithm to assign the score include the bone quality, the size of the bone defect, and the type of fixation used. A consecutive cohort of 245 patients undergoing rTKA from 2012 to 2018 were evaluated using the current novel scoring system and followed prospectively. In addition, 100 first-time revision cases were assessed radiologically from the original cohort and graded by three observers to evaluate the intra- and inter-rater reliability of the novel radiological grading system. Results. At a mean follow-up of 90 months (64 to 130), only two out of 245 cases failed due to aseptic loosening. Intraoperative grading yielded mean scores of 1.87 (95% confidence interval (CI) 1.82 to 1.92) for the femur and 1.96 (95% CI 1.92 to 2.0) for the tibia. Only 3.7% of femoral and 1.7% of tibial reconstructions fell below the 1.5-point threshold, which included the two cases of aseptic loosening. Interobserver reliability for postoperative radiological grading was 0.97 for the femur and 0.85 for the tibia. Conclusion. A minimum score of 1.5 points for each skeletal segment appears to be a reasonable cut-off to define sufficient fixation in rTKA. There were no revisions for aseptic loosening at mid-term follow-up when this fixation threshold was achieved or exceeded. When assessing first-time revisions, this novel grading system has shown excellent intra- and interobserver reliability. Cite this article: Bone Joint J 2024;106-B(5):468–474


The Bone & Joint Journal
Vol. 105-B, Issue 10 | Pages 1123 - 1130
1 Oct 2023
Donnan M Anderson N Hoq M Donnan L

Aims. The aim of this study was to investigate the agreement in interpretation of the quality of the paediatric hip ultrasound examination, the reliability of geometric and morphological assessment, and the relationship between these measurements. Methods. Four investigators evaluated 60 hip ultrasounds and assessed their quality based on the standard plane of Graf et al. They measured geometric parameters, described the morphology of the hip, and assigned the Graf grade of dysplasia. They analyzed one self-selected image and one randomly selected image from the ultrasound series, and repeated the process four weeks later. The intra- and interobserver agreement, and correlations between various parameters were analyzed. Results. In the assessment of quality, there was a moderate to substantial intraobserver agreement for each element investigated, but interobserver agreement was poor. Morphological features showed weak to moderate agreement across all parameters but improved to significant when responses were reduced. The geometric measurements showed nearly perfect agreement, and the relationship between them and the morphological features showed a dose response across all parameters with moderate to substantial correlations. There were strong correlations between geometric measurements. The Graf classification showed a fair to moderate interobserver agreement, and moderate to substantial intraobserver agreement. Conclusion. This investigation into the reliability of the interpretation of hip ultrasound scans identified the difficulties in defining what is a high-quality ultrasound. We confirmed that geometric measurements are reliably interpreted and may be useful as a further measurement of quality. Morphological features are generally poorly interpreted, but a simpler binary classification considerably improves agreement. As there is a clear dose response relationship between geometric and morphological measurements, the importance of morphology in the diagnosis of hip dysplasia should be questioned. Cite this article: Bone Joint J 2023;105-B(10):1123–1130


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 964 - 969
1 Sep 2024
Wang YC Song JJ Li TT Yang D Lv ZB Wang ZY Zhang ZM Luo Y

Aims. To propose a new method for evaluating paediatric radial neck fractures and improve the accuracy of fracture angulation measurement, particularly in younger children, and thereby facilitate planning treatment in this population. Methods. Clinical data of 117 children with radial neck fractures in our hospital from August 2014 to March 2023 were collected. A total of 50 children (26 males, 24 females, mean age 7.6 years (2 to 13)) met the inclusion criteria and were analyzed. Cases were excluded for the following reasons: Judet grade I and Judet grade IVb (> 85° angulation) classification; poor radiograph image quality; incomplete clinical information; sagittal plane angulation; severe displacement of the ulna fracture; and Monteggia fractures. For each patient, standard elbow anteroposterior (AP) view radiographs and corresponding CT images were acquired. On radiographs, Angle P (complementary to the angle between the long axis of the radial head and the line perpendicular to the physis), Angle S (complementary to the angle between the long axis of the radial head and the midline through the proximal radial shaft), and Angle U (between the long axis of the radial head and the straight line from the distal tip of the capitellum to the coronoid process) were identified as candidates approximating the true coronal plane angulation of radial neck fractures. On the coronal plane of the CT scan, the angulation of radial neck fractures (CTa) was measured and served as the reference standard for measurement. Inter- and intraobserver reliabilities were assessed by Kappa statistics and intraclass correlation coefficient (ICC). Results. Angle U showed the strongest correlation with CTa (p < 0.001). In the analysis of inter- and intraobserver reliability, Kappa values were significantly higher for Angles S and U compared with Angle P. ICC values were excellent among the three groups. Conclusion. 
Angle U on AP view was the best substitute for CTa when evaluating radial neck fractures in children. Further studies are required to validate this method. Cite this article: Bone Joint J 2024;106-B(9):964–969


The Bone & Joint Journal
Vol. 106-B, Issue 3 | Pages 293 - 302
1 Mar 2024
Vogt B Lueckingsmeier M Gosheger G Laufer A Toporowski G Antfang C Roedl R Frommer A

Aims. As an alternative to external fixators, intramedullary lengthening nails (ILNs) can be employed for distraction osteogenesis. While previous studies have demonstrated that typical complications of external devices, such as soft-tissue tethering, and pin site infection can be avoided with ILNs, there is a lack of studies that exclusively investigated tibial distraction osteogenesis with motorized ILNs inserted via an antegrade approach. Methods. A total of 58 patients (median age 17 years (interquartile range (IQR) 15 to 21)) treated by unilateral tibial distraction osteogenesis for a median leg length discrepancy of 41 mm (IQR 34 to 53), and nine patients with disproportionate short stature treated by bilateral simultaneous tibial distraction osteogenesis, with magnetically controlled motorized ILNs inserted via an antegrade approach, were retrospectively analyzed. The median follow-up was 37 months (IQR 30 to 51). Outcome measurements were accuracy, precision, reliability, bone healing, complications, and patient-reported outcome assessed by the Limb Deformity-Scoliosis Research Society Score (LD-SRS-30). Results. A median tibial distraction of 44 mm (IQR 31 to 49) was achieved with a mean distraction index of 0.5 mm/day (standard deviation 0.13) and median consolidation index of 41.2 days/cm (IQR 34 to 51). Accuracy, precision, and reliability were 91%, 92%, and 97%, respectively. New temporary range of motion limitations occurred in 51% of segments (34/67). Distraction-related equinus deformity treated by Achilles tendon lengthening was the most common major complication recorded in 16% of segments (11/67). In 95% of patients (55/58) the distraction goal was achieved with 42% unplanned additional interventions per segment (28/67). The median postoperative LD-SRS-30 score was 4.0 (IQR 3.6 to 4.3). Conclusion. Tibial distraction osteogenesis using motorized ILNs inserted via an antegrade approach appears to be a reliable and precise procedure. 
Temporary joint stiffness of the knee or ankle should be expected in up to every second patient. A high rate and wide range of complications of variable severity should be anticipated. Cite this article: Bone Joint J 2024;106-B(3):293–302


The Bone & Joint Journal
Vol. 104-B, Issue 6 | Pages 715 - 720
1 Jun 2022
Dunsmuir RA Nisar S Cruickshank JA Loughenbury PR

Aims. The aim of the study was to determine if there was a direct correlation between the pain and disability experienced by patients and size of their disc prolapse, measured by the disc’s cross-sectional area on T2 axial MRI scans. Methods. Patients were asked to prospectively complete visual analogue scale (VAS) and Oswestry Disability Index (ODI) scores on the day of their MRI scan. All patients with primary disc herniation were included. Exclusion criteria included recurrent disc herniation, cauda equina syndrome, or any other associated spinal pathology. T2 weighted MRI scans were reviewed on picture archiving and communications software. The T2 axial image showing the disc protrusion with the largest cross sectional area was used for measurements. The area of the disc and canal were measured at this level. The size of the disc was measured as a percentage of the cross-sectional area of the spinal canal on the chosen image. The VAS leg pain and ODI scores were each correlated with the size of the disc using the Pearson correlation coefficient (PCC). Intraobserver reliability for MRI measurement was assessed using the interclass correlation coefficient (ICC). We assessed if the position of the disc prolapse (central, lateral recess, or foraminal) altered the symptoms described by the patient. The VAS and ODI scores from central and lateral recess disc prolapses were compared. Results. A total of 56 patients (mean age 41.1 years (22.8 to 70.3)) were included. A high degree of intraobserver reliability was observed for MRI measurement: single measure ICC was 0.99 (95% confidence interval (CI) from 0.97 to 0.99 (p < 0.001)). The PCC comparing VAS leg scores with canal occupancy for herniated disc was 0.056. The PCC comparing ODI for herniated disc was 0.070. We found 13 disc prolapses centrally and 43 lateral recess prolapses. There were no foraminal prolapses in this group. 
The position of the prolapse was not found to be related to the mean VAS score or ODI experienced by the patients (VAS, p = 0.251; ODI, p = 0.093). Conclusion. The results of the statistical analysis show that there is no direct correlation between the size or position of the disc prolapse and a patient’s symptoms. The symptoms experienced by patients should be the primary concern in deciding to perform discectomy. Cite this article: Bone Joint J 2022;104-B(6):715–720
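The measurement described in this abstract, disc size expressed as a percentage of the spinal canal's cross-sectional area and then correlated with symptom scores using the Pearson correlation coefficient, can be sketched as follows. The function names and all sample values are hypothetical and do not reproduce the study data.

```python
from math import sqrt


def canal_occupancy_percent(disc_area_mm2, canal_area_mm2):
    """Disc cross-sectional area as a percentage of the canal area on the same axial image."""
    if canal_area_mm2 <= 0:
        raise ValueError("canal area must be positive")
    return 100.0 * disc_area_mm2 / canal_area_mm2


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical measurements: disc and canal areas (mm^2) and VAS leg pain scores.
disc_areas = [60.0, 90.0, 120.0, 45.0]
canal_areas = [200.0, 300.0, 240.0, 150.0]
vas_leg = [7, 3, 5, 8]

occupancy = [canal_occupancy_percent(d, c) for d, c in zip(disc_areas, canal_areas)]
print("occupancy (%):", occupancy)
print("PCC vs VAS:", round(pearson_r(occupancy, vas_leg), 3))
```

A coefficient near zero, such as the 0.056 and 0.070 reported above, indicates essentially no linear relationship between canal occupancy and the symptom scores.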


The Bone & Joint Journal
Vol. 105-B, Issue 9 | Pages 1007 - 1012
1 Sep 2023
Hoeritzauer I Paterson M Jamjoom AAB Srikandarajah N Soleiman H Poon MTC Copley PC Graves C MacKay S Duong C Leung AHC Eames N Statham PFX Darwish S Sell PJ Thorpe P Shekhar H Roy H Woodfield J

Aims. Patients with cauda equina syndrome (CES) require emergency imaging and surgical decompression. The severity and type of symptoms may influence the timing of imaging and surgery, and help predict the patient’s prognosis. Categories of CES attempt to group patients for management and prognostication purposes. We aimed in this study to assess the inter-rater reliability of dividing patients with CES into categories to assess whether they can be reliably applied in clinical practice and in research. Methods. A literature review was undertaken to identify published descriptions of categories of CES. A total of 100 real anonymized clinical vignettes of patients diagnosed with CES from the Understanding Cauda Equina Syndrome (UCES) study were reviewed by consultant spinal surgeons, neurosurgical registrars, and medical students. All were provided with published category definitions and asked to decide whether each patient had ‘suspected CES’; ‘early CES’; ‘incomplete CES’; or ‘CES with urinary retention’. Inter-rater agreement was assessed for all categories, for all raters, and for each group of raters using Fleiss’s kappa. Results. Each of the 100 participants were rated by four medical students, five neurosurgical registrars, and four consultant spinal surgeons. No groups achieved reasonable inter-rater agreement for any of the categories. CES with retention versus all other categories had the highest inter-rater agreement (kappa 0.34 (95% confidence interval 0.27 to 0.31); minimal agreement). There was no improvement in inter-rater agreement with clinical experience. Across all categories, registrars agreed with each other most often (kappa 0.41), followed by medical students (kappa 0.39). Consultant spinal surgeons had the lowest inter-rater agreement (kappa 0.17). Conclusion. Inter-rater agreement for categorizing CES is low among clinicians who regularly manage these patients. 
CES categories should be used with caution in clinical practice and research studies, as groups may be heterogenous and not comparable. Cite this article: Bone Joint J 2023;105-B(9):1007–1012
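Fleiss's kappa, the agreement statistic used in this study, can be computed from a table that counts how many raters placed each vignette in each category. The sketch below is a plain implementation of the standard formula. The toy counts (13 raters, four categories) are invented for illustration and are not the study data.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a table where counts[i][j] is the number of raters
    assigning subject i to category j. Every row must sum to the same number
    of raters (at least two)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Proportion of all assignments that fell in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical table: 4 vignettes, 13 raters, 4 categories
# (suspected, early, incomplete, retention).
toy_counts = [
    [13, 0, 0, 0],
    [0, 13, 0, 0],
    [5, 4, 2, 2],
    [3, 3, 4, 3],
]
print(round(fleiss_kappa(toy_counts), 3))
```

Values near 1 indicate agreement well beyond chance, and values near 0 indicate agreement no better than chance.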


The Bone & Joint Journal
Vol. 106-B, Issue 3 | Pages 277 - 285
1 Mar 2024
Pinto D Hussain S Leo DG Bridgens A Eastwood D Gelfer Y

Aims. Children with spinal dysraphism can develop various musculoskeletal deformities, necessitating a range of orthopaedic interventions, causing significant morbidity, and making considerable demands on resources. This systematic review aimed to identify what outcome measures have been reported in the literature for children with spinal dysraphism who undergo orthopaedic interventions involving the lower limbs. Methods. A PROSPERO-registered systematic literature review was performed following PRISMA guidelines. All relevant studies published until January 2023 were identified. Individual outcomes and outcome measurement tools were extracted verbatim. The measurement tools were assessed for reliability and validity, and all outcomes were grouped according to the Outcome Measures Recommended for use in Randomized Clinical Trials (OMERACT) filters. Results. From 91 eligible studies, 27 individual outcomes were identified, including those related to clinical assessment (n = 12), mobility (n = 4), adverse events (n = 6), investigations (n = 4), and miscellaneous (n = 1). Ten outcome measurement tools were identified, of which Hoffer’s Functional Ambulation Scale was the most commonly used. Several studies used unvalidated measurement tools originally developed for other conditions, and 26 studies developed new measurement tools. On the OMERACT filter, most outcomes reported pathophysiology and/or the impact on life. There were only six patient- or parent-reported outcomes, and none assessed the quality of life. Conclusion. The outcomes that were reported were heterogenous, lack validation and failed to incorporate patient or family perceptions. Until outcomes can be reported unequivocally, research in this area will remain limited. Our findings should guide the development of a core outcome set, which will allow consistency in the reporting of outcomes for this condition. Cite this article: Bone Joint J 2024;106-B(3):277–285


The Bone & Joint Journal
Vol. 105-B, Issue 1 | Pages 21 - 28
1 Jan 2023
Ndlovu S Naqshband M Masunda S Ndlovu K Chettiar K Anugraha A

Aims. Clinical management of open fractures is challenging and frequently requires complex reconstruction procedures. The Gustilo-Anderson classification lacks uniform interpretation, has poor interobserver reliability, and fails to account for injuries to musculotendinous units and bone. The Ganga Hospital Open Injury Severity Score (GHOISS) was designed to address these concerns. The major aim of this review was to ascertain the evidence available on accuracy of the GHOISS in predicting successful limb salvage in patients with mangled limbs. Methods. We searched electronic data bases including PubMed, CENTRAL, EMBASE, CINAHL, Scopus, and Web of Science to identify studies that employed the GHOISS risk tool in managing complex limb injuries published from April 2006, when the score was introduced, until April 2021. Primary outcome was the measured sensitivity and specificity of the GHOISS risk tool for predicting amputation at a specified threshold score. Secondary outcomes included length of stay, need for plastic surgery, deep infection rate, time to fracture union, and functional outcome measures. Diagnostic test accuracy meta-analysis was performed using a random effects bivariate binomial model. Results. We identified 1,304 records, of which six prospective cohort studies and two retrospective cohort studies evaluating a total of 788 patients were deemed eligible for inclusion. A diagnostic test meta-analysis conducted on five cohort studies, with 474 participants, showed that GHOISS at a threshold score of 14 has a pooled sensitivity of 93.4% (95% confidence interval (CI) 78.4 to 98.2) and a specificity of 95% (95% CI 88.7 to 97.9) for predicting primary or secondary amputations in people with complex lower limb injuries. Conclusion. GHOISS is highly accurate in predicting success of limb salvage, and can inform management and predict secondary outcomes. 
However, there is a need for high-quality multicentre trials to confirm these findings and investigate the effectiveness of the score in children, and in predicting secondary amputations. Cite this article: Bone Joint J 2023;105-B(1):21–28
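For a single cohort, sensitivity and specificity at a threshold score can be derived from a 2x2 table of predicted versus actual amputation. The sketch below assumes that a score at or above the threshold counts as a positive prediction; the abstract does not state whether the cut-off is inclusive. It does not implement the pooled bivariate random effects model used in the meta-analysis, and the sample data are hypothetical.

```python
def sensitivity_specificity(scores, amputated, threshold=14):
    """Sensitivity and specificity for predicting amputation from a severity score.

    Assumption: score >= threshold is treated as a positive prediction.
    """
    tp = fn = tn = fp = 0
    for score, was_amputated in zip(scores, amputated):
        predicted = score >= threshold
        if was_amputated and predicted:
            tp += 1
        elif was_amputated:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one amputated and one salvaged limb")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and outcomes (True = amputation).
scores = [16, 15, 12, 9, 14, 10, 17, 8]
amputated = [True, True, True, False, False, False, True, False]
sens, spec = sensitivity_specificity(scores, amputated)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```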


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1345 - 1350
1 Aug 2021
Czubak-Wrzosek M Nitek Z Sztwiertnia P Czubak J Grzelecki D Kowalczewski J Tyrakowski M

Aims. The aim of the study was to compare two methods of calculating pelvic incidence (PI) and pelvic tilt (PT), either by using the femoral heads or acetabular domes to determine the bicoxofemoral axis, in patients with unilateral or bilateral primary hip osteoarthritis (OA). Methods. PI and PT were measured on standing lateral radiographs of the spine in two groups: 50 patients with unilateral (Group I) and 50 patients with bilateral hip OA (Group II), using the femoral heads or acetabular domes to define the bicoxofemoral axis. Agreement between the methods was determined by intraclass correlation coefficient (ICC) and the standard error of measurement (SEm). The intraobserver reproducibility and interobserver reliability of the two methods were analyzed on 31 radiographs in both groups to calculate ICC and SEm. Results. In both groups, excellent agreement between the two methods was obtained, with ICC of 0.99 and SEm 0.3° for Group I, and ICC 0.99 and SEm 0.4° for Group II. The intraobserver reproducibility was excellent for both methods in both groups, with an ICC of at least 0.97 and SEm not exceeding 0.8°. The study also revealed excellent interobserver reliability for both methods in both groups, with ICC 0.99 and SEm 0.5° or less. Conclusion. Either the femoral heads or acetabular domes can be used to define the bicoxofemoral axis on the lateral standing radiographs of the spine for measuring PI and PT in patients with idiopathic unilateral or bilateral hip OA. Cite this article: Bone Joint J 2021;103-B(8):1345–1350
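One commonly used way to obtain the standard error of measurement from an ICC is SEm = SD × √(1 − ICC), where SD is the standard deviation of the measurements. The abstract does not state which formula the authors used, so the sketch below is an assumption, and the input values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEm from the sample SD and the ICC, using SEm = SD * sqrt(1 - ICC).

    Assumption: this is a common textbook formula, not necessarily the one
    applied in the study.
    """
    if sd < 0:
        raise ValueError("SD must be non-negative")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1 for this formula")
    return sd * sqrt(1.0 - icc)


# Hypothetical example: SD of 5.0 degrees and an ICC of 0.99.
print(round(standard_error_of_measurement(5.0, 0.99), 2))
```

Under this formula, an ICC close to 1 yields an SEm that is a small fraction of the spread of the measurements, which fits the pattern of high ICCs and sub-degree SEm values reported above.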


The Bone & Joint Journal
Vol. 106-B, Issue 3 | Pages 227 - 231
1 Mar 2024
Todd NV Casey A Birch NC

The diagnostic sub-categorization of cauda equina syndrome (CES) is used to aid communication between doctors and other healthcare professionals. It is also used to determine the need for, and urgency of, MRI and surgery in these patients. A recent paper by Hoeritzauer et al (2023) in this journal examined the interobserver reliability of the widely accepted subcategories in 100 patients with cauda equina syndrome. They found that there is no useful interobserver agreement for the subcategories, even for experienced spinal surgeons. This observation is supported by the largest prospective study of the treatment of cauda equina syndrome in the UK by Woodfield et al (2023). If the accepted subcategories are unreliable, they cannot be used in the way that they are currently, and they should be revised or abandoned. This paper presents a reassessment of the diagnostic and prognostic subcategories of cauda equina syndrome in the light of this evidence, with a suggested cure based on a more inclusive synthesis of symptoms, signs, bladder ultrasound scan results, and pre-intervention urinary catheterization. Cite this article: Bone Joint J 2024;106-B(3):227–231


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1380 - 1385
2 Aug 2021
Kim Y Ryu J Kim JK Al-Dhafer BAA Shin YH

Aims. The aim of this study was to assess arthritis of the basal joint of the thumb quantitatively using bone single-photon emission CT/CT (SPECT/CT) and evaluate its relationship with patients’ pain and function. Methods. We retrospectively reviewed 30 patients (53 hands) with symptomatic basal joint arthritis of the thumb between April 2019 and March 2020. Visual analogue scale (VAS) scores for pain, grip strength, and pinch power of both hands and Patient-Rated Wrist/Hand Evaluation (PRWHE) scores were recorded for all patients. Basal joint arthritis was classified according to the modified Eaton-Glickel stage using routine radiographs and the CT scans of SPECT/CT, respectively. The maximum standardized uptake value (SUVmax) from SPECT/CT was measured in the four peritrapezial joints and the highest uptake was used for analysis. Results. According to Eaton-Glickel classification, 11, 17, 17, and eight hands were stage 0 to I, II, III, and IV, respectively. The interobserver reliability for determining the stage of arthritis was moderate for radiographs (k = 0.41) and substantial for CT scans (k = 0.67). In a binary categorical analysis using SUVmax, pain (p < 0.001) and PRWHE scores (p = 0.004) were significantly higher in hands with higher SUVmax. Using multivariate linear regression to estimate the pain VAS, only SUVmax (B 0.172 (95% confidence interval (CI) 0.065 to 0.279); p = 0.002) showed a significant association. Estimating the variation of PRWHE scores using the same model, only SUVmax (B 1.378 (95% CI 0.082 to 2.674); p = 0.038) showed a significant association. Conclusion. The CT scans of SPECT/CT provided better interobserver reliability than routine radiographs for evaluating the severity of arthritis. A higher SUVmax in SPECT/CT was associated with more pain and functional disabilities of basal joint arthritis of the thumb. This approach could be used to complement radiographs for the evaluation of patients with this condition.
Cite this article: Bone Joint J 2021;103-B(8):1380–1385


The Bone & Joint Journal
Vol. 101-B, Issue 12 | Pages 1578 - 1584
1 Dec 2019
Batailler C Weidner J Wyatt M Pfluger D Beck M

Aims. A borderline dysplastic hip can behave as either stable or unstable and this makes surgical decision making challenging. While an unstable hip may be best treated by acetabular reorientation, stable hips can be treated arthroscopically. Several imaging parameters can help to identify the appropriate treatment, including the Femoro-Epiphyseal Acetabular Roof (FEAR) index, measured on plain radiographs. The aim of this study was to assess the reliability and the sensitivity of FEAR index on MRI compared with its radiological measurement. Patients and Methods. The technique of measuring the FEAR index on MRI was defined and its reliability validated. A retrospective study assessed three groups of 20 patients: an unstable group of ‘borderline dysplastic hips’ with lateral centre edge angle (LCEA) less than 25° treated successfully by periacetabular osteotomy; a stable group of ‘borderline dysplastic hips’ with LCEA less than 25° treated successfully by impingement surgery; and an asymptomatic control group with LCEA between 25° and 35°. The following measurements were performed on both standardized radiographs and on MRI: LCEA, acetabular index, femoral anteversion, and FEAR index. Results. The FEAR index showed excellent intraobserver and interobserver reliability on both MRI and radiographs. The FEAR index was more reliable on radiographs than on MRI. The FEAR index on MRI was lower in the stable borderline group (mean -4.2° (SD 9.1°)) compared with the unstable borderline group (mean 7.9° (SD 6.8°)). With a FEAR index cut-off value of 2°, 90% of patients were correctly identified as stable or unstable using the radiological FEAR index, compared with 82.5% using the FEAR index on MRI. The FEAR index was a better predictor of instability on plain radiographs than on MRI. Conclusion. The FEAR index measured on MRI is less reliable and less sensitive than the FEAR index measured on radiographs. The cut-off value of 2° for radiological FEAR index predicted hip stability with 90% probability. Cite this article: Bone Joint J 2019;101-B:1578–1584


The Bone & Joint Journal
Vol. 100-B, Issue 5 | Pages 596 - 602
1 May 2018
Bock P Pittermann M Chraim M Rois S

Aims. Various radiological parameters are used to evaluate a flatfoot deformity and their measurements may differ. The aims of this study were to answer the following questions: 1) Which of the 11 parameters have the best inter- and intraobserver reliability in a standardized radiological setting? 2) Are pre- and postoperative assessments equally reliable? 3) What are the identifiable sources of variation? Patients and Methods. Measurements of the 11 parameters were recorded on anteroposterior and lateral weight-bearing radiographs of 38 feet before and after surgery for flatfoot, by three observers with different experience in foot surgery (A, ten years; B, three years; C, third-year orthopaedic resident). The inter- and intraobserver reliability was calculated. Results. Preoperative interobserver reliability was high for four, moderate for five, and low for two parameters. Postoperative interobserver reliability was high for four, moderate for five, and low for two parameters. Intraobserver reliability was excellent for all parameters preoperatively as recorded by observer A (PB) and B (MP), and for eight parameters as recorded by observer C (SR). Intraobserver reliability was excellent for ten parameters postoperatively as recorded by observer A and B, and for eight parameters as recorded by observer C. Conclusion. The following parameters can be recommended. For preoperative and postoperative evaluation of flatfoot: anteroposterior, talonavicular coverage angle; lateral, talometatarsal I angle, calcaneal pitch angle, and cuneiform-medial height (high interobserver reliability); and anteroposterior, talometatarsal II angle; lateral, talocalcaneal angle, tibiocalcaneal angle (moderate interobserver reliability). For more experienced observers, we also recommend the anteroposterior talometatarsal I angle (moderate reliability). The inter- and intraobserver reliability for most parameters were similar pre- and postoperatively. The experience of the observer and the definition and ability to measure the parameters themselves were sources of variation. Cite this article: Bone Joint J 2018;100-B:596–602


The Bone & Joint Journal
Vol. 102-B, Issue 1 | Pages 102 - 107
1 Jan 2020
Sharma N Brown A Bouras T Kuiper JH Eldridge J Barnett A

Aims. Trochlear dysplasia is a significant risk factor for patellofemoral instability. The Dejour classification is currently considered the standard for classifying trochlear dysplasia, but numerous studies have reported poor reliability on both plain radiography and MRI. The severity of trochlear dysplasia is important to establish in order to guide surgical management. We have developed an MRI-specific classification system to assess the severity of trochlear dysplasia, the Oswestry-Bristol Classification (OBC). This is a four-part classification system comprising normal, mild, moderate, and severe to represent a normal, shallow, flat, and convex trochlear, respectively. The purpose of this study was to assess the inter- and intraobserver reliability of the OBC and compare it with that of the Dejour classification. Methods. Four observers (two senior and two junior orthopaedic surgeons) independently assessed 32 CT and axial MRI scans for trochlear dysplasia and classified each according to the OBC and the Dejour classification systems. Assessments were repeated following a four-week interval. The inter- and intraobserver agreement was determined by using Fleiss’ generalization of Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC showed fair-to-good interobserver agreement and good-to-excellent intraobserver agreement (mean kappa 0.68). The Dejour classification showed poor interobserver agreement and fair-to-good intraobserver agreement (mean kappa 0.52). Conclusion. The OBC can be used to assess the severity of trochlear dysplasia. It can be applied in clinical practice to simplify and standardize surgical decision-making in patients with recurrent patella instability. Cite this article: Bone Joint J 2020;102-B(1):102–107


The Bone & Joint Journal
Vol. 102-B, Issue 3 | Pages 301 - 309
1 Mar 2020
Keenan OJF Holland G Maempel JF Keating JF Scott CEH

Aims. Although knee osteoarthritis (OA) is diagnosed and monitored radiologically, actual full-thickness cartilage loss (FTCL) has rarely been correlated with radiological classification. This study aims to analyze which classification system correlates best with FTCL and to assess their reliability. Methods. A prospective study of 300 consecutive patients undergoing unilateral total knee arthroplasty (TKA) for OA (mean age 69 years (44 to 91; standard deviation (SD) 9.5), 178 (59%) female). Two blinded examiners independently graded preoperative radiographs using five common systems: Kellgren-Lawrence (KL); International Knee Documentation Committee (IKDC); Fairbank; Brandt; and Ahlbäck. Interobserver agreement was assessed using the intraclass correlation coefficient (ICC). Intraoperatively, anterior cruciate ligament (ACL) status and the presence of FTCL in 16 regions of interest were recorded. Radiological classification and FTCL were correlated using the Spearman correlation coefficient. Results. Knees had a mean of 6.8 regions of FTCL (SD 3.1), most common medially. The commonest patterns of FTCL were medial ± patellofemoral (143/300, 48%) and tricompartmental (89/300, 30%). ACL status was associated with pattern of FTCL (p = 0.023). All radiological classification systems demonstrated moderate ICC, but this was highest for the IKDC: whole knee 0.68 (95% confidence interval (CI) 0.60 to 0.74); medial compartment 0.84 (95% CI 0.80 to 0.87); and lateral compartment 0.79 (95% CI 0.73 to 0.83). Correlation with actual FTCL was strongest for Ahlbäck (Spearman rho 0.27 to 0.39) and KL (0.30 to 0.33) systems, although all systems demonstrated medium correlation. The Ahlbäck score was the most discriminating in severe knee OA. Osteophyte presence in the medial compartment had high positive predictive value (PPV) for FTCL, but not in the lateral compartment. Conclusion. The Ahlbäck and KL systems had the highest correlation with confirmed cartilage loss at TKA. However, the IKDC system displayed the best interobserver reliability, with favourable correlation with FTCL in medial and lateral compartments, although it was less discriminating in more severe disease. Cite this article: Bone Joint J 2020;102-B(3):301–309


The Bone & Joint Journal
Vol. 101-B, Issue 9 | Pages 1042 - 1049
1 Sep 2019
Murphy MP Killen CJ Ralles SJ Brown NM Hopkinson WJ Wu K

Aims. Several radiological methods of measuring anteversion of the acetabular component after total hip arthroplasty (THA) have been described. These are limited by low reproducibility, are less accurate than CT 3D reconstruction, and are cumbersome to use. These methods also partly rely on the identification of obscured radiological borders of the component. We propose two novel methods, the Area and Orthogonal methods, which have been designed to maximize use of readily identifiable points while maintaining the same trigonometric principles. Patients and Methods. A retrospective study of plain radiographs was conducted on 160 hips of 141 patients who had undergone primary THA. We compared the reliability and accuracy of the Area and Orthogonal methods with two of the current leading methods: those of Widmer and Lewinnek, respectively. Results. The 160 anteroposterior pelvis films revealed that the proposed Area method was statistically different from those described by Widmer and Lewinnek (p < 0.001 and p = 0.004, respectively). They gave the highest inter- and intraobserver reliability (0.992 and 0.998, respectively), and took less time (27.50 seconds (SD 3.19); p < 0.001) to complete. In addition, 21 available CT 3D reconstructions revealed the Area method achieved the highest Pearson’s correlation coefficient (r = 0.956; p < 0.001) and least statistical difference (p = 0.704) from CT with a mean within 1° of CT-3D reconstruction between ranges of 1° to 30° of measured radiological anteversion. Conclusion. Our results support the proposed Area method to be the most reliable, accurate, and speedy. They did not support any statistical superiority of the proposed Orthogonal method to that of the Widmer or Lewinnek method. Cite this article: Bone Joint J 2019;101-B:1042–1049


The Bone & Joint Journal
Vol. 102-B, Issue 1 | Pages 17 - 25
1 Jan 2020
Trickett RW Mudge E Price P Pallister I

Aims. The aim of this study was to develop a psychometrically sound measure of recovery for use in patients who have suffered an open tibial fracture. Methods. An initial pool of 109 items was generated from previous qualitative data relating to recovery following an open tibial fracture. These items were field tested in a cohort of patients recovering from an open tibial fracture. They were asked to comment on the content of the items and structure of the scale. Reduction in the number of items led to a refined scale tested in a larger cohort of patients. Principal components analysis permitted further reduction and the development of a definitive scale. Internal consistency, test-retest reliability, and responsiveness were assessed for the retained items. Results. The initial scale was completed by 35 patients who were recovering from an open tibial fracture. Subjective and objective analysis permitted removal of poorly performing items and the addition of items suggested by patients. The refined scale consisted of 50 Likert scaled items and eight additional items. It was completed on 228 occasions by a different cohort of 204 patients with an open tibial fracture recruited from several UK orthoplastic tertiary referral centres. There were eight underlying components with tangible real-life meaning, which were retained as sub-scales represented by ten Likert scaled and eight non-Likert items. Internal consistency and test-retest reliability were good to excellent. Conclusion. The Wales Lower Limb Trauma Recovery (WaLLTR) Scale is the first tool to be developed from patient data with the potential to assess recovery following an open tibial fracture. Cite this article: Bone Joint J 2020;102-B(1):17–25


The Bone & Joint Journal
Vol. 99-B, Issue 4 | Pages 445 - 450
1 Apr 2017
Marsh AG Nisar A El Refai M Patil S Meek RMD

Aims. The purpose of this study was to evaluate whether an innovative templating technique could predict the need for acetabular augmentation during primary total hip arthroplasty for patients with dysplastic hips. Patients and Methods. We developed a simple templating technique to estimate acetabular component coverage at total hip arthroplasty, the True Cup: False Cup (TC:FC) ratio. We reviewed all patients with dysplastic hips who underwent primary total hip arthroplasty between 2005 and 2012. Traditional radiological methods of assessing the degree of acetabular dysplasia (Sharp’s angle, Tönnis angle, centre-edge angle) as well as the TC:FC ratio were measured from the pre-operative radiographs. A comparison of augmented and non-augmented hips was undertaken to determine any difference in pre-operative radiological indices between the two cohorts. The intra- and inter-observer reliability for all radiological indices used in the study were also calculated. Results. Of the 128 cases reviewed, 33 (26%) needed acetabular augmentation. We found no difference in the median Sharp’s angle (p = 0.10), Tönnis angle (p = 0.28), or centre-edge angle (p = 0.07) between the two groups. A lower TC:FC ratio was observed in the augmented group compared with the non-augmented group (median = 0.66 versus 0.88, p < 0.001). Intra-observer reliability was found to be high for all radiological indices analysed (intraclass correlation coefficient (ICC) > 0.7). However, inter-observer reliability was more variable and was only high for the TC:FC ratio (ICC > 0.7). Conclusion. The TC:FC ratio gives an accurate estimate of acetabular component coverage. It can help predict which dysplastic hips are likely to need acetabular augmentation at primary total hip arthroplasty. It has high intra- and inter-observer reliability. Cite this article: Bone Joint J 2017;99-B:445–50


The Bone & Joint Journal
Vol. 103-B, Issue 6 | Pages 1168 - 1172
1 Jun 2021
Iliadis AD Wright J Stoddart MT Goodier WD Calder P

Aims. The STRYDE nail is an evolution of the PRECICE Intramedullary Limb Lengthening System, with unique features regarding its composition. It is designed for load bearing throughout treatment in order to improve patient experience and outcomes and allow for simultaneous bilateral lower limb lengthening. The literature published to date is limited regarding outcomes and potential problems. We report on our early experience and raise awareness for the potential of adverse effects from this device. Methods. This is a retrospective review of prospective data collected on all patients treated in our institution using this implant. We report the demographics, nail accuracy, reliability, consolidation index, and cases where concerning clinical and radiological findings were encountered. There were 14 STRYDE nails implanted in nine patients (three male and six female) between June 2019 and September 2020. Mean age at surgery was 33 years (14 to 65). Five patients underwent bilateral lengthening (two femoral and three tibial) and four patients unilateral femoral lengthening for multiple aetiologies. Results. At the time of reporting, eight patients (13 implants) had completed lengthening. Osteolysis and periosteal reaction at the junction of the telescopic nail was evident in nine implants. Five patients experienced localized pain and swelling. Macroscopic appearances following retrieval were consistent with corrosion at the telescopic junction. Tissue histology was consistent with effects of focal metallic wear debris. Conclusion. From our early experience with this implant we have found the process of lengthening to be accurate and reliable with good regenerate formation and consolidation. Proposed advantages of early load bearing and the ability for bilateral lengthening are promising. We have, however, encountered concerning clinical and radiological findings in several patients. We have elected to discontinue its use to allow further investigation into the retrieved implants and patient outcomes from users internationally. Cite this article: Bone Joint J 2021;103-B(6):1168–1172


The Bone & Joint Journal
Vol. 97-B, Issue 6 | Pages 818 - 823
1 Jun 2015
Plant CE Hickson C Hedley H Parsons NR Costa ML

We conducted an observational radiographic study to determine the inter- and intra-observer reliability of the AO classification of fractures of the distal radius. Plain posteroanterior and lateral radiographs of 456 patients with an acute fracture of the distal radius were classified by a consultant orthopaedic hand specialist and two specialist trainees, and the κ coefficient for the inter- and intra-observer reliability of the type, group and subgroup classification was calculated. Only the type of fracture (A, B or C) was found to provide substantial intra-observer reliability (κ_type = 0.65). The inclusion of ‘group’ and ‘subgroup’ into the classification reduced the inter-observer reliability to fair (κ_group = 0.29, κ_subgroup = 0.28) and the intra-observer reliability to moderate (κ_group = 0.53, κ_subgroup = 0.49). Disagreement was found to arise between specific subgroups, which may be amenable to clarification. Cite this article: Bone Joint J 2015;97-B:818–23


The Bone & Joint Journal
Vol. 98-B, Issue 3 | Pages 374 - 380
1 Mar 2016
Kocsis G Thyagarajan DS Fairbairn KJ Wallace WA

Aims. Glenoid bone loss can be a challenging problem when revising a shoulder arthroplasty. Precise pre-operative planning based on plain radiographs or CT scans is essential. We have investigated a new radiological classification system to describe the degree of medialisation of the bony glenoid and that will indicate the amount of bone potentially available for supporting a glenoid component. It depends on the relationship between the most medial part of the articular surface of the glenoid with the base of the coracoid process and the spinoglenoid notch: it classifies the degree of bone loss into three types. It also attempts to predict the type of glenoid reconstruction that may be possible (impaction bone grafting, structural grafting or simple non-augmented arthroplasty) and gives guidance about whether a pre-operative CT scan is indicated. Patients and Methods. Inter-method reliability between plain radiographs and CT scans was assessed retrospectively by three independent observers using data from 39 randomly selected patients. Inter-observer reliability and test-retest reliability were tested on the same cohort using Cohen's kappa statistics. Correlation of the type of glenoid with the Constant score and its pain component was analysed using the Kruskal-Wallis method on data from 128 patients. Anatomical studies of the scapula were reviewed to explain the findings. Results. Excellent inter-method reliability, inter-observer and test-retest reliability were seen. The system did not correlate with the Constant score, but correlated well with its pain component. Take home message: Our system of classification is a helpful guide to the degree of glenoid bone loss when embarking on revision shoulder arthroplasty. Cite this article: Bone Joint J 2016;98-B:374–80


The Bone & Joint Journal
Vol. 103-B, Issue 7 Supple B | Pages 17 - 24
1 Jul 2021
Vigdorchik JM Sharma AK Buckland AJ Elbuluk AM Eftekhary N Mayman DJ Carroll KM Jerabek SA

Aims. Patients with spinal pathology who undergo total hip arthroplasty (THA) have an increased risk of dislocation and revision. The aim of this study was to determine if the use of the Hip-Spine Classification system in these patients would result in a decreased rate of postoperative dislocation in patients with spinal pathology. Methods. This prospective, multicentre study evaluated 3,777 consecutive patients undergoing THA by three surgeons, between January 2014 and December 2019. They were categorized using The Hip-Spine Classification system: group 1 with normal spinal alignment; group 2 with a flatback deformity, group 2A with normal spinal mobility, and group 2B with a stiff spine. Flatback deformity was defined by a pelvic incidence minus lumbar lordosis of > 10°, and spinal stiffness was defined by < 10° change in sacral slope from standing to seated. Each category determined a patient-specific component positioning. Survivorship free of dislocation was recorded and spinopelvic measurements were compared for reliability using intraclass correlation coefficient. Results. A total of 2,081 patients met the inclusion criteria. There were 987 group 1A, 232 group 1B, 715 group 2A, and 147 group 2B patients. A total of 70 patients had a lumbar fusion, most had L4-5 (16; 23%) or L4-S1 (12; 17%) fusions; 51 patients (73%) had one or two levels fused, and 19 (27%) had > three levels fused. Dual mobility (DM) components were used in 166 patients (8%), including all of those in group 2B and with > three level fusions. Survivorship free of dislocation at five years was 99.2% with a 0.8% dislocation rate. The correlation coefficient was 0.83 (95% confidence interval 0.89 to 0.91). Conclusion. This is the largest series in the literature evaluating the relationship between hip-spine pathology and dislocation after THA, and guiding appropriate treatment. The Hip-Spine Classification system allows surgeons to make appropriate evaluations preoperatively, and it guides the use of DM components in patients with spinopelvic pathology in order to reduce the risk of dislocation in these high-risk patients. Cite this article: Bone Joint J 2021;103-B(7 Supple B):17–24