Aims. A recent study used the RAND Corporation/University of California, Los Angeles (RAND/UCLA) method to develop appropriateness criteria for anatomical total shoulder arthroplasty (aTSA). The purpose of our study was to determine how patient-reported outcome measures (PROMs) vary with appropriateness. Methods. Clinical data from a multicentre database were used to identify patients who underwent primary aTSA from November 2004 to January 2023. Patients were classified as “appropriate”, “inconclusive”, or “inappropriate” using a modified version of an appropriateness algorithm, which accounted for age, rotator cuff status, mobility, symptomatology, and Walch classification. A total of 390 patients (mean follow-up 48.1 months (SD 42.0)) were included: 97 (24.9%) were classified as appropriate, 218 (55.9%) as inconclusive, and 75 (19.2%) as inappropriate. Multiple pre- and postoperative scores were analysed using Pearson’s chi-squared test and one-way analysis of variance (ANOVA). Postoperative complications were also analysed. Results. All groups achieved significant improvement in mean PROM scores postoperatively. “Appropriate” patients experienced significantly greater improvement in visual analogue scale (VAS) and American Shoulder and Elbow Surgeons (ASES) scores than “inconclusive” and “inappropriate” patients. The appropriate group had a significantly greater proportion of patients who achieved the minimal clinically important difference (MCID) (95.8%; n = 93) and substantial clinical benefit (SCB) (92.6%; n = 89). Overall, 13 patients had postoperative complications, with no significant differences in complication rates among classifications. Conclusion. Our data clinically validate the