
AI Assisted Detection of Fractures on X-Rays (FRACT-AI)

FRACT-AI: Evaluating the Impact of Artificial Intelligence-Enhanced Image Analysis on the Diagnostic Accuracy of Frontline Clinicians in the Detection of Fractures on Plain X-Ray

Status
Completed
Phases
Unknown
Study type
Observational
Source
ClinicalTrials.gov
Registry ID
NCT06130397
Acronym
FRACT-AI
Enrollment
21
Registered
2023-11-14
Start date
2024-02-08
Completion date
2025-06-01
Last updated
2025-11-24

For informational purposes only — not medical advice. Sourced from public registries and may not reflect the latest updates.

Conditions

Fracture, Bone Fracture, Dislocation, Fracture Multiple, Fractures, Closed, Fractures, Open

Keywords

radiology, emergency medicine, artificial intelligence, XRays, Fractures

Brief summary

This study has been added as a sub-study to the Simulation Training for Emergency Department Imaging 2 study (ClinicalTrials.gov ID NCT05427838). It aims to evaluate the impact of an Artificial Intelligence (AI)-enhanced algorithm, Gleamer Boneview, on the diagnostic accuracy of clinicians in the detection of fractures on plain X-rays (XRs).

The study will create a dataset of 500 plain X-rays comprising standard views of all bones other than the skull and cervical spine, with 50% normal cases and 50% containing fractures. A reference 'ground truth' for each image, confirming the presence or absence of a fracture, will be established by a senior radiologist panel. The dataset will then be analysed by the Gleamer Boneview algorithm to identify fractures, and the algorithm's performance will be compared against the reference standard.

The study will then undertake a Multiple-Reader Multiple-Case (MRMC) study in which clinicians interpret all images first without AI and subsequently with access to the output of the AI algorithm. Eighteen clinicians will be recruited as readers, three from each of six distinct clinical groups (Emergency Medicine, Trauma and Orthopaedic Surgery, Emergency Nurse Practitioners, Physiotherapy, Radiology, and Radiographers), with three levels of seniority in each group. Changes in reporting accuracy (sensitivity, specificity), confidence, and speed between the two sessions will be compared. Results will be analysed in a pooled analysis for all readers and in the following subgroups: clinical role, level of seniority, pathological finding, and difficulty of image.

The study will demonstrate the impact of AI interpretation compared with interpretation by clinicians alone, and compared with clinicians using the AI as an adjunct to their interpretation. The readers will represent a range of professional backgrounds and levels of experience. The X-rays will represent a range of anatomical views and pathological presentations; however, the dataset will contain equal numbers of pathological and non-pathological cases, giving equal weight to the assessment of sensitivity and specificity. Ethics approval has already been granted, and the study will be disseminated through publication in peer-reviewed journals and presentation at relevant conferences.

Interventions

The reading will be done remotely via the Report and Image Quality Control site (www.RAIQC.com), an online platform for medical image viewing and reporting. Participants can work from any location, but must use a computer with internet access; for the avoidance of doubt, the work cannot be performed from a phone or tablet. The project is divided into two phases, and participants are required to complete both. The estimated total involvement in the project is 20-24 hours.

Phase 1 (2 weeks): review 500 X-rays and express a clinical opinion through a structured reporting template (multiple choice, no open text required).

Rest/washout period (4 weeks): to mitigate the effects of recall bias.

Phase 2 (2 weeks): review the same 500 X-rays, together with an AI report for each case, and express a clinical opinion through the same structured reporting template used in Phase 1.

Two consultant musculoskeletal radiologists will independently review the images to establish the 'ground truth' findings on the XRs; where a consensus is reached, this will be used as the reference standard. In cases of disagreement, a third senior musculoskeletal radiologist (>20 years' experience) will arbitrate. The ground-truthers will assign a difficulty score to each abnormality using a 4-point Likert scale (1 = easy/obvious to 4 = hard/poorly visualised).

Sponsors

Gleamer
Collaborator (Industry)
Oxford University Hospitals NHS Trust
Lead Sponsor (Other)

Study design

Observational model
COHORT
Time perspective
RETROSPECTIVE

Eligibility

Sex/Gender
ALL
Healthy volunteers
Yes

Inclusion criteria

* Emergency medicine doctors, trauma and orthopaedic surgeons, emergency nurse practitioners, physiotherapists, general radiologists, and radiographers reviewing X-rays as part of their routine clinical practice.
* Currently working in the National Health Service (NHS).

Exclusion criteria

* Non-radiology physicians with previous formal postgraduate XR reporting training.
* Non-radiology physicians with a previous career in radiology.

Design outcomes

Primary

Measure: Performance of AI algorithm: sensitivity
Time frame: During 4 weeks of reading time
Description: Evaluation of the Gleamer Boneview algorithm against the reference standard to determine sensitivity.

Measure: Performance of AI algorithm: specificity
Time frame: During 4 weeks of reading time
Description: Evaluation of the Gleamer Boneview algorithm against the reference standard to determine specificity.

Measure: Performance of AI algorithm: area under the ROC curve (AUROC)
Time frame: During 4 weeks of reading time
Description: Evaluation of the Gleamer Boneview algorithm against the reference standard. The algorithm's continuous probability score will be used for the ROC analyses, while binary classification results at a predefined operating cut-off will be used to evaluate sensitivity, specificity, positive predictive value, and negative predictive value.

Measure: Performance of readers with and without AI assistance: sensitivity
Time frame: During 4 weeks of reading time
Description: The study will include two sessions (with and without AI overlay), with all 18 readers reviewing all 500 XR cases each time, separated by a washout period to mitigate recall bias. Case order will be randomised between the two reads and for every reader.

Measure: Performance of readers with and without AI assistance: specificity
Time frame: During 4 weeks of reading time
Description: The study will include two sessions (with and without AI overlay), with all 18 readers reviewing all 500 XR cases each time, separated by a washout period to mitigate recall bias. Case order will be randomised between the two reads and for every reader.

Measure: Performance of readers with and without AI assistance: area under the ROC curve (AUROC)
Time frame: During 4 weeks of reading time
Description: The study will include two sessions (with and without AI overlay), with all 18 readers reviewing all 500 XR cases each time, separated by a washout period to mitigate recall bias. Case order will be randomised between the two reads and for every reader.

Measure: Reader speed with vs without AI assistance
Time frame: During 4 weeks of reading time
Description: Mean time taken to review an XR, with vs without AI assistance.
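The outcome measures above combine two evaluation modes: ROC analysis on the algorithm's continuous probability score, and sensitivity/specificity at a predefined binary cut-off. A minimal sketch of how these metrics relate, assuming hypothetical scores, labels, and cut-off (this is illustrative only, not the study's actual analysis code):

```python
# Sensitivity/specificity at an operating cut-off, and AUROC from
# continuous scores via the rank (Mann-Whitney U) formulation.
# `truth`: ground truth per case (1 = fracture, 0 = normal);
# `scores`: continuous probability output; all values are made up.

def sensitivity_specificity(truth, scores, cutoff):
    """Binary classification metrics at a predefined operating cut-off."""
    tp = sum(1 for t, s in zip(truth, scores) if t == 1 and s >= cutoff)
    fn = sum(1 for t, s in zip(truth, scores) if t == 1 and s < cutoff)
    tn = sum(1 for t, s in zip(truth, scores) if t == 0 and s < cutoff)
    fp = sum(1 for t, s in zip(truth, scores) if t == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(truth, scores):
    """AUROC = probability a random positive case scores above a random
    negative case (ties count half); equivalent to the trapezoidal area."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

truth = [1, 1, 1, 0, 0, 0]
scores = [0.92, 0.80, 0.40, 0.55, 0.20, 0.10]
sens, spec = sensitivity_specificity(truth, scores, cutoff=0.5)
print(sens, spec)                 # 2/3 and 2/3 at this cut-off
print(auroc(truth, scores))       # 8 of 9 positive/negative pairs ranked correctly
```

Note that AUROC is cut-off independent, which is why the measures list it separately from the fixed-threshold sensitivity and specificity.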

Countries

United Kingdom

Outcome results

None listed

Source: ClinicalTrials.gov · Data processed: Feb 5, 2026