Artificial intelligence less accurate than radiology trainees in mock examination
Published results showed an artificial intelligence candidate was less accurate on a mock Fellowship of the Royal College of Radiologists examination than human trainees who had recently passed the examination.
Researchers performed a prospective, multi-reader diagnostic accuracy study that compared results of a mock Fellowship of the Royal College of Radiologists (FRCR) examination for one AI candidate (Smarturgences, Milvue) and 26 radiologists who had passed the FRCR examination in the preceding year. According to the study, the examination required participants to interpret 30 acute chest and musculoskeletal radiographs with at least 90% accuracy to pass.
Overall, the AI candidate was unable to pass any of the 10 mock examinations when marked against the same strict criteria as the human trainees; however, the AI passed two of the 10 examinations when uninterpretable images (images the AI had not been trained on) were excluded.
Excluding uninterpretable images, AI had overall sensitivity, specificity and accuracy rates of 83.6%, 75.2% and 79.5%, respectively, while the human trainees passed an average of four out of 10 examinations with overall sensitivity, specificity and accuracy rates of 84.1%, 87.3% and 84.8%, respectively.
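The figures reported above are standard confusion-matrix metrics. As a minimal sketch of how they are derived, the Python snippet below computes sensitivity, specificity and accuracy from raw true/false positive and negative counts; the example counts are hypothetical placeholders, not figures from the study.

```python
# Minimal sketch: deriving sensitivity, specificity and accuracy
# from confusion-matrix counts. The example counts are hypothetical
# and are not taken from the study.

def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic accuracy metrics from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),                 # true-positive rate
        "specificity": tn / (tn + fp),                 # true-negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),   # overall agreement
    }

# Hypothetical example: 46 abnormal and 54 normal radiographs
print(diagnostic_metrics(tp=38, fp=13, tn=41, fn=8))
```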
Researchers noted that most AI mistakes involved interpretation of musculoskeletal radiographs rather than chest radiographs.
“Further training and revision are strongly recommended, particularly for cases the AI considers ‘noninterpretable,’ such as abdominal radiographs and those of the axial skeleton,” the researchers wrote in the study. “Increased familiarity with less common and more subtle bony pathologies will also help to boost the chances of examination success,” they concluded.