ChatGPT scores 46% on multiple-choice ophthalmology test

Canadian scientists have described the results of an AI chatbot answering practice questions for board certification in ophthalmology


An artificial intelligence (AI) chatbot that answered 125 multiple-choice questions for board certification in ophthalmology scored 46%.

The study, published in JAMA Ophthalmology, found that ChatGPT improved its score by 10% when retested a month after the first attempt.

ChatGPT selected the most common answer chosen by ophthalmology trainees 44% of the time.

Lead author Andrew Mihalache, of Western University, noted that the chatbot performed best on general medicine questions, answering 79% correctly.

“On the other hand, its accuracy was considerably lower on questions for ophthalmology subspecialties. For instance, the chatbot answered 20% of questions correctly on oculoplastics and 0% correctly from the subspecialty of retina. The accuracy of ChatGPT will likely improve most in niche subspecialties in the future,” he said.

Principal investigator Dr Rajeev Muni, of St Michael’s Hospital, which led the study, said ChatGPT may play an increasing role in medical education and clinical practice over time.

“However, it is important to stress the responsible use of such AI systems,” Muni emphasised.

“ChatGPT as used in this investigation did not answer sufficient multiple-choice questions correctly for it to provide substantial assistance in preparing for board certification at this time,” he concluded.