Australian researchers test performance of ChatGPT on optometry exam questions

Scientists find the latest version of the AI tool “excelled” across a range of optometry and vision science written questions

Researchers from the University of New South Wales have described the performance of a large language model (LLM) across a variety of optometry and vision science written response questions.

Writing in Ophthalmic and Physiological Optics, scientists highlighted that earlier models of ChatGPT (GPT-3.5 and GPT-4) demonstrated “variable but generally passable performance” across the set of sample questions – which included past written exam questions.

The latest version of ChatGPT (o1) “excelled across all questions,” the authors noted.

“The results of the study have shown that LLMs are able to generate satisfactory responses to various assessment questions in the field of optometry and vision science, and in many cases excel at these,” the researchers highlighted.

“Subsequent models showed significantly greater capabilities over preceding models,” they added.

The authors also assessed the performance of ChatGPT as a grader of answers to written questions by exploring the concordance between the AI tool and a human grader.

They found that while ChatGPT graders generally awarded higher marks than human graders, the difference was statistically significant only for GPT-3.5.

“The result of the study suggests there is an urgent need for optometry and vision science educators to adopt new learning and teaching strategies in the ‘ChatGPT-era’,” the researchers stated.