The AI chatbot ChatGPT has come close to passing the USMLE, the medical licensing exam used to test the knowledge and skills of physicians and medical students in the United States. The exam is considered extraordinarily difficult, typically requiring up to 400 hours of preparation.
The research was published in the prestigious peer-reviewed scientific journal PLOS Digital Health.
As the authors of the study noted: "We evaluate[d] the performance of ChatGPT, a non-domain specific LLM, on its ability to perform clinical reasoning by testing its performance on questions from the United States Medical Licensing Examination (USMLE)."
The researchers reported that ChatGPT performed at or near the passing threshold without any specialized training. In other words, the neural network was not fine-tuned on medical literature; it relied solely on information available on the open internet. The researchers also verified that the exam answers could not simply be found through a Google search. The AI not only produced coherent responses but also supported them with scientific reasoning.
ChatGPT scored between 52 and 75 percent across the three exam steps, while the passing threshold is roughly 60 percent. In 88.9 percent of its responses, the AI gave answers that the researchers judged to be "new, non-obvious, and clinically valid". In other words, the neural network managed to produce genuine medical insight.
The researchers added, however, that the neural network's capabilities should not be overestimated: it constructs plausible-sounding sentences from the material it has analyzed, which means ChatGPT can generate both highly non-trivial ideas and complete nonsense.
The article also highlighted that ChatGPT was considerably more accurate in its responses than PubMedGPT, a model trained specifically on biomedical literature.
Nevertheless, the researchers believe that artificial intelligence will not be able to replace doctors in the foreseeable future. It may, however, play an important role not only in training future medical students but also as an assistant to practicing physicians, prompting them to consider non-obvious diagnoses.