AI language models outperform medical specialists

Congress participants compete against AI in knowledge test on acute kidney injury

23-Apr-2026

Can AI systems already retrieve medical knowledge better than doctors? Researchers at the University of Marburg and the University Hospital Giessen and Marburg (UKGM) investigated how well 13 of the best-known publicly available AI language models can recall and apply clinical knowledge on acute kidney injury in a standardized test setting. Dr. Philipp Russ and his team compared these models with 123 volunteers, including medical students and physicians in internal medicine. The volunteers were participants at the 131st Annual Congress of the German Society of Internal Medicine (DGIM), which took place in Wiesbaden in May 2025. With around 9,000 participants, it is one of the largest internal medicine congresses in Europe. The researchers, led by Dr. Philipp Russ and Prof. Dr. Ivica Grgic, report their results in the journal "Scientific Reports".

Machine advantage

Both groups completed the same German-language knowledge test on acute kidney injury, consisting of two realistic patient cases and 15 multiple-choice questions. The result was clear: the language models tested answered an average of 90 percent of the questions correctly, compared with just 49 percent for the congress participants. Several models answered all questions correctly and needed only a fraction of the time the participants took.

The human advantage

The study thus shows that large language models can now reproduce guideline-compliant medical knowledge very reliably in standardized question settings. At the same time, the authors stress that a good score on a knowledge test does not mean that these systems can, or even should, make clinical decisions independently. "Human judgment and clinical experience remain crucial. The ultimate responsibility for patient care remains clearly with the treating physicians," emphasizes Marburg nephrologist and AI expert Prof. Dr. Ivica Grgic.

Opportunity for everyday clinical practice

Study leader Dr. Philipp Russ comments on the results: "Large language models can provide factual medical knowledge very quickly. This is an opportunity for everyday clinical practice. At the same time, they have clear limitations: among other things, they can generate incorrect content, fail to capture people in all their complexity and lack empathy. A language model does not see, hear or feel what a person is really concerned about. This is precisely why it cannot replace medical action and clinical judgment. However, if used correctly, it could give us more time for what patients particularly need: attention, care and human closeness."

Perspective

From today's perspective, AI in the clinical context therefore appears primarily to be a supportive tool. At the same time, the high pace of innovation means that its further development cannot be reliably predicted, and the empirical basis for many areas of application is still limited. It remains to be seen whether and to what extent future systems will take on more autonomous functions, and to what extent such a development is desired and accepted by society. Integration into clinical practice should therefore be seen as a gradual process that requires continuous professional, regulatory and ethical reflection.

