
Standing with Science: What New Research Featuring SimFlow.ai Means for Healthcare Educators

A peer-reviewed study published in JMIR Formative Research examined whether conversational AI could help address the challenge of delivering communication skills training at scale. The study is an independent evaluation of SimFlow.ai in real UK primary care training, with findings that are specific, honest, and worth reading in full.

Section 1

Standing with Science

This World Health Day, we are asking what it means to 'stand with science' in the world of healthcare simulation. When a new tool enters clinical training, what should we reasonably expect of it, evidence-wise? Science surrounds the industry; it is what simulation strives to replicate. So, it follows that the tools we use to practise exacting medical science should be subjected to that same objective scientific scrutiny – to ensure simulation is standing with science every step of the way.

Cost, logistics, and a lack of standardisation at scale are fundamental challenges for communication skills training. They create consequential gaps in communication competencies that continue to affect patient care. It is against this backdrop that a new peer-reviewed study evaluating SimFlow.ai has been published, and we welcome both the strengths it finds and the areas for improvement it identifies with equal regard.

Section 2

The Latest JMIR Research on SimFlow.ai

The research is a meaningful, well-designed early-stage study of how agentic AI scales in medical education. It used a cross-sectional design, running the SimFlow.ai platform across more than 70 simultaneous teaching practice interfaces in a single morning session. This generated 47 questionnaire responses from medical students and GPs, drawn from a census sample of one UK medical school and its 70 associated general practice training sites.

Participants all completed the same standardised clinical scenario and evaluated the platform across five domains:

  • AI realism
  • Medical content
  • Educational value
  • Usability
  • Feedback capabilities

These domains were selected for their direct relevance to the key implementation questions that determine SimFlow.ai's efficacy as a supplementary training tool.

The study's own authors note its scope limitations plainly: single institution, cross-sectional design, and some collaborative survey responses that may carry group dynamic effects. An adjusted response rate of 61.3% and a pre-registered protocol on the Open Science Framework reflect the methodological care taken. A prospectively registered follow-up study is already underway.

Section 3

What the Research Found

The study identified four qualitative themes tied to the effectiveness and future directions of agentic AI in scaled medical education: clinical authenticity, interactional limitations, educational potential, and implementation considerations. Alongside these themes, the quantitative findings reveal a clear gap between the high accuracy of the AI's medical 'logic' and the ongoing challenges of natural, human-like conversation.

Strengths
  • Medical content was the standout performer: 97.8% of participants rated clinical plausibility highly, covering medically accurate symptoms and coherent patient histories.
  • Educational value was rated strongly, particularly for improving clinical reasoning and preparing learners for real-world interactions and clinical decision-making.
  • The platform was easy to use and deployed without any reported safety concerns or clinically inappropriate outputs across more than 70 simultaneous interfaces.
Limitations
  • Conversational fidelity scored moderately. Participants noted repetitive questioning patterns, unnatural response lengths, and the simulated patient using medical jargon a real patient would be unlikely to use.
  • Feedback capabilities scored lowest of all five domains. Participants did not find AI-generated feedback equivalent to facilitator feedback.

"This kind of balance matters. In healthcare education, it is not enough to be innovative; tools also need to be practical, evidence-led, and honest about where they still need to improve." - Dr. Jon Truvey

A Consideration Point: The Expectation Effect

Participants with prior AI experience rated realism significantly higher, suggesting that how learners are introduced to AI simulation is an important variable in shaping their expectations and experiences. Given this expectation effect, onboarding design matters as much as platform design.

Section 4

What the Findings Mean for Communication Skills Training Design

These findings offer practical implications for implementing AI simulation software in communication training settings. Perhaps the most significant of these lies in the relationship between two of the scores: educational value was rated positively even where conversational realism scored only moderately. This finding directly challenges the assumption that simulation must feel completely real to produce learning.

Clinical accuracy holds the most value for learning outcomes
Sound medical content outweighed robotic dialogue: participants could work effectively with the simulation even where conversational naturalness fell short. This suggests that functional fidelity is the primary standard against which AI simulation tools should first be assessed.
The research supports a specific use case
SimFlow.ai shows aptitude for low-stakes, supplementary scenario practice for early-stage learners preparing for patient contact. The authors are explicit that current capabilities are not yet suited to high-stakes lone assessment.
Onboarding and expectation-setting are part of the training design question
The expectation effect finding suggests that learners' prior experience with AI shapes their perception of the tool's realism from first use. Institutions deploying AI simulation should consider how they frame the tool to learners before they begin.
The research identifies clear development priorities
The authors call for continued improvement in dialogue naturalness, more sophisticated scenario performance feedback, and greater stakeholder involvement in platform design. These priorities are named openly by independent researchers and reflected in a follow-up study already underway.
Section 5

SimFlow.ai — Standing with Science at Every Step

This platform and research are a clear example of what evidence-into-action looks like in healthcare simulation education: an AI tool open to real training deployment, user scrutiny, peer-reviewed evaluation, and considered improvement — which is precisely the empirical standard of science and simulation.

SimFlow.ai is addressing a real and well-documented scale issue in communication skills training, in as considered a way as current AI technology allows. Its realism will continue to develop in step with the capabilities of that technology. What the research gives us, for now, is an honest account of where those capabilities currently sit.

At Sim & Skills, we believe the value of AI in healthcare education is directly proportional to the rigour and honesty of its evaluation. The research gives us a clear-eyed basis for understanding what SimFlow.ai currently does well and where development continues. We think that is the right foundation for any training provider considering the platform — and we offer it as the basis for you to reach your own view.

Read the Research

Full Paper Citation & Further Reading

Full paper citation
Scaling Multimodal Agentic AI in Medical Education: Multisite Cross-Sectional Study of Simulation Effectiveness in Primary Care
Jacobs C, Johnson H, Brownlie K, Joiner R, Thompson T.
JMIR Formative Research 2026