Medical Student Reflection Exercises Created Using AI: Can We Tell the Difference, and Does It Matter?
If you’ve spent any time in medical
education recently—whether in lectures, clinical supervision, or curriculum
design—you’ve likely been a part of the growing conversation around student
(and resident/fellow) use of generative AI. From drafting SOAP notes to
summarizing journal articles, AI tools like ChatGPT are rapidly becoming
ubiquitous. But now we’re seeing them show up in more personal activities such
as reflective assignments. A new question has emerged: can educators really
tell the difference between a student’s genuine reflection and something
written by AI?
The recent article in Medical
Education by Wraith et al (1) took a shot at this question. They conducted an
elegant, slightly disconcerting study: faculty reviewers were asked to
distinguish between reflective writing submitted by actual medical students and
those generated by AI. The results? Only somewhat better than flipping a coin:
accuracy ranged from 64% to 75%, regardless of the faculty member's experience
or confidence. Reviewers did, however, seem to improve as they read more
reflections.
I’ll admit, when I first read this, I
had a visceral reaction. Something about the idea that we can’t tell what’s
“real” from what’s machine-generated in a genre that is supposed to be deeply
personal—reflective writing—felt jarring. Aren’t we trained to pick up on
nuance, empathy, sincerity? But as I sat with it, I realized the issue goes
much deeper than just our ability to “spot the fake.” It forces us to confront
how we define authenticity, the purpose of reflection in medical education, and
how we want to relate to the tools that are now part of our students’ daily
workflows.
What Makes a Reflection Authentic?
We often emphasize reflection as a
professional habit: a way to develop clinical insight, emotional intelligence,
and lifelong learning. But much of that hinges on the assumption that the act of
writing the reflection is what promotes growth. If a student bypasses that
internal process and asks an AI to “write a reflection on breaking bad news to
a patient,” I worry that the learning opportunity is lost.
But here’s the rub: the Wraith study
didn’t test whether students were using AI to replace reflection or to aid it.
It simply asked whether educators could tell the difference. And they could not
do that reliably. This suggests that AI can replicate the tone, structure, and
emotional cadence that we expect a medical student to provide in a reflective
essay. That is both fascinating and problematic.
If AI can mimic reflective writing
well enough to fool seasoned educators, then maybe it is time to reevaluate how
we assess reflection in the first place. Are we grading sincerity? Emotional
language? The presence of keywords like “empathy,” “growth,” or “uncertainty”?
If we do not have a robust framework for evaluating whether reflection is
actually happening—as an internal, cognitive-emotional process—then it
shouldn’t surprise us that AI can fake it by simply checking the boxes.
Faculty Attitudes: Cautious Curiosity
Another recent study, this one in the Journal
of Investigative Medicine by Cervantes et al (2), explored how medical
educators are thinking about generative AI more broadly. They surveyed 250
allopathic and osteopathic medical school faculty at Nova Southeastern
University. Their results revealed a mix of excitement and unease. Most saw
potential for improving education—particularly through more efficient
research, tutoring, task automation, and improved content accessibility—but
they were also deeply concerned about professionalism, academic integrity,
the loss of human interaction in feedback, and overreliance on AI-generated
content.
Interestingly, one of the biggest
predictors of positive attitudes toward AI was prior use. Faculty who had
experimented with ChatGPT or similar tools were more likely to see educational
value and less likely to view it as a threat. That tracks with my own anecdotal
experience: once people see what AI can do—and just as importantly, what it
can’t do—they develop a more nuanced, measured perspective.
Still, the discomfort lingers. If
students can generate polished reflections without deep thought, is the
assignment still worth doing? Should we redesign reflective writing tasks to
include oral defense or peer feedback? Or should we simply accept that AI will
be part of the process and shift our focus toward cultivating meaningful inputs
rather than fixating on outputs?
What about using AI-augmented reflection?
Let me propose a middle path. What if
we reframe AI not as a threat to reflective writing, but as a catalyst? Imagine
a student who types out some thoughts after a tough patient encounter, then
asks an AI to help clarify or expand them. They read what the AI produces,
agree with some parts, reject others, revise accordingly. The final product is
stronger—not because AI did the work, but because it facilitated a richer
internal dialogue.
That’s not cheating. That’s
collaboration. And it’s arguably closer to how most of us write in real
life—drafting, editing, bouncing ideas off others (human or machine). Of
course, this assumes we teach students to use AI ethically and reflectively,
which means we need to model that ourselves. Faculty development around AI
literacy is no longer optional. We must move beyond fear-based policies and
invest in practical training, guidelines, and conversations that encourage
responsible use.
So, where do we go from here?
A few concrete steps seem worth
considering:
1. Redesign reflective assignments. Move beyond short essays. Try audio
reflections, peer feedback, or structured prompts that emphasize personal
growth over polished prose.
2. Focus on process, not just product. Ask students to document how they
engaged with the reflection—did they use AI? Did they discuss it with a peer or
preceptor? Did it change their thinking?
3. Embrace transparency. Normalize the use of AI in education and ask
students to disclose when and how they used it. Make that part of the learning
conversation from the beginning.
4. Invest in AI literacy. Faculty need space and time to learn what these
tools can and can’t do. The more familiar we are as faculty, the better we can
guide our students.
5. Stay curious. The technology isn’t going away. The sooner we stop
wringing our hands and start asking deeper pedagogical questions, the better
positioned we’ll be to adapt with purpose.
In the end, the real question isn’t
“Can we tell if a reflection is AI-generated?” It’s “Are we creating learning
environments where authentic reflection is valued, supported, and developed—whether
or not AI is in the room?”
If we can answer yes to that, then
maybe it doesn’t matter so much who—or what—wrote the first draft.
References
(1) Wraith et al. Can educators distinguish between medical student and generative AI-authored reflections? Med Educ. 2025;1-8. doi:10.1111/medu.15750
(2) Cervantes et al. Decoding medical educators' perceptions on generative artificial intelligence in medical education. J Invest Med. 2024;72(7):633-639. doi:10.1177/10815589241257215