AI avatars in corporate training: what changes when there's a face talking
"Is the avatar actually necessary, or is it just to make it look better?"
That's the question almost every training manager asks when evaluating AI video platforms. And it's a fair one: the avatar adds cost, complexity, and setup time. If the outcome were the same without it, there would be no reason to include it.
The answer isn't obvious, and it depends on the type of content. But now there's data.
What the research says about having a face on screen
In 2024, TechSmith published a study of 768 workers across four countries (US, UK, Canada, and Australia) comparing different training video formats, including versions with and without avatars. On quiz-based comprehension, the avatar picture-in-picture format produced around 76% correct answers, compared to approximately 66% for other formats ¹. That is a difference of about ten percentage points.
In 2025, a peer-reviewed study published in Education and Information Technologies (Springer) went further: combining an AI avatar with an AI voice produced a statistically significant improvement in engagement. Neither element alone reached that threshold. The combination did ².
These studies don't show that the avatar is magic. They suggest that the presence of a face activates attention mechanisms that static text or voice without an image don't trigger in the same way. It's not about design. It's about how the brain processes communication when there's a visible person speaking.
When the avatar changes the outcome, and when it doesn't
The avatar matters more in some contexts than others. It's worth being precise here.
Where the avatar has real impact:
Welcome content and company identity. Onboarding is exactly the case where a new employee needs to feel that someone is speaking to them, not reading a document at them. A corporate avatar conveys culture in a way a PDF simply can't.
Compliance and workplace safety training. Tone matters in these modules. An avatar that maintains eye contact and a serious register communicates the importance of the content better than a voice over slides.
Content that will be viewed many times or across many locations. When the same module reaches fifty branches or three consecutive seasonal cohorts, the consistency of the avatar is worth more than in a one-off video.
Where the avatar adds little:
Software tutorials or highly technical procedures where the screen is the central element. Screen recording with narration works well there, and the avatar takes up screen space without contributing.
Very short reinforcement capsules (under two minutes). At that length, the avatar's time on screen is too brief to justify the visual overhead.
