Reading time: 5 minutes
AI avatars in corporate training: what changes when there's a face talking

Beñat Arrizabalaga
Content Specialist
Differentiation, Digitization

"Is the avatar actually necessary, or is it just to make it look better?"
That's the question almost every training manager asks when evaluating AI video platforms. And it's a fair one: the avatar adds cost, complexity, and setup time. If the outcome were the same without it, there would be no reason to include it.
The answer isn't obvious, and it depends on the type of content. But now there's data.
In 2024, TechSmith published a study with 768 workers across four countries (US, UK, Canada, and Australia) comparing different formats for training video, including versions with and without avatars. Quiz-based comprehension scores showed that the avatar picture-in-picture format produced around 76% correct answers, compared to approximately 66% for other formats ¹. A difference of about ten percentage points.
In 2025, a peer-reviewed study published in Education and Information Technologies (Springer) went further: combining an AI avatar with an AI voice produced a statistically significant improvement in engagement. Neither element alone reached that threshold. The combination did ².
What these studies measure is not that the avatar is magic. It's that the presence of a face activates attention mechanisms that static text or voice without an image don't trigger in the same way. It's not about design. It's about how the brain processes communication when there's a visible person speaking.
The avatar matters more in some contexts than others. It's worth being precise here.
Where the avatar has real impact:
Welcome content and company identity. Onboarding is exactly the case where a new employee needs to feel that someone is speaking to them, not reading a document at them. A corporate avatar conveys culture in a way a PDF simply can't.
Compliance and workplace safety training. Tone matters in these modules. An avatar that maintains eye contact and a serious register communicates the importance of the content better than a voice over slides.
Content that will be viewed many times or across many locations. When the same module reaches fifty branches or three consecutive seasonal cohorts, the consistency of the avatar is worth more than in a one-off video.
Where the avatar adds little:
Software tutorials or highly technical procedures where the screen is the central element. Screen recording with narration works well there, and the avatar takes up screen space without contributing.
Very short reinforcement capsules (under two minutes). The avatar's appearance time doesn't justify the visual overhead.
Not all avatars are equal on a corporate platform. The operational difference between types is more relevant than how they look.
Standard catalogue avatar. Pre-recorded professional models with a wide range of expressions and movements. Lets you publish the first module in hours with no setup process. The use case is clear: L&D teams that need speed and a professional result without the trainer's own image being part of the message.
Customisable catalogue avatar. Control over clothing and appearance to maintain brand consistency across all modules. Useful for groups with multiple properties or brands where corporate image matters and a recognisable, non-generic avatar is needed.
Custom avatar from recording. The L&D manager, operations director, or internal expert records five to fifteen minutes of video. The platform generates a 3D model of their face. From that point, that expert can "speak" in any new module without ever recording again.
This third type is the one that changes the equation most significantly for companies where technical knowledge is concentrated in a small number of people. The expert records once, and their voice and face are available to scale that knowledge indefinitely. No scheduling coordination, no studio, no new shoot every time a procedure changes.
There is one use case that illustrates better than any other why the custom avatar is worth what it costs in corporate training.
Picture the maintenance manager who has been on the plant floor for twenty years and knows exactly how to operate every machine. That knowledge lives in their head, not in any document. When they need to train someone new, everything stops. When they go on holiday, the process halts. When they retire, the knowledge disappears.
The custom avatar turns that knowledge into reusable content without that person having to be available every time. They record fifteen minutes. The platform generates the model. From then on, they can "explain" new procedures, regulatory updates, or onboarding modules without anyone claiming their time. The script changes, the avatar speaks. The expert remains the source, without the coordination overhead.
That argument has nothing to do with looking better. It is about the only practical way to scale tacit knowledge without the expert having to repeat it in person every time.
Once the evaluation reaches this point, the question shifts from "does it have an avatar?" to "does this avatar work for what I need?" The criteria that most often make the difference:
Custom avatar availability without an Enterprise plan. Some platforms reserve custom avatar creation for their highest-tier plans. If the use case is the internal expert described above, it's worth checking which plan unlocks that feature.
Voice quality in your actual working language. There's a difference between a platform that "supports Spanish" via auto-translation and one that has native voices with regional variants. For training in Castilian, Catalan, or any co-official language, the difference in naturalness is audible and affects the module's credibility.
SCORM/xAPI compatibility for LMS integration. If there's an existing corporate LMS, integration is a requirement, not a nice-to-have. It's worth verifying that the export includes completion tracking, not just the video file.
Data management with European guarantees. For companies in Spain or the EU, the location of data processing, ISO 27001 certification, and GDPR compliance are elimination criteria before any feature evaluation begins.
Ease of updating without re-recording. Maintenance cost matters as much as initial production cost. A platform where updating the script regenerates the video without touching the avatar is fundamentally different from one where any content change means a new recording.
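To make the "completion tracking, not just the video file" point from the SCORM/xAPI criterion concrete, here is a minimal sketch of the kind of record an LMS would expect when a learner finishes a module. It follows the general shape of an xAPI statement (actor, verb, object, result) and uses the "completed" verb from the ADL vocabulary. The learner, the module URL, and the function name are hypothetical placeholders, and this is not a description of how Vidext or any specific platform builds its export.

```python
import json
from datetime import datetime, timezone

# Verb IRI from the ADL xAPI vocabulary. Everything else below is a placeholder.
COMPLETED_VERB = "http://adlnet.gov/expapi/verbs/completed"


def build_completion_statement(learner_email, learner_name,
                               module_id, module_title, timestamp=None):
    """Return an xAPI-style statement saying a learner completed a module."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "actor": {
            "objectType": "Agent",
            "name": learner_name,
            "mbox": f"mailto:{learner_email}",
        },
        "verb": {"id": COMPLETED_VERB, "display": {"en-US": "completed"}},
        "object": {
            "objectType": "Activity",
            "id": module_id,  # hypothetical module URL
            "definition": {"name": {"en-US": module_title}},
        },
        # This is the part a bare video export cannot give you.
        "result": {"completion": True},
        "timestamp": ts.isoformat(),
    }


statement = build_completion_statement(
    "new.hire@example.com",
    "New Hire",
    "https://lms.example.com/modules/forklift-safety",
    "Forklift safety refresher",
)
print(json.dumps(statement, indent=2))
```

When evaluating a platform, the practical check is whether its export produces something equivalent to the `result` field above for each learner, so the LMS can report who finished which module.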
Does creating a custom avatar require a recording studio?
No. The standard process is a five- to fifteen-minute recording in any environment with good lighting and a clean background. No professional equipment or external production is needed. The platform generates the 3D model from that recording.
Can I update content without re-recording the avatar?
Yes — and that's exactly the differentiating point. You change the script, regenerate the video, and the original avatar delivers the new content. The expert doesn't need to be available for every procedure update.
Does an AI avatar count as in-person training for subsidised training schemes?
No. AI avatar modules fall under e-learning, which has its own eligibility requirements (traceability, tutoring, certification). Whether a specific module qualifies for subsidy reimbursement should be verified with your training subsidy manager or labour advisor.
What's the difference between a catalogue avatar and a custom avatar in terms of output quality?
The technical quality (resolution, movement, lip-sync) is comparable. The difference is in perceived naturalness depending on context. A professional catalogue avatar works well in procedure or compliance modules where the message matters more than who delivers it. The custom avatar has more impact when the audience recognises that person as an authority within the organisation: the operations director explaining a process change, the veteran technician walking through a critical procedure. The credibility of the speaker influences how the message lands.
How many languages does the avatar support?
It depends on the platform. Vidext supports over 120 languages with native voices by regional variant. The same avatar can "speak" in Castilian Spanish, English, Brazilian Portuguese, or Polish with different voices — useful for chains with operations across multiple countries that want to maintain the same visual identity with localised content.
The avatar isn't an aesthetic question. It's a question of what type of content you need to produce and how often.
For onboarding, compliance, and training where the tone and visible presence of a speaker affect how the message lands, the data points to a real improvement in comprehension and engagement. For software tutorials or highly visual procedures, the avatar adds little and can take up useful screen space.
The strongest argument is for scaling expert knowledge. When someone in your organisation knows things nobody else knows, and that knowledge needs to reach a lot of people consistently, the custom avatar is the most efficient way to solve that equation without depending on that person's availability every time.
© 2026 Vidext Inc.