Designing AI-Powered Digital Humans for Bank Customer Service: A Practical Guide
This article explores the classification of digital humans, examines how they transform user experience, and presents a detailed design practice for AI‑driven virtual bank tellers, covering visual identity, interaction flow, ergonomic screen layout, and future considerations for natural multimodal interfaces.
1.1 Digital Human Service Classification
Interest in digital humans has surged alongside the metaverse concept, but the idea is nearly 40 years old, dating back to the first virtual singer, Lin Mingmei (Lynn Minmay). Broadly, any digital asset with a human‑like appearance and behavior counts as a digital human. Under this definition, the services on the market fall into three categories: the long‑standing IP type, the business type that carries out services for users, and the avatar type that acts as a one‑to‑one stand‑in for a real person.
Examples of the IP type include virtual idols such as Luo Tianyi and AYAYI, which emphasize personality and world‑building and often rely on high‑fidelity 3D and motion capture. Avatar‑type digital twins replicate real individuals, e.g., Travis Scott’s virtual concert in Fortnite. The business type, which this article focuses on, uses AI to drive human‑computer interaction and aims to create realistic service experiences.
1.2 Experience Transformation Brought by Digital Humans
Digital humans as information carriers make services more transparent. Traditional interfaces stack elements in fixed hierarchies, requiring users to learn and adapt, which reduces autonomy. Don Norman’s “borderless interface” concept seeks transparency, minimizing interaction friction. Digital humans become the interface itself, allowing users to bypass screen barriers and engage in natural dialogue.
The interface becomes a dynamic relational space. When a digital human handles the conversation, the interface’s value shifts to supporting the interaction between the user and the digital human. Unlike interfaces built on linear interaction logic, a digital human interface pushes information dynamically based on context, guides and recommends, and blends into the environment, mimicking the way relationships form between people.
2.0 Design Practice for Offline Bank Customer Service
2.1 Project Background
Low‑cost, high‑conversion virtual bank tellers are an ideal entry point for digital humans in finance. Traditional offline staff answer repetitive questions at high labor cost and with inconsistent quality. AI‑driven digital humans retain face‑to‑face communication while improving efficiency, stability, and configurability.
2.2 Scene Relationship Construction
Breaking device barriers: placing the “professional image” in the “professional scene”. The large screen serves only as a medium; the digital human should appear to stand naturally in the environment, gaining user trust without feeling intrusive. This raises two design questions: how to shape the professional image of a financial customer service agent, and how to create a natural, realistic dialogue scene.
A. Shaping the Professional Image
Appearance dimension – facial features, hairstyle, makeup, and attire are designed to convey intelligence and approachability. The target personality is smart and professional; avatar studies associate this with a slightly pointed face, prominent eyebrows, and a gentle smile.
Professional traits – a tidy, side‑parted hairstyle, modest makeup, and a uniform with visible branding convey competence and friendliness.
B. Creating a Natural Dialogue Scene
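To ensure comfortable interaction, the digital human’s height is set to 168 cm, taken as the average height of a Chinese adult. The avatar should appear to stand at a polite conversational distance of about 1.2 m, but the user actually stands about 0.5 m from the screen, so the avatar must be scaled on screen to look as if it were standing farther back.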
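The scaling step can be sketched with similar triangles. This is a minimal illustration, not the team's actual method: it assumes the screen behaves like a window, with the viewer's eye as the projection point and the avatar standing upright behind the glass. The function name and parameters are hypothetical.

```python
def on_screen_height_cm(real_height_cm: float,
                        viewer_to_screen_m: float,
                        viewer_to_avatar_m: float) -> float:
    """Height to draw the avatar on the screen plane so that it appears
    to stand at viewer_to_avatar_m from the viewer.

    Assumes a simple pinhole (similar-triangles) model: an object at
    distance D projects onto a plane at distance d with scale d / D.
    """
    if viewer_to_screen_m <= 0:
        raise ValueError("viewer_to_screen_m must be positive")
    if viewer_to_avatar_m < viewer_to_screen_m:
        raise ValueError("the avatar cannot appear in front of the screen")
    return real_height_cm * viewer_to_screen_m / viewer_to_avatar_m


# Figures from the text: 168 cm avatar, 0.5 m to the screen, 1.2 m apparent distance.
height = on_screen_height_cm(168, 0.5, 1.2)
print(f"Draw the avatar about {height:.1f} cm tall on screen")
```

Under this model the avatar is drawn at 0.5 / 1.2 of its nominal height. A production layout would also need to account for the viewer's eye height and the screen's mounting position, which the text does not specify.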
2.3 Interaction Design of the Digital Human Large Screen
A. Functional Partitioning of the Interactive Screen
The large screen is an ultra‑large touch surface. Because of its size, interaction shifts from finger movements to whole‑arm movements, and at a viewing distance of about 0.5 m the user can keep only part of the screen in focus. Information is therefore divided into core, secondary, and peripheral zones to guide the user’s gaze.
Best Interaction Zone
The optimal touch area is calculated from ergonomic data for a 55‑inch screen and a user 1.68 m tall, and the core cards are placed within this zone.
Best Visual Zone
Vertical: the comfortable range of eye movement runs from 25° above to 30° below the line of sight. Horizontal: within 60° to either side.
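To see what these angles mean on the screen surface, they can be converted to distances at the 0.5 m viewing distance mentioned earlier. The sketch below assumes a flat screen perpendicular to a horizontal line of sight; the function name and the dictionary keys are illustrative, not taken from the project.

```python
import math


def visual_zone_m(viewing_distance_m: float,
                  up_deg: float = 25.0,
                  down_deg: float = 30.0,
                  side_deg: float = 60.0) -> dict:
    """Extents of the comfortable visual zone on a flat screen, measured
    from the point directly in front of the viewer's eyes.

    Each extent is distance * tan(angle), which assumes the screen is
    perpendicular to the line of sight.
    """
    def extent(angle_deg: float) -> float:
        return viewing_distance_m * math.tan(math.radians(angle_deg))

    return {
        "up": extent(up_deg),            # above eye level
        "down": extent(down_deg),        # below eye level
        "half_width": extent(side_deg),  # to each side
    }


zone = visual_zone_m(0.5)
for name, metres in zone.items():
    print(f"{name}: {metres:.2f} m")
```

At 0.5 m this gives roughly 0.23 m above eye level, 0.29 m below it, and 0.87 m to each side, which suggests why core content is concentrated in a fairly narrow vertical band.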
B. Visual Design of the Interactive Screen
Clear Text
Text size must be at least 36 px for primary content and 24 px for secondary notes. Contrast ratios follow WCAG recommendations: at least 7:1 for main text (the AAA threshold for normal‑size text) and at least 4.5:1 for supporting text (the AA threshold), to accommodate a broad user base.
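A color pair can be checked against these thresholds with the WCAG contrast-ratio formula, which compares the relative luminance of the two sRGB colors. The sketch below implements that formula; the helper `meets_text_threshold` and its `primary` flag are hypothetical names that encode the 7:1 and 4.5:1 targets stated above.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel_8bit / 255
    # sRGB threshold; older WCAG text used 0.03928, which gives
    # identical results for 8-bit values.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    """WCAG relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_text_threshold(fg: tuple, bg: tuple, primary: bool) -> bool:
    """Check the targets from the text: 7:1 for primary, 4.5:1 for secondary."""
    return contrast_ratio(fg, bg) >= (7.0 if primary else 4.5)


white = (255, 255, 255)
grey = (0x76, 0x76, 0x76)
print(f"{contrast_ratio(grey, white):.2f}")
print(meets_text_threshold(grey, white, primary=False))
print(meets_text_threshold(grey, white, primary=True))
```

In this example, mid-grey text on white clears the 4.5:1 target for supporting text but falls short of the 7:1 target for main text, so it would only be suitable for secondary notes.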
Consistent Visual Focus
Conversation bubbles are fixed in position so the user’s gaze starts from the same area each time, reducing eye movement.
Reasonable Display of the Digital Human
The information area starts 10 cm below the avatar’s chin to avoid covering the face. When pop‑up cards appear, they may temporarily obscure the avatar, but the design ensures the avatar remains visible during normal operation.
Enhancing Secondary Visual Zone
Because users pay less attention to secondary zones, high‑contrast colors and dynamic cues are used to make the voice‑input prompts stand out.
3.0 Future Considerations
Future digital human applications will move toward real‑time natural interaction, leveraging advances in computer vision, speech understanding, and multimodal AI. Research will focus on balancing human‑centric emotional expression with task‑oriented efficiency, and on optimizing the interplay between the digital human and GUI across various scenarios.
Tencent Mobility Industry Design Center
The Tencent Mobility Industry Design Center (SMD) is Tencent's user experience team focused on the industrial internet.