Artificial Intelligence · 18 min read

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end-to-end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM-driven dialogue, knowledge-augmented retrieval, long-context handling, safety measures, multimodal expression (voice and facial animation), and system-level performance optimizations for real-time deployment.

DataFunSummit

The presentation traces Tencent Games' journey in creating intelligent NPCs, from early text-based Q&A bots, to voice-interactive assistants, and finally to the multimodal NPC "JueZhi An Nuan" in the game Tianya Mingyue Dao.

AI Opportunities: LLMs enable open-ended dialogue, planning, reflection, and memory for NPCs, overcoming the high cost and repetitiveness of scripted dialogue trees.

Practical Cases: Nvidia's ACE for Games (2023) showcases richer character personalities, Stanford's virtual town demonstrates multi-agent planning and reflection, and in-game examples show LLM-generated NPC conversations. Together these illustrate the need for accuracy, realism, and safety in NPC responses.

Our Implementation focuses on three pillars:

Dialogue – core open-domain conversation, enhanced with retrieval-augmented generation (RAG) for factual accuracy, with multi-turn context handled via multi-dimensional knowledge indexes and a fine-tuned m3e encoder.

Expression – real-time voice synthesis (TTS) with custom voice prompts, plus high-fidelity facial animation driven by speech-to-control-parameter models published at ICCV 2023.

System – a parallel, streaming architecture and model inference acceleration (int8 quantization, prefix caching, flash decoding, dynamic batching) that together reduce end-to-end latency from over 10 s to roughly 1.5 s.
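To make the int8 quantization mentioned above concrete, here is a minimal sketch of symmetric per-tensor quantization: floats are mapped onto the integer range [-127, 127] via a single scale factor. This is illustrative only; production inference stacks typically quantize per-channel with calibrated scales.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats in
    [-max|w|, max|w|] onto integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard against all-zero tensors
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02]
q, s = quantize_int8(w)  # q == [50, -127, 2]
```

Storing weights as int8 roughly quarters memory traffic versus float32, which is one reason it features prominently in the latency reductions described here.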

Dialogue Accuracy: RAG combines knowledge-base retrieval with LLM prompting, raising answer accuracy from 62% to 74% after multi-dimensional indexing and clustering. Context-aware retrieval can use either query rewriting or multi-turn encoders, with the latter proving more effective.
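The retrieval step can be sketched as follows. A toy bag-of-words embedding stands in for the fine-tuned m3e encoder, and the knowledge-base entries and function names are illustrative, not from the actual deployment:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding standing in for the m3e encoder."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, knowledge_base, top_k=2):
    """Rank knowledge entries by similarity to the query, best first."""
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, knowledge_base):
    """Inject the retrieved facts into the LLM prompt for factual grounding."""
    facts = "\n".join(f"- {f}" for f in retrieve(query, knowledge_base))
    return f"Known facts:\n{facts}\n\nPlayer: {query}\nNPC:"

kb = [
    "JueZhi An Nuan is an NPC in Tianya Mingyue Dao",
    "Tianya Mingyue Dao is a martial arts themed game",
    "The weather system changes daily",
]
prompt = build_prompt("Who is JueZhi An Nuan?", kb)
```

For multi-turn retrieval, the query passed to `retrieve` would be either the rewritten standalone question or the encoding of the full dialogue history, per the comparison above.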

Realism: NPCs must respect the game world's lore and exhibit long-term memory, planning, and a consistent personality. Prompt engineering and extended context windows (up to 24k tokens) are employed to support these advanced behaviors.
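One practical consequence of a fixed context window is that conversation history must be trimmed to fit while the persona prompt stays pinned. A minimal sketch, assuming a crude whitespace token count (real systems use the model's tokenizer):

```python
def trim_history(persona, turns, budget=24000, count=lambda s: len(s.split())):
    """Keep the persona prompt pinned and evict the oldest turns
    until the total (approximate) token count fits the budget."""
    kept = list(turns)
    while kept and count(persona) + sum(count(t) for t in kept) > budget:
        kept.pop(0)  # oldest turn goes first
    return [persona] + kept

persona = "You are JueZhi An Nuan, a calm and courteous NPC."
turns = [f"turn {i}: " + "word " * 100 for i in range(300)]
context = trim_history(persona, turns, budget=5000)
```

More elaborate schemes summarize evicted turns into a memory store instead of dropping them outright, which is closer to the long-term memory behavior described above.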

Safety: A three-stage pipeline detects sensitive content via keyword filters and LLM classification, generates safe responses, and applies post-moderation actions, keeping the harmful-content rate below 0.5%.
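The three-stage pipeline can be sketched as below. The keyword list, the classifier stub, and the fallback line are all placeholders; in the actual system the second stage calls an LLM safety classifier rather than reusing the keyword filter:

```python
SENSITIVE = {"violence", "gamble"}  # illustrative keyword list, not the production one

def keyword_filter(text):
    """Stage 1: fast lexical screen for sensitive terms."""
    return any(word in text.lower() for word in SENSITIVE)

def llm_classify(text):
    """Stage 1b (stub): a real deployment would call an LLM-based
    safety classifier here; we fall back to the keyword screen."""
    return keyword_filter(text)

def moderate(user_msg, generate):
    """Stage 2 generates a reply only for safe inputs; stage 3
    post-moderates the model's own output before it ships."""
    fallback = "Let us speak of something else, traveler."
    if keyword_filter(user_msg) or llm_classify(user_msg):
        return fallback
    reply = generate(user_msg)
    if keyword_filter(reply):  # post-moderation on the generated text
        return fallback
    return reply
```

Routing both the player's input and the model's output through the same filters is what lets the deployed system drive harmful outputs toward zero.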

Performance: Independent modules (dialogue, TTS, action generation, safety checks) run in parallel, and short sentences are streamed as they are generated, reducing perceived latency. Model inference is optimized with int8 quantization, prefix caching, flash decoding, and hardware-level operator acceleration. Online metrics show 94% response accuracy, 99% persona compliance, and zero harmful outputs.
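Sentence-level streaming is the key to the perceived-latency win: the LLM's token stream is cut at sentence boundaries so TTS can start speaking the first sentence while later ones are still being generated. A minimal sketch (the token stream below is fabricated for illustration):

```python
import re

def stream_sentences(token_stream):
    """Cut an incoming token stream at sentence boundaries and yield
    each complete sentence as soon as it is available, so downstream
    TTS can begin before generation finishes."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # flush on sentence-ending punctuation (ASCII and CJK)
        while (m := re.search(r"[.!?。！？]", buffer)):
            yield buffer[: m.end()].strip()
            buffer = buffer[m.end():]
    if buffer.strip():  # flush any trailing partial sentence
        yield buffer.strip()

tokens = ["Hel", "lo there", ". How", " are you?", " Fine."]
sentences = list(stream_sentences(tokens))
# → ["Hello there.", "How are you?", "Fine."]
```

Because each yielded sentence is handed to TTS immediately, time-to-first-audio depends only on the first sentence, not on the full reply length.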

Future Outlook: The team expects continued NPC innovation, more cost-effective inference (e.g., edge-cloud collaboration), selective LLM usage, and new gameplay possibilities driven by AI.

The Q&A session covers model selection (m3e, bge), training for world consistency, comparisons of LLM-based vs. traditional systems, evaluation metrics beyond accuracy, and per-character fine-tuning strategies.

Tags: Performance Optimization, AI, LLM, RAG, Multimodal, Game AI, NPC
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
