Practical Application of Large Language Models in MaShang Consumer Finance: From Model Building to Deployment
This article details how MaShang Consumer Finance leverages large language models for sales, collection, and customer service, covering company background, AI research achievements, model training infrastructure, data‑quality and compliance challenges, prompt engineering, inference acceleration, evaluation methods, and lessons learned from real‑world deployment.
MaShang Consumer Finance is a technology‑driven financial institution with over 3,000 employees, more than 2,000 of whom are R&D staff, and a portfolio of 1,400+ patents, serving roughly 180 million users.
The company’s AI research institute has published nearly 30 top‑conference papers in the past two years and won several awards in domestic and international competitions.
Three major LLM‑driven scenarios have been deployed: (1) dialogue models for outbound sales, collection, and customer service, initially using a Chinese‑language chat model and later an English‑language version trained on 900k conversations, with FasterTransformer inference achieving 5.7 ms per token; (2) a document‑question‑answering model for knowledge‑base assistance; (3) an SQL‑generation model that first retrieves the relevant table before generating the query.
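The retrieve‑then‑generate pattern of the SQL scenario can be sketched as follows. This is a minimal illustration, not the production system: the table names, schemas, and keyword‑overlap heuristic are assumptions, and a template stands in for the LLM generation step.

```python
# Sketch of retrieve-then-generate NL2SQL: first pick the most relevant
# table, then generate a query against only that table's schema.
# Table names, columns, and the overlap heuristic are illustrative assumptions.

TABLES = {
    "loan_contract": {"contract_id", "user_id", "amount", "due_date"},
    "repayment": {"repay_id", "contract_id", "paid_amount", "paid_at"},
}

def retrieve_table(question_terms: set) -> str:
    """Return the table whose columns overlap the question terms the most."""
    return max(TABLES, key=lambda t: len(TABLES[t] & question_terms))

def generate_sql(question_terms: set) -> str:
    table = retrieve_table(question_terms)
    cols = ", ".join(sorted(TABLES[table] & question_terms)) or "*"
    # In production this step is the LLM; a template stands in here.
    return f"SELECT {cols} FROM {table}"
```

Restricting generation to a single retrieved schema shrinks the prompt and removes most opportunities for the model to hallucinate table or column names.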
To support rapid experimentation, MaShang built an internal platform that automates model training, inference, and deployment, enabling quick validation of open‑source large models.
Key challenges identified include the high data‑quality requirements for customer‑facing scenarios, the complexity of financial marketing sub‑scenarios, strict accuracy demands for user information, stringent compliance and privacy regulations, and the need for agile model updates to keep pace with evolving marketing strategies.
The training pipeline consists of base‑model selection, data preprocessing, fine‑tuning with reinforcement learning, prompt‑engineering for alignment, and comprehensive evaluation covering both objective metrics (e.g., compliance filtering, PPL) and subjective assessments of dialogue usefulness.
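Of the objective metrics above, perplexity (PPL) is the most mechanical: the exponentiated negative mean token log‑probability, where lower means the model finds the text more predictable. A minimal sketch:

```python
import math

def perplexity(token_logprobs: list) -> float:
    """PPL = exp(-(1/N) * sum(log p_i)) over the tokens of a sequence.
    token_logprobs holds natural-log probabilities; lower PPL is better."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))
```

For example, a sequence whose every token was assigned probability 0.5 has a perplexity of exactly 2.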
Prompt engineering is treated as knowledge engineering: factual information is injected via prompts or KV‑pairs, especially for long‑tail items such as contract amounts that the model cannot generate reliably on its own.
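The KV‑pair injection idea can be sketched as a prompt builder that pins verified facts, such as a contract amount, into the context so the model quotes rather than generates them. The field names and template wording below are illustrative assumptions:

```python
# Hedged sketch: inject verified facts into the prompt as key-value pairs.
# Field names and the instruction wording are illustrative assumptions.

def build_prompt(user_msg: str, facts: dict) -> str:
    fact_block = "\n".join(f"- {k}: {v}" for k, v in facts.items())
    return (
        "You are a customer-service agent. Use ONLY the facts below for "
        "any amounts or dates; never invent them.\n"
        f"Facts:\n{fact_block}\n"
        f"User: {user_msg}\nAgent:"
    )

prompt = build_prompt(
    "How much do I still owe?",
    {"contract_amount": "12,000 CNY", "due_date": "2024-07-15"},
)
```

Because long‑tail values like contract amounts come from the database at prompt‑build time, the model never has to memorize or reproduce them from its weights.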
Inference acceleration is achieved through FasterTransformer adaptations, operator fusion, and the use of a 2.6B‑parameter model, resulting in up to a 6× speedup on a single A800 GPU.
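The per‑token figure quoted earlier translates directly into a response‑latency budget, since decoding is sequential. A small arithmetic sketch (the 50‑token reply length is an illustrative assumption):

```python
def response_latency_ms(n_tokens: int, ms_per_token: float = 5.7) -> float:
    """Sequential decoding latency for a reply of n_tokens at a fixed
    per-token cost (5.7 ms/token is the figure reported in the talk)."""
    return n_tokens * ms_per_token

# At 5.7 ms/token, a 50-token reply decodes in about 285 ms, which fits
# inside a real-time dialogue turn.
latency = response_latency_ms(50)
```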
Hallucination detection combines online post‑processing (rule‑based filters, an NL2SQL fallback) with offline compliance checks, plus a hybrid approach that uses both a statistical n‑gram language model (KenLM) and the large model to ensure professional terminology and fluency.
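The shape of that hybrid check can be sketched as a rule layer plus a fluency layer. KenLM itself requires a trained ARPA/binary model file, so a toy add‑one‑smoothed bigram scorer stands in for it below; the forbidden patterns, reference corpus, and threshold are all illustrative assumptions:

```python
import math
import re
from collections import Counter

# Rule layer: reject outputs matching forbidden or placeholder patterns.
# These two patterns are illustrative assumptions, not the production rules.
FORBIDDEN = [re.compile(r"guaranteed? profit"), re.compile(r"\[AMOUNT\]")]

def passes_rules(text: str) -> bool:
    return not any(p.search(text.lower()) for p in FORBIDDEN)

# Fluency layer: a toy add-one-smoothed bigram log-probability,
# standing in for a KenLM model trained on in-domain scripts.
def bigram_logprob(text: str, corpus: list) -> float:
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        toks = ["<s>"] + sent.split()
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    toks = ["<s>"] + text.split()
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + len(vocab)))
        for a, b in zip(toks, toks[1:])
    )

def accept(text: str, corpus: list, threshold: float) -> bool:
    """An output passes only if it clears both the rules and the fluency bar."""
    return passes_rules(text) and bigram_logprob(text, corpus) >= threshold
```

The two layers are complementary: the rules catch known compliance violations deterministically, while the n‑gram score flags disfluent or off‑domain phrasing that no fixed rule anticipates.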
Evaluation relies on a dual test‑set strategy: objective compliance scoring and subjective quality scoring, with continuous feedback loops for data filtering, post‑processing, and human review.
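The dual test‑set strategy reduces, at reporting time, to aggregating two numbers per batch. A minimal sketch, where the record field names and the 1–5 rating scale are illustrative assumptions:

```python
def evaluate(batch: list) -> dict:
    """Aggregate dual test-set results: the objective compliance pass rate
    and the mean subjective quality rating (e.g. 1-5 from human review).
    The 'compliant'/'quality' field names are illustrative assumptions."""
    pass_rate = sum(x["compliant"] for x in batch) / len(batch)
    quality = sum(x["quality"] for x in batch) / len(batch)
    return {"compliance_pass_rate": pass_rate, "mean_quality": quality}
```

Tracking the two scores separately matters: a model can drift on compliance while subjective quality stays flat, and only the split report surfaces that for the feedback loop.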
Experience highlights include the effectiveness of starting from a strong base model when sufficient high‑quality data is available, the limited benefit of reinforcement learning when data is abundant, and the importance of diverse sampling to cover long‑tail intents.
Open problems remain in handling specific hallucination cases, long‑tail intent recognition, maintaining logical consistency across dialogue turns, and rapidly adapting to regulatory changes without costly retraining.
The Q&A section addresses topics such as SQL‑generation accuracy verification, the (non‑)use of fallback scripts and function calls, the absence of RAG/Agent techniques in the marketing model, the rationale for using base models, knowledge‑update cycles, model count per scenario, and latency considerations.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.