
Exploring Training and Alignment Techniques for Financial Large Models

The announcement details a DataFun Summit 2024 session where Du Xiaoman AI researcher Huo Liangyu will present on the challenges, development, and alignment methods of the Xuan Yuan financial large language model, highlighting RLHF techniques, data collection, and real‑world deployment insights for the finance sector.


On August 31, as part of the Digital Intelligence Financial Technology Summit at DataFun Summit 2024, Du Xiaoman AI algorithm researcher Huo Liangyu will present "Exploring Training and Alignment Techniques for Financial Large Models" at the Financial Large Model Landing Forum. Interested participants can scan the QR code to register for free and watch the live broadcast.

Detailed Introduction:

Huo Liangyu, Du Xiaoman AI Algorithm Researcher

Personal introduction: Huo Liangyu earned his Ph.D. from Beihang University, focusing on deep reinforcement learning and imitation learning, with research published in TPAMI, TCOM, AAAI, and other top venues. After his doctorate, he joined a joint post-doctoral program between Du Xiaoman and the Institute of Automation, Chinese Academy of Sciences, working on AI algorithm development. He led the reinforcement learning from human feedback (RLHF) alignment work for the Xuan Yuan financial large model: he built a complete RLHF training framework, addressed reward-model challenges, and improved the model's usefulness, safety, and financial capabilities, significantly strengthening its alignment with human values.

Talk Title: Exploring Training and Alignment Techniques for Financial Large Models

Talk introduction: In recent years, large language models (LLMs) have become a research hotspot. By scaling model size and training on massive data, LLMs acquire extensive knowledge and demonstrate strong general abilities such as understanding, reasoning, and logic. While LLMs promise new value for the financial industry, generic LLMs lack specialized financial knowledge and capabilities, and their training and deployment costs are prohibitive. To address these issues, we developed the Xuan Yuan financial model, supplementing it with high‑quality financial data and employing innovative pre‑training and supervised fine‑tuning (SFT) methods, which markedly improve financial knowledge and ability while preserving generality. On this foundation, we applied RLHF to further align the model’s values with human preferences, reducing safety risks and enhancing user experience.
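As a rough illustration of the preference-reward step mentioned above: reward models in RLHF pipelines are commonly trained with a pairwise (Bradley-Terry) loss over human-ranked response pairs, so that the model learns to score preferred responses higher. The sketch below is a minimal, hypothetical illustration of that loss on scalar reward scores, not Du Xiaoman's actual implementation:

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss used in reward-model training:
    -log(sigmoid(r_chosen - r_rejected)). The loss is small when the
    reward model scores the human-preferred response higher, and large
    when the preference is violated."""
    margin = r_chosen - r_rejected
    # -log(sigmoid(x)) computed stably as log(1 + exp(-x))
    return math.log1p(math.exp(-margin))

# The loss shrinks as the preferred response's reward grows
# relative to the rejected one.
print(round(bradley_terry_loss(2.0, 0.0), 4))  # 0.1269 -- preference respected
print(round(bradley_terry_loss(0.0, 2.0), 4))  # 2.1269 -- preference violated
```

In a full pipeline this loss would be backpropagated through a reward model (typically a language model with a scalar head), and the trained reward model then drives the RL stage; this sketch only shows the loss itself.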

We have released 6B, 13B, and 70B versions of the Xuan Yuan financial model, which perform excellently across various benchmark tests, covering a full spectrum of model sizes and establishing a comprehensive capability matrix. This session will mainly discuss the technologies used in the model’s development, including pre‑training, SFT, preference‑reward training, and RLHF, as well as the open‑source status and real‑world application cases. The main contents include:

1. Challenges of adapting generic LLMs to financial domains

2. The birth of the Du Xiaoman Xuan Yuan model

3. Exploration of training and alignment experiences for financial models

4. Deployment and future outlook of financial models

Audience benefits:

1. Understanding the training process of financial large models

2. Methods and experiences of model alignment

3. Real‑world deployment cases of financial large models

Deployment challenges and key solutions:

1. Collection of high‑quality financial data

2. Ensuring stability in application scenarios

AI, Large Language Models, RLHF, Financial AI, Finance, model alignment

Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
