Beyond General LLMs: Efficient Adaptation and Data Value Mining for Finance

This article walks through a systematic practice for deploying large models in banking scenarios: the "iceberg" challenges of finance, data and knowledge engineering, reverse knowledge extraction with REER, multi-dimensional synthetic data generation, automatic prompt optimization (APO), cost-aware fine-tuning, inference acceleration, and emotion-value evaluation. It closes with actionable guidelines.

DataFunSummit

1. The "Iceberg" Challenge in Finance

Financial services present a deep‑water environment for large models: complex business logic, strict compliance, limited GPU budgets, and often no existing documentation or labeled data. The authors encountered a real case where customer‑service answers were purely oral, leaving no QA pairs or manuals to fine‑tune a model.

Three major deployment obstacles were identified:

Zero‑data dilemma: In a sales‑opportunity discovery scenario, neither input (X) nor label (Y) existed, preventing any supervised fine‑tuning.

Evaluation blind spot: Generative outputs such as marketing strategies lack a standard answer, making objective quantitative evaluation difficult.

Compute & compliance barrier: High inference cost, strict latency, and hardware budget constraints force on‑premise deployment; banks are unwilling to host dozens of separate models.

2. Data & Knowledge Engineering

Traditional QA retrieval relies on exact matching of a complete question bank—"rote memorization"—which fails on long‑tail or edge cases. Large‑model inference is likened to an "open‑book exam": feeding a high‑quality business manual as context enables flexible reasoning.
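The "open-book exam" idea can be sketched as a prompt builder that places the business manual in front of the question. This is a minimal illustration, not the authors' implementation: the function name `build_open_book_prompt`, the prompt wording, and the `max_manual_chars` limit are all assumptions made for the example.

```python
def build_open_book_prompt(manual: str, question: str,
                           max_manual_chars: int = 4000) -> str:
    """Assemble a prompt that puts the business manual before the question.

    Hypothetical helper: the section markers and wording are illustrative only.
    """
    # Truncate defensively so a long manual cannot crowd out the question.
    context = manual[:max_manual_chars]
    return (
        "You are a bank customer-service assistant.\n"
        "Answer using only the manual below. If the manual does not cover "
        "the question, say so.\n\n"
        f"### Manual\n{context}\n\n"
        f"### Question\n{question}\n"
    )


prompt = build_open_book_prompt(
    manual="Section 1: Corporate accounts require a business licence.",
    question="What do I need to open a corporate account?",
)
```

Unlike exact matching against a question bank, nothing here depends on the question having been seen before; the quality of the answer depends on the quality of the manual.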

To construct such manuals, the team adopted the REER algorithm (originally from ByteDance) for reverse knowledge extraction:

Use a powerful LLM to generate a reasoning trace between input X and output Y.

Strip the trace of "inner monologue" to keep only objective facts, producing an intermediate variable SUM.

Aggregate SUM by business categories (≈10‑12 classes) to form an action‑manual article.

Iteratively refine using two metrics: Perplexity (lower perplexity indicates the trace helps forward reasoning) and end‑to‑end similarity between model answers and high‑quality human responses.
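The steps above can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the REER algorithm itself: it assumes the token log-probabilities of the answer Y (given X and a candidate trace) have already been obtained from a model, and the helper names `perplexity`, `pick_best_trace`, and `build_manual` are hypothetical.

```python
import math
from collections import defaultdict


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean token log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_best_trace(candidates):
    """Return the trace under which the answer tokens have the lowest perplexity.

    `candidates` maps a trace (str) to the log-probabilities of the answer Y
    given X and that trace. Obtaining those values is outside this sketch.
    """
    return min(candidates, key=lambda trace: perplexity(candidates[trace]))


def build_manual(summaries):
    """Group (category, SUM) pairs into one manual section per category."""
    sections = defaultdict(list)
    for category, fact in summaries:
        if fact not in sections[category]:  # drop verbatim duplicates
            sections[category].append(fact)
    return {
        category: "\n".join(f"- {fact}" for fact in facts)
        for category, facts in sections.items()
    }


# Toy numbers, purely illustrative (not from the talk).
candidates = {
    "trace A": [-0.2, -0.1, -0.3],
    "trace B": [-1.2, -0.9, -1.5],
}
best = pick_best_trace(candidates)

manual = build_manual([
    ("account opening", "A business licence is required."),
    ("account opening", "A business licence is required."),
    ("loans", "Collateral is assessed before approval."),
])
```

A lower perplexity on Y means the model finds the known answer less surprising once the trace is in context, which is the sense in which the trace "helps forward reasoning".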

After the first manual version, continuous iteration follows the Anthropic Captain approach: LLM‑driven Wiki creation, Markdown storage, and version control with CLI and Git to maintain a living knowledge base.

Breaking the Zero‑Data Barrier with Multi‑Dimensional Synthesis

Because neither X nor Y existed, the team built a synthetic data pipeline:

Customer‑profile diversity: Simulate target enterprises (e.g., light‑asset firms), trade features, and funding urgency.

Recording‑scene diversity: Vary physical noise, casual chat, and adversarial negative samples (e.g., personal mortgage discussion).

Speaker‑behavior diversity: Contrast a "cautious rookie" (linear logic, self‑correction, stutter) with a "seasoned veteran" (conclusion‑first, rapid entry).
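One way to picture the three diversity axes is as a cross product of generation specs. The sketch below is an assumption-laden illustration rather than the team's pipeline: the axis values are paraphrased from the list above, and the names `make_specs`, `is_negative`, and the prompt template are hypothetical.

```python
import itertools
import random

# Illustrative axis values paraphrased from the three dimensions above.
PROFILES = ["light-asset firm, urgent funding need",
            "trading company, no urgent need"]
SCENES = ["quiet office", "noisy background", "casual chat",
          "off-topic: personal mortgage"]
SPEAKERS = ["cautious rookie", "seasoned veteran"]


def make_specs(sample_size, seed=0):
    """Sample distinct (profile, scene, speaker) combinations."""
    combos = list(itertools.product(PROFILES, SCENES, SPEAKERS))
    rng = random.Random(seed)  # seeded so the sample is reproducible
    chosen = rng.sample(combos, k=min(sample_size, len(combos)))
    specs = []
    for profile, scene, speaker in chosen:
        specs.append({
            "profile": profile,
            "scene": scene,
            "speaker": speaker,
            # Assumption: off-topic scenes serve as adversarial negatives.
            "is_negative": scene.startswith("off-topic"),
            "prompt": (
                "Write a transcript of a sales call. "
                f"Customer: {profile}. Setting: {scene}. "
                f"Relationship manager style: {speaker}."
            ),
        })
    return specs


specs = make_specs(sample_size=6)
```

Because each spec is fixed before any text is generated, the attributes it carries can double as labels for the generated transcript, which is what makes synthesis useful when no labeled data exists.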

Testing mainstream LLMs such as Gemini and MiniMax 2.5/2.7 on a small handcrafted dataset yielded an F1 of only ~70%. With the candidate teacher models themselves this weak on the task, traditional distillation from them was unlikely to lift a smaller model.
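For reference, F1 is the harmonic mean of precision and recall. The short example below shows the arithmetic; the counts are made up to land on 0.70 and are not taken from the team's evaluation.

```python
def f1_score(true_positive, false_positive, false_negative):
    """F1 = harmonic mean of precision and recall, 0.0 when undefined."""
    predicted = true_positive + false_positive
    actual = true_positive + false_negative
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 7 hits, 3 false alarms, 3 misses.
score = f1_score(true_positive=7, false_positive=3, false_negative=3)
```

With these counts, precision and recall are both 0.7, so the F1 is 0.7 as well.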

3. Large‑Model Practice in Banking

APO (Automatic Prompt Optimization) is presented as a high‑quality baseline generator. Key observations:

System prompts should be concise; user prompts exert stronger constraints.

APO can produce a first draft; minimal human edits remove over‑fitting.

Batch processing is essential; APO behaves like a Monte‑Carlo process and lacks the stability of gradient descent.

Only fixing error samples does not improve generalization; full‑trace optimization (functions, tool descriptions) yields better results.
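The batch-processing and Monte-Carlo observations can be made concrete with a toy search loop. This is a sketch under heavy assumptions, not the APO method used in the talk: `evaluate` and `propose` are stand-ins (a real system would call an LLM to rewrite the prompt and grade model outputs), and all names and parameters are hypothetical.

```python
import random


def evaluate(prompt, batch):
    """Toy scorer (assumption): fraction of batch items whose required
    instruction appears in the prompt. A real scorer would grade model outputs."""
    return sum(item in prompt for item in batch) / len(batch)


def propose(prompt, rng, edits):
    """Toy proposal step (assumption): append one randomly chosen instruction."""
    return prompt + " " + rng.choice(edits)


def optimize_prompt(seed_prompt, dataset, edits,
                    rounds=20, batch_size=3, n_candidates=4, seed=0):
    """Random-proposal search: keep whichever candidate scores best per batch."""
    rng = random.Random(seed)
    best = seed_prompt
    for _ in range(rounds):
        batch = rng.sample(dataset, k=min(batch_size, len(dataset)))
        # The incumbent goes first so that ties keep the current prompt.
        candidates = [best] + [propose(best, rng, edits)
                               for _ in range(n_candidates)]
        # Score every candidate on the same batch so the comparison is fair.
        best = max(candidates, key=lambda p: evaluate(p, batch))
    return best


dataset = ["cite the manual", "be concise", "ask for the amount", "no guarantees"]
seed_prompt = "You are a bank assistant."
result = optimize_prompt(seed_prompt, dataset, edits=dataset)
```

Proposals are random rather than gradient-guided, so the outcome depends on the seed and on which samples land in each batch. Scoring on a batch instead of a single failing example is what keeps a candidate from being accepted just because it fixes one case.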

Cost-performance trade-offs for post-training methods were measured:

SFT (Supervised Fine-Tuning): low cost, mature toolchain, suited to structured tasks.

RL pre-inference: high cost; the "canonical" RL setup, best for explainability.

RL post-inference: moderate cost, more cost-effective for reasoning generation.

Model architecture insights:

MoE models show higher throughput when the number of task cards exceeds the number of models; deploying multiple INT8-quantized merged models can outperform a single dense model serving several LoRA adapters.

In Chinese‑government‑approved hardware environments, vLLM compatibility issues must be considered.

Emotion‑Value Evaluation

In fintech, perceived emotional value often precedes measurable business impact. The authors built a psychology‑based evaluator to prioritize "looks‑right" emotional responses before optimizing hard metrics like conversion rate.

4. Conclusion

The authors argue that combining a general LLM with high‑quality knowledge engineering yields more commercial value than training a domain‑specific LLM from scratch. Small models (14B‑30B) equipped with curated knowledge can surpass the zero‑shot capability of larger general models. The key is breaking data silos, rebuilding a knowledge engine, and moving from generic inference to truly vertical financial applications.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
