FactorMiner: Tsinghua’s Self‑Evolving Agent with Skill and Experience Memory for Alpha Factor Mining
FactorMiner is a lightweight, flexible self‑evolving agent framework that combines a modular skill architecture with structured experience memory. Guided by a Ralph loop, it steers the search, reduces redundancy, and builds a diverse, high‑quality alpha factor library that outperforms baselines across A‑share and cryptocurrency markets, with GPU‑accelerated factor evaluation making large‑scale iteration practical.
Background
Alpha factor mining is a core task in quantitative trading and portfolio construction. In practice it faces three fundamental challenges: (1) high search complexity, since the space of symbolic expressions grows combinatorially with operators and parameters; (2) poor knowledge accumulation, since traditional search methods such as genetic programming or reinforcement learning cannot retain or reuse insights, leading to repeated experiments; and (3) interpretability constraints, since financial practitioners require transparent, auditable formulas with clear financial logic for compliance and risk management.
Problem Definition
The task is defined as follows: given a market data tensor D, an alpha factor α is a program composed of operators from a set Ω that maps market states to a cross‑sectional prediction signal s_t. Factor effectiveness is quantified by the Information Coefficient (IC) and its Information Ratio (ICIR); redundancy between factors is measured by time‑averaged cross‑sectional Spearman correlation. The goal is to construct a diversified factor library L = {α_1,…,α_K} that maximizes aggregate predictive quality while satisfying a global redundancy constraint.
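The evaluation quantities above are standard and can be made concrete in a few lines. This is a minimal sketch, not the paper's code: IC is the per‑period cross‑sectional Spearman correlation between factor values and forward returns, ICIR is the mean of that series over its standard deviation, and redundancy is the time‑averaged Spearman correlation between two factors. The synthetic data and the 0.3 signal strength are purely illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

def information_coefficient(factor, fwd_ret):
    """Per-period cross-sectional Spearman IC; both inputs are (T, N):
    T time periods, N assets."""
    return np.array([spearmanr(factor[t], fwd_ret[t])[0]
                     for t in range(factor.shape[0])])

def icir(ic_series):
    """Information Ratio of the IC series: mean over standard deviation."""
    return ic_series.mean() / ic_series.std(ddof=1)

def redundancy(f1, f2):
    """Time-averaged cross-sectional Spearman correlation between two factors."""
    return np.mean([spearmanr(f1[t], f2[t])[0] for t in range(f1.shape[0])])

rng = np.random.default_rng(0)
T, N = 50, 100
ret = rng.normal(size=(T, N))                    # forward returns
alpha = 0.3 * ret + rng.normal(size=(T, N))      # a factor with real signal
ic = information_coefficient(alpha, ret)
print(f"mean IC={ic.mean():.3f}, ICIR={icir(ic):.3f}")
```

A library‑level redundancy constraint then amounts to bounding `redundancy(α_i, α_j)` for every admitted pair.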
Method
3.1 Factor Mining Skill Architecture
Operator layer: Over 60 carefully selected financial operators are implemented with a GPU‑accelerated backend, ensuring that proposed symbolic expressions are executable and computationally efficient.
Verification pipeline: A strict, standardized factor evaluation protocol—including IC filtering, correlation checks, and admission criteria—decouples generation from validation, preventing “computational hallucination,” improving portability, and allowing independent optimization of each component.
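The decoupling of generation from validation can be sketched as a pure function that takes candidate factor values and either admits or rejects them. The thresholds (`IC_MIN`, `CORR_MAX`) and the exact ordering of checks here are assumptions for illustration, not the paper's admission criteria; the structure (executability check, IC filter, correlation check against the library) follows the pipeline described above.

```python
import numpy as np
from scipy.stats import spearmanr

IC_MIN, CORR_MAX = 0.03, 0.7   # illustrative thresholds, not the paper's

def per_period_ic(factor, fwd_ret):
    return np.array([spearmanr(factor[t], fwd_ret[t])[0]
                     for t in range(len(factor))])

def verify(candidate, fwd_ret, library):
    """Return (admitted, reason). `candidate` is a (T, N) array of factor
    values; `library` is a list of already-admitted (T, N) arrays."""
    # Executability check: a generated expression that produces NaN/inf
    # is a "computational hallucination" and is rejected outright.
    if not np.isfinite(candidate).all():
        return False, "non-finite values"
    # IC filter.
    ic = per_period_ic(candidate, fwd_ret)
    if abs(ic.mean()) < IC_MIN:
        return False, f"|IC|={abs(ic.mean()):.3f} below threshold"
    # Correlation (diversity) check against every library member.
    for i, f in enumerate(library):
        corr = np.mean([spearmanr(candidate[t], f[t])[0]
                        for t in range(len(f))])
        if abs(corr) > CORR_MAX:
            return False, f"too correlated with library factor {i}"
    return True, "admitted"
```

Because `verify` never touches the generator, either side can be swapped out or optimized independently, which is the portability point made above.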
3.2 Experience Memory
Memory formation: At the end of each mining batch, the mining trajectory τ_t is analysed to extract successful patterns P_{succ} and prohibited regions P_{fail}.
Memory evolution: Candidate memories are merged into the existing knowledge base, redundant entries are consolidated, and low‑utility information is discarded.
Memory retrieval: During factor generation, a retrieval operator R fetches context‑relevant memory signals m_t, which are used as prompt‑level constraints for the LLM policy, shaping the sampling distribution π(α|m_t). The memory contains mining state, structural experience, and strategic experience, guiding the agent.
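The three memory operations (formation, evolution, retrieval) can be condensed into a small store. This is a minimal sketch under stated assumptions: the success threshold, the utility score (best IC seen for a pattern), and the prompt format are all invented here; the paper's memory additionally tracks mining state and strategic experience, which this sketch omits.

```python
from dataclasses import dataclass, field

@dataclass
class ExperienceMemory:
    succ: dict = field(default_factory=dict)   # pattern -> utility (best IC)
    fail: set = field(default_factory=set)     # prohibited structural regions

    def form(self, trajectory):
        """Memory formation: extract (pattern, ic) outcomes from one batch,
        splitting them into P_succ and P_fail."""
        for pattern, ic in trajectory:
            if ic >= 0.03:                     # assumed success threshold
                self.succ[pattern] = max(self.succ.get(pattern, 0.0), ic)
            else:
                self.fail.add(pattern)

    def evolve(self, max_entries=100):
        """Memory evolution: consolidate and drop low-utility entries."""
        ranked = sorted(self.succ.items(), key=lambda kv: -kv[1])
        self.succ = dict(ranked[:max_entries])

    def retrieve(self, k=3):
        """Memory retrieval: build the prompt-level constraint m_t that
        shapes the LLM sampling distribution pi(alpha | m_t)."""
        top = [p for p, _ in sorted(self.succ.items(),
                                    key=lambda kv: -kv[1])[:k]]
        return ("Promising patterns: " + "; ".join(top) + ". "
                "Avoid: " + "; ".join(sorted(self.fail)) + ".")
```

In use, `retrieve()` returns a text constraint that is prepended to the generation prompt, so the policy is steered without any gradient updates.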
3.3 Ralph Loop – Self‑Evolving Factor Discovery
Global‑library view: Each candidate factor is evaluated for how it complements the current library L. A correlation constraint enforces diversity, and a replacement mechanism allows high‑quality factors to replace inferior ones.
Memory‑guided exploration: By maintaining P_{succ} and P_{fail}, the loop avoids redundant exploration of known failure zones and focuses on promising structural patterns.
Multi‑stage evaluation: A two‑stage pipeline first performs rapid screening on a small asset subset, then conducts full validation with correlation constraints.
Self‑evolution: After each iteration, an evolution operator E updates the memory, forming a feedback loop that continuously improves the search strategy.
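One iteration of the loop described above can be sketched end to end: a stubbed generator stands in for the LLM policy, stage‑1 screening runs on a small asset subset, stage‑2 validation applies the correlation constraint, and a replacement rule keeps the library at fixed size. The thresholds, library size, and tuple layout are illustrative assumptions, not the paper's values.

```python
import numpy as np
from scipy.stats import spearmanr

def mean_ic(factor, ret):
    return np.mean([spearmanr(factor[t], ret[t])[0] for t in range(len(ret))])

def ralph_step(generate, ret, library, max_lib=5, corr_max=0.7,
               screen_assets=20):
    """One Ralph-loop iteration over library entries (name, values, ic)."""
    cand = generate()                       # LLM policy sample (stubbed)
    # Stage 1: rapid screening on a small asset subset.
    if abs(mean_ic(cand[:, :screen_assets], ret[:, :screen_assets])) < 0.02:
        return library, "screened out"
    # Stage 2: full validation with the correlation (diversity) constraint.
    ic = mean_ic(cand, ret)
    for _, f, _ in library:
        c = np.mean([spearmanr(cand[t], f[t])[0] for t in range(len(ret))])
        if abs(c) > corr_max:
            return library, "redundant"
    # Replacement: admit, evicting the weakest factor if the library is full.
    library.append(("cand", cand, ic))
    library.sort(key=lambda entry: -abs(entry[2]))
    return library[:max_lib], "admitted"

rng = np.random.default_rng(2)
ret = rng.normal(size=(30, 60))
lib, status = ralph_step(lambda: 0.5 * ret + rng.normal(size=(30, 60)),
                         ret, [])
print(status, len(lib))
```

The memory update (operator E) would run after each `ralph_step`, feeding the outcome back into the store that conditions the next `generate` call.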
Experiments
4.1 Experimental Setup
Datasets: A‑share CSI‑500, CSI‑1000, HS300 constituents (10‑minute intraday) and Binance’s 64 major crypto assets (10‑minute). Training covers Q1‑Q4 2024; testing uses 2025 data. The prediction target is the next 10‑minute open‑to‑close price‑change ratio.
Baselines: Five representative methods—Alpha101 (classic), Alpha101‑adapt, Random Formula (RF), GPLearn, and AlphaAgent—share the same operator library Ω, data fields, and evaluation/admission protocol.
4.2 Main Results
Factor quality & diversity: Under the strict protocol, FactorMiner outperforms all baselines on all four markets. The selected factor set exhibits moderate pairwise dependence, with average absolute correlation 0.30‑0.31 for A‑share and 0.25 for crypto, indicating that performance gains are not driven by near‑duplicate signals.
Cross‑market robustness: Factors discovered on A‑share retain competitive performance on crypto markets, suggesting that the framework captures fundamental price‑volume dynamics that generalize across asset classes.
Integration with downstream learners: Adding a downstream learning model on top of the mined factors yields noticeable improvement for most baselines, but provides limited extra gain for FactorMiner, implying that the raw factor set already captures most of the usable predictive information.
Impact of experience memory: An ablation study comparing FactorMiner with a memory‑less variant shows that experience memory improves both productive search (higher hit‑rate) and diversity (lower redundancy), effectively guiding exploration and filtering.
Mining efficiency: Benchmarking three execution back‑ends—standard Python, compiled C, and GPU—demonstrates that the GPU backend delivers significant speed‑ups at both operator‑level and factor‑level evaluation, making large‑scale iterative mining feasible.
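The paper's GPU backend is not reproduced here, but the operator‑level speed‑up has a familiar shape: a time‑series operator applied per asset in a Python loop versus the same operator expressed as one pass over the whole (T, N) tensor. The sketch below contrasts the two for a rolling mean (`ts_mean` is an assumed operator name); the batched formulation is exactly what maps well onto a GPU.

```python
import numpy as np

def ts_mean_python(x, window):
    """Rolling mean per asset, pure Python loops: O(T * N * window)."""
    T, N = len(x), len(x[0])
    out = [[float("nan")] * N for _ in range(T)]
    for t in range(window - 1, T):
        for n in range(N):
            out[t][n] = sum(x[s][n] for s in range(t - window + 1, t + 1)) / window
    return out

def ts_mean_numpy(x, window):
    """Same operator via cumulative sums: one vectorized pass over (T, N)."""
    c = np.cumsum(np.vstack([np.zeros((1, x.shape[1])), x]), axis=0)
    out = np.full_like(x, np.nan, dtype=float)
    out[window - 1:] = (c[window:] - c[:-window]) / window
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 50))
ref = np.array(ts_mean_python(x.tolist(), 20))
vec = ts_mean_numpy(x, 20)
print(np.allclose(ref[19:], vec[19:]))  # same values, very different cost
```

Swapping `np` for a GPU array library with the same interface (e.g. CuPy) turns the vectorized version into a GPU kernel with no algorithmic change, which is why a shared operator library Ω can back all methods in the comparison.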
4.3 Discussion
FactorMiner not only achieves strong predictive performance but also produces a curated library of 110 interpretable high‑frequency alpha factors together with a standardized evaluation protocol, providing a reproducible tool for hypothesis‑driven analysis of market microstructure. The experience‑memory component acts as a continual‑learning mechanism, accumulating institutional knowledge across mining sessions and enabling meta‑learning.