Bighead's Algorithm Notes
Apr 20, 2026 · Artificial Intelligence

Exploring CSMD: A China‑Specific Multimodal Stock Dataset and the LightQuant Quantitative Framework

The article introduces CSMD, a high‑quality multimodal dataset built from Chinese financial news for CSI‑300 and SSE‑50 stocks, and describes its LLM‑enhanced factor extraction and rigorous data validation. It then presents the modular LightQuant framework and reports extensive experiments in which CSMD and LightQuant outperform existing resources such as CMIN‑CN in stock trend prediction and backtesting.

CSMD · LLM factor extraction · LightQuant
12 min read