How MetaTrader Uses Reinforcement Learning to Boost Trading Strategy Generalization
This article reviews MetaTrader, a method that formulates sequential portfolio optimization as a partially offline reinforcement-learning (RL) problem. To improve out-of-distribution generalization, it introduces a double-layer RL algorithm and a conservative temporal-difference (TD) objective. On the CSI-300 and NASDAQ-100 datasets, the method outperforms existing baselines.
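
To make the idea of a conservative TD objective concrete, here is a minimal Python sketch. It is not MetaTrader's actual objective, whose exact form this summary does not give. It assumes one common way of being conservative: bootstrapping from the most pessimistic of several next-state value estimates. The function names `conservative_td_target` and `td_loss`, and the discount `gamma`, are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def conservative_td_target(
    reward: float,
    next_q_estimates: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return a TD target that bootstraps from the lowest next-state estimate.

    Assumption: "conservative" is modeled here as taking the minimum over
    several candidate Q-value estimates, which limits over-optimistic
    bootstrapping on states the training data covers poorly.
    """
    if not next_q_estimates:
        raise ValueError("need at least one next-state Q estimate")
    if done:
        # Terminal step: nothing to bootstrap from.
        return reward
    return reward + gamma * min(next_q_estimates)


def td_loss(q_value: float, target: float) -> float:
    """Squared TD error between the current Q estimate and the target."""
    return 0.5 * (q_value - target) ** 2
```

Taking the minimum rather than the mean biases the target downward, which is the usual trade-off when actions or states fall outside the distribution seen in offline data.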
