Which Loss Function Ranks Stocks Best? An Empirical Study with Transformer Models
This paper evaluates point‑wise, pair‑wise, and list‑wise loss functions for Transformer‑based stock‑return prediction on 110 S&P 500 stocks. Margin loss achieves the highest annualized return (16.23%) and Sharpe ratio (0.75), ListNet delivers comparable returns with the lowest volatility, and BPR yields the smallest maximum drawdown. Together, the results show how strongly loss design shapes ranking‑driven portfolio performance.
Background
Quantitative trading relies on accurate stock ranking to allocate capital, and while traditional statistical models like ARIMA have long been used, Transformer architectures excel at capturing long‑range dependencies in financial time series. However, the impact of different training loss functions on a Transformer's ability to produce profitable rankings remains unclear.
Problem Definition
The study assesses how different loss functions affect a Transformer's learning of stock‑return patterns and the portfolio decisions that follow from it. The universe is 110 S&P 500 stocks: the ten largest by market capitalization in each of the 11 GICS sectors. The model predicts each stock's daily return, the stocks are ranked by predicted return, and the top k (k=5) form an equal‑weight long‑only portfolio.
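The ranking-to-portfolio step can be sketched in a few lines. This is a minimal illustration rather than the paper's code: the function name, the dictionary layout, and the ticker labels are hypothetical, and transaction costs are ignored.

```python
def top_k_portfolio_return(predicted, realized, k=5):
    """Pick the k stocks with the highest predicted return and
    return (chosen tickers, equal-weight realized return)."""
    if set(predicted) != set(realized):
        raise ValueError("predicted and realized must cover the same tickers")
    # Rank tickers by predicted return, highest first.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    chosen = ranked[:k]
    # Equal weights: the portfolio return is the plain average.
    portfolio_return = sum(realized[t] for t in chosen) / len(chosen)
    return chosen, portfolio_return


# Hypothetical one-day example with four stocks and k=2.
predicted = {"AAA": 0.01, "BBB": 0.03, "CCC": -0.02, "DDD": 0.02}
realized = {"AAA": 0.00, "BBB": 0.02, "CCC": 0.01, "DDD": -0.01}
chosen, day_return = top_k_portfolio_return(predicted, realized, k=2)
```

Only the ordering of the predictions matters here, which is why a loss that improves ranking can help the portfolio even when point-wise accuracy does not improve.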
Method
3.1 Model Architecture
The PortfolioMASTER model alternates temporal self‑attention (processing each stock’s history independently) with spatial self‑attention (modeling inter‑stock relationships at each timestep). Input features (daily return and turnover) over a 20‑day look‑back window are projected to dimension D, enriched with positional encodings, and passed through a stack of encoder layers. A final attention‑based aggregation yields one representation per stock, which is used to predict that stock's next‑day return.
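The difference between the two attention axes can be made concrete with a toy sketch. It is not the PortfolioMASTER implementation: it assumes a single head with Q = K = V (no learned projections), and it omits positional encodings, residual connections, and the final aggregation. All function names are hypothetical.

```python
import math


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


def temporal_attention(x):
    """x[stock][day] is a feature vector; each stock attends over its own days."""
    return [self_attention(history) for history in x]


def spatial_attention(x):
    """On each day, every stock attends over all stocks on that same day."""
    n_stocks, n_days = len(x), len(x[0])
    out = [[None] * n_days for _ in range(n_stocks)]
    for t in range(n_days):
        mixed = self_attention([x[s][t] for s in range(n_stocks)])
        for s in range(n_stocks):
            out[s][t] = mixed[s]
    return out


def encoder_layer(x):
    """One temporal pass followed by one spatial pass."""
    return spatial_attention(temporal_attention(x))


# Toy input: 3 stocks, 4 days, 2 features per day.
x = [[[0.1 * s + 0.01 * t, 0.2 - 0.05 * t] for t in range(4)] for s in range(3)]
y = encoder_layer(x)
```

The output keeps the stocks × days × features shape, so layers of this kind can be stacked. Information crosses between stocks only in the spatial pass.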
3.2 Loss Functions
The paper evaluates loss functions grouped into point‑wise, point‑wise + pair‑wise, and list‑wise categories:
Point‑wise: Mean Squared Error (MSE) as baseline.
Point‑wise + pair‑wise: MSE combined with a pairwise component L_{PairwiseComponent} weighted by λ. The pairwise components evaluated are Hinge loss, Margin loss (with margin m), Bayesian Personalized Ranking (BPR), RankNet (with scaling α), and weighted Hinge variants (WHR1/WHR2).
List‑wise: ListNet loss, which converts true scores and predictions into probability distributions using a temperature parameter τ.
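To make the three categories concrete, the sketch below implements MSE, two pairwise components (Margin and BPR), and ListNet in their common textbook forms. The paper's exact formulations may differ: the additive combination MSE + λ · component, the averaging over pairs, and the default values of m, λ, and τ are assumptions made for illustration.

```python
import math


def mse(pred, true):
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def _ordered_pairs(true):
    """Index pairs (i, j) where stock i truly outperformed stock j."""
    n = len(true)
    return [(i, j) for i in range(n) for j in range(n) if true[i] > true[j]]


def margin_component(pred, true, m=0.01):
    """Penalize pairs whose predicted gap falls short of the margin m."""
    pairs = _ordered_pairs(true)
    if not pairs:
        return 0.0
    return sum(max(0.0, m - (pred[i] - pred[j])) for i, j in pairs) / len(pairs)


def bpr_component(pred, true):
    """BPR: -log(sigmoid(gap)), written as log(1 + exp(-gap))."""
    pairs = _ordered_pairs(true)
    if not pairs:
        return 0.0
    return sum(math.log1p(math.exp(-(pred[i] - pred[j]))) for i, j in pairs) / len(pairs)


def combined_loss(pred, true, component, lam=1.0):
    """Point-wise + pair-wise: MSE plus lambda times a pairwise component."""
    return mse(pred, true) + lam * component(pred, true)


def _softmax(scores, tau):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / tau) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(pred, true, tau=1.0):
    """Cross-entropy between softmax(true / tau) and softmax(pred / tau)."""
    p_true = _softmax(true, tau)
    p_pred = _softmax(pred, tau)
    return -sum(pt * math.log(pp) for pt, pp in zip(p_true, p_pred))


# Hypothetical cross-section of three stocks on one day.
true = [0.02, -0.01, 0.00]
well_ordered = [0.03, -0.02, 0.00]   # same ordering as the true returns
mis_ordered = [-0.02, 0.03, 0.00]    # best and worst stocks swapped
```

On this toy cross-section every ranking-oriented loss is lower for `well_ordered` than for `mis_ordered`. The Margin component is exactly zero once every correctly ordered pair is separated by at least m.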
Dataset and Features
The data span 2015‑01‑03 to 2024‑12‑03 and cover daily returns and turnover for the selected 110 stocks. Features are normalized per stock using a scaler fitted on the training set only.
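A minimal sketch of per-stock normalization follows. The source does not say which scaler is used, so z-scoring (subtract the mean, divide by the standard deviation) is an assumption, as are the function names and the ticker labels. The point it illustrates is that statistics come from the training period and are then reused unchanged on later data.

```python
import statistics


def fit_scalers(train_series):
    """Fit one (mean, std) pair per stock from training-period values only."""
    scalers = {}
    for ticker, values in train_series.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        # Guard against constant series to avoid division by zero.
        scalers[ticker] = (mean, std if std > 0 else 1.0)
    return scalers


def transform(series, scalers):
    """Apply the training-period statistics to any split."""
    out = {}
    for ticker, values in series.items():
        mean, std = scalers[ticker]
        out[ticker] = [(v - mean) / std for v in values]
    return out


# Hypothetical feature values for two stocks.
train = {"AAA": [1.0, 2.0, 3.0], "BBB": [10.0, 10.0, 10.0]}
scalers = fit_scalers(train)
test_scaled = transform({"AAA": [2.0], "BBB": [12.0]}, scalers)
```

Fitting on the training set alone keeps validation and test statistics out of the features, which matters for a chronological evaluation.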
Training and Evaluation
Data are split chronologically: 70% training, 15% validation, 15% test. Models are trained for up to 50 epochs with the AdamW optimizer, weight decay, early stopping on validation loss, and learning‑rate scheduling. Hyper‑parameters (including dropout, model dimensions, learning rate, and the loss‑specific parameters λ, m, α, τ) are tuned via grid search separately for each loss.
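The chronological split can be sketched as below. The function name is hypothetical, and rounding the boundaries to the nearest index is an assumption, since the source gives only the 70/15/15 proportions.

```python
def chronological_split(dates, train_frac=0.70, val_frac=0.15):
    """Split an already time-ordered sequence into train/validation/test
    without shuffling, so every test date follows every training date."""
    if not 0 < train_frac < 1 or not 0 < val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    n = len(dates)
    train_end = round(n * train_frac)
    val_end = round(n * (train_frac + val_frac))
    return dates[:train_end], dates[train_end:val_end], dates[val_end:]


# Hypothetical example: 100 trading days indexed 0..99.
train_days, val_days, test_days = chronological_split(list(range(100)))
```

Unlike a random split, this keeps future information out of training, at the cost of testing on a single contiguous market regime.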
Portfolio Simulation and Metrics
Daily rebalancing constructs equal‑weight long‑only portfolios of the top 5 ranked stocks. Performance is measured by Cumulative Return (CR), Annualized Return (AR), Annualized Volatility (AV), Sharpe Ratio (SR, risk‑free rate 4.3%), and Maximum Drawdown (MDD). Prediction quality is assessed by Information Coefficient (IC), ICIR, and Precision@5 (P@5), with test‑set MSE also reported.
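The portfolio metrics can be computed from a series of daily portfolio returns as sketched below. The source does not give the annualization conventions, so 252 trading days per year, geometric annualization of the return, and SR = (AR − risk‑free rate) / AV are assumptions. The last one does agree with the reported ListNet figures up to rounding: (16.00 − 4.3) / 15.79 ≈ 0.741 against a reported 0.7407.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def portfolio_metrics(daily_returns, risk_free=0.043):
    """Return CR, AR, AV, SR, and MDD for a list of daily simple returns."""
    n = len(daily_returns)
    if n < 2:
        raise ValueError("need at least two daily returns")
    # Compound the daily returns into a wealth curve starting at 1.0.
    wealth, curve = 1.0, []
    for r in daily_returns:
        wealth *= 1.0 + r
        curve.append(wealth)
    cr = wealth - 1.0
    ar = wealth ** (TRADING_DAYS / n) - 1.0
    mean = sum(daily_returns) / n
    var = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    av = math.sqrt(var) * math.sqrt(TRADING_DAYS)
    sr = (ar - risk_free) / av if av > 0 else float("nan")
    # Maximum drawdown: the worst peak-to-trough decline, as a negative number.
    peak, mdd = 1.0, 0.0
    for w in curve:
        peak = max(peak, w)
        mdd = min(mdd, w / peak - 1.0)
    return {"CR": cr, "AR": ar, "AV": av, "SR": sr, "MDD": mdd}


# Hypothetical three-day return series, exaggerated for readability.
metrics = portfolio_metrics([0.10, -0.50, 0.20])
```

In this toy series the wealth curve runs 1.10, 0.55, 0.66, so the cumulative return is −34% and the maximum drawdown is −50%.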
Experimental Results
4.4.1 Portfolio Performance Analysis
Margin loss achieves the highest AR (16.23%) and SR (0.7529). ListNet follows closely with AR = 16.00% and SR = 0.7407, while also yielding the lowest AV (15.79%). BPR produces the smallest MDD (‑15.77%), indicating better risk control despite a slightly lower SR (0.7200). The MSE baseline is outperformed by all ranking‑oriented losses in risk‑adjusted returns.
4.4.2 Prediction Quality vs. Portfolio Results
IC values are similar across losses (0.073–0.077) and P@5 remains around 0.358–0.359. RankNet attains the highest IC (0.0767) but its portfolio AR and SR are only moderate. Conversely, Margin and ListNet deliver superior portfolio metrics without markedly higher IC, suggesting that loss design influences how ranking errors are penalized and thus impacts downstream profitability.
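For reference, IC and ICIR are commonly computed as sketched below. The source does not state which correlation it uses, so the Spearman (rank) version is an assumption, and the simple ranking helper assumes there are no tied values. P@5 is left out because its exact definition is not given in the text.

```python
import statistics


def _ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def _pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math_sqrt(sum((a - mx) ** 2 for a in x))
    sy = math_sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def math_sqrt(value):
    return value ** 0.5


def daily_ic(predicted, realized):
    """Rank correlation between predicted and realized returns on one day."""
    return _pearson(_ranks(predicted), _ranks(realized))


def icir(daily_ics):
    """Mean daily IC divided by its sample standard deviation."""
    return statistics.fmean(daily_ics) / statistics.stdev(daily_ics)


# Hypothetical values: a perfectly ordered day and three daily ICs.
perfect = daily_ic([0.01, 0.02, 0.03], [0.10, 0.20, 0.30])
ratio = icir([0.10, 0.05, 0.09])
```

IC weights every stock in the cross-section equally, while the portfolio holds only the top five. That difference is one way similar IC values can coexist with different portfolio results.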
4.4.3 Impact of Loss Design
Pairwise losses that explicitly model preferences between stocks (Margin, BPR) prove effective: the margin term in Margin loss encourages confident separation of top stocks, while BPR’s focus on correctly ordering preferred items is associated with smaller drawdowns. ListNet’s list‑wise optimization captures global ranking patterns that benefit portfolio construction, even though its test‑set MSE is higher because it does not directly optimize point‑wise return accuracy.
Conclusion
The choice of loss function substantially affects both ranking quality and portfolio performance when using Transformers for stock return prediction. Ranking‑oriented losses, especially Margin and ListNet, outperform plain MSE, and incorporating pairwise preferences can improve risk characteristics.
