Boost Kaggle Scores: Metric Optimization Tricks for Multi‑Class F1 and Spearman
This article reveals practical tricks for dramatically improving Kaggle competition metrics—such as multi‑class F1 and Spearman correlation—by applying weight scaling, non‑linear threshold optimization, and custom rounding techniques, complete with Python code examples and step‑by‑step guidance.
Background: KDD2019 and the power of metric tricks
The 2019 KDD competition highlighted how a simple metric‑optimization insight can yield score jumps larger than switching from LSTM to BERT, making metric tuning a critical skill for Kaggle participants.
Multi‑class F1 optimization
When class distribution is imbalanced, optimizing cross‑entropy alone does not guarantee the best F1. By introducing class‑specific scaling weights w and searching for the weight vector that maximizes the average F1, practitioners can achieve noticeable gains (e.g., from 0.726 to 0.738 in a sentiment‑analysis contest). Formally, the search is for the w that maximizes F1 when predictions are taken as argmax over classes of w ⊙ logits; because this objective is non‑differentiable, it is solved with gradient‑free non‑linear optimizers from scipy (e.g., Nelder–Mead).
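A minimal sketch of the weight‑scaling idea, using synthetic data (the labels, logits, and noise scale below are illustrative assumptions, not from the competition). Nelder–Mead searches for per‑class weights that maximize macro F1 of the rescaled argmax:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_classes = 3
# imbalanced synthetic labels and noisy "logits" (illustrative only)
y = rng.choice(n_classes, size=1000, p=[0.7, 0.2, 0.1])
logits = np.eye(n_classes)[y] + rng.normal(scale=0.8, size=(1000, n_classes))

def neg_f1(w, logits, y):
    # rescale each class's logit by w, take argmax, score macro F1
    preds = np.argmax(logits * w, axis=1)
    return -f1_score(y, preds, average="macro")

res = minimize(neg_f1, x0=np.ones(n_classes), args=(logits, y),
               method="Nelder-Mead")
base = f1_score(y, logits.argmax(axis=1), average="macro")
tuned = -res.fun
print(f"macro F1: {base:.3f} -> {tuned:.3f}")
```

In practice the search should be run on out‑of‑fold predictions rather than the training set, to avoid overfitting the weights to one split.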
from functools import partial

import numpy as np
import scipy.optimize
from sklearn.metrics import cohen_kappa_score


def quadratic_weighted_kappa(y_true, y_pred):
    # quadratic weighted kappa via scikit-learn's cohen_kappa_score
    return cohen_kappa_score(y_true, y_pred, weights='quadratic')


class OptimizedRounder(object):
    def __init__(self):
        self.coef_ = 0

    def _kappa_loss(self, coef, X, y):
        # bin continuous predictions into labels 0-4 using thresholds in coef
        X_p = np.copy(X)
        for i, pred in enumerate(X_p):
            if pred < coef[0]:
                X_p[i] = 0
            elif pred < coef[1]:
                X_p[i] = 1
            elif pred < coef[2]:
                X_p[i] = 2
            elif pred < coef[3]:
                X_p[i] = 3
            else:
                X_p[i] = 4
        ll = quadratic_weighted_kappa(y, X_p)
        return -ll  # minimize the negative kappa

    def fit(self, X, y):
        loss_partial = partial(self._kappa_loss, X=X, y=y)
        initial_coef = [0.5, 1.5, 2.5, 3.5]
        self.coef_ = scipy.optimize.minimize(loss_partial, initial_coef,
                                             method='nelder-mead')

    def predict(self, X, coef):
        X_p = np.copy(X)
        for i, pred in enumerate(X_p):
            if pred < coef[0]:
                X_p[i] = 0
            elif pred < coef[1]:
                X_p[i] = 1
            elif pred < coef[2]:
                X_p[i] = 2
            elif pred < coef[3]:
                X_p[i] = 3
            else:
                X_p[i] = 4
        return X_p

    def coefficients(self):
        return self.coef_['x']
Ordered discrete label optimization (e.g., sentiment score 1‑5)
For tasks where the target is an ordered set of discrete values, such as a 1‑5 sentiment rating, the same non‑linear optimization framework can be applied. By treating the rating thresholds as learnable coefficients, the model learns optimal cut‑points that respect the ordinal nature of the labels, improving metrics like quadratic weighted kappa.
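The idea can be shown end to end in a compact, self‑contained sketch (the synthetic labels and noise below are illustrative assumptions). It mirrors OptimizedRounder's logic, but uses np.digitize for the binning and compares learned cut‑points against naive rounding:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(42)
y = rng.integers(0, 5, size=500)               # true ordinal labels 0-4
raw = y + rng.normal(scale=0.6, size=y.shape)  # continuous model outputs

def neg_qwk(coef, raw, y):
    # thresholds -> bins 0..4, scored by negative quadratic weighted kappa
    binned = np.digitize(raw, np.sort(coef))
    return -cohen_kappa_score(y, binned, weights="quadratic")

res = minimize(neg_qwk, x0=[0.5, 1.5, 2.5, 3.5], args=(raw, y),
               method="Nelder-Mead")
naive = cohen_kappa_score(y, np.clip(np.round(raw), 0, 4).astype(int),
                          weights="quadratic")
tuned = -res.fun
print(f"QWK: naive rounding {naive:.3f} -> learned thresholds {tuned:.3f}")
```

For a 1‑5 rating scale the same code applies with five bins and four thresholds; only the initial cut‑points change.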
# The same OptimizedRounder class can be reused with different initial_coef
# values tailored to the number of rating levels.
Spearman correlation optimization for multi‑label classification
In the Google QUEST Q&A Labeling competition, the evaluation metric is Spearman’s rank correlation, which measures ordering consistency. Because predictions are continuous logits that rarely match the limited set of discrete annotation values, discretizing predictions via learned thresholds dramatically improves the Spearman score.
import numpy as np
from scipy.stats import spearmanr

a = np.array([0.5, 0.5, 0.7, 0.7])
b = np.array([4., 5., 6., 7.])
print(spearmanr(a, b)[0])   # -> 0.89
b2 = np.array([4., 4., 6., 6.])
print(spearmanr(a, b2)[0])  # -> 1.0
By applying the threshold‑search technique described above to map logits onto the limited annotation set, participants can avoid large penalties and achieve higher Spearman correlations.
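A toy illustration of why discretization helps (the data and the fixed 0.5 threshold below are hypothetical; in practice the threshold would be searched, as in the optimizer above). When targets take only a few discrete values, snapping predictions onto those values removes spurious orderings inside tied groups:

```python
import numpy as np
from scipy.stats import spearmanr

y_true = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0])       # two annotation levels
preds = np.array([0.30, 0.35, 0.55, 0.60, 0.58, 0.33])  # continuous scores

raw_rho = spearmanr(y_true, preds)[0]
# discretize with a single threshold (fixed at 0.5 here for illustration)
snapped = (preds >= 0.5).astype(float)
snap_rho = spearmanr(y_true, snapped)[0]
print(f"Spearman: {raw_rho:.3f} -> {snap_rho:.3f}")
```

The raw predictions rank the tied targets in an arbitrary order and pay for it; the snapped predictions reproduce the annotation levels exactly and reach a perfect correlation.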
Baobao Algorithm Notes
Author of the BaiMian large model, offering technology and industry insights.
