From Low‑Resource Large Model Training to Dynamic Margin Selection: A JD Engineer’s Journey
The article recounts a JD Retail engineer’s rapid growth through tackling low‑resource large‑model training, developing a margin‑based dynamic data selection method (DynaMS) that led to an ICLR paper, and sharing practical lessons on aligning business needs with cutting‑edge AI research.
Jia Xing, a 2021 JD DMT PhD management trainee who graduated from the Institute of Automation, Chinese Academy of Sciences, joined JD Retail’s technology team and immediately focused on the challenging problem of low‑resource large‑model training and scalable deployment.
He published four top‑conference papers, filed ten patents, and was recognized as an outstanding talent in Beijing. His story is presented to inspire others.
01. From Behind the Scenes to the Front Line – Expanding Technical Space
Technical staff must balance rapid business delivery with long‑term innovation. By digging deep into the business logic, Jia Xing identified a high‑volume “same‑product” identification task in which manually reviewing the roughly 10⁴ pairwise comparisons generated by every 100 items was infeasible. He built an automatic pre‑screening filter by adapting a Llama‑7b model into a MultiChoice classifier that outputs only “yes” or “no”. High‑confidence predictions are accepted automatically, while low‑confidence cases are routed to human review, cutting manual effort by more than 50%.
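The routing logic above can be sketched as a confidence‑gated filter. This is a minimal illustration, not the production system: `score_pair` is a hypothetical stand‑in (a crude word‑overlap proxy) for the fine‑tuned Llama‑7b MultiChoice classifier, and the threshold values are assumptions.

```python
def score_pair(pair):
    # Placeholder scorer: a real system would query the fine-tuned
    # Llama-7b MultiChoice classifier and return P("yes").
    # Here we use crude Jaccard word overlap as a stand-in.
    title_a, title_b = pair
    words_a, words_b = set(title_a.split()), set(title_b.split())
    return len(words_a & words_b) / len(words_a | words_b)

def route(pairs, accept=0.9, reject=0.1):
    """Split candidate pairs into auto-decided and human-review buckets.

    High-confidence "yes" (>= accept) and "no" (<= reject) predictions
    are accepted automatically; everything in between goes to reviewers.
    """
    auto, review = [], []
    for pair in pairs:
        p_yes = score_pair(pair)
        if p_yes >= accept:
            auto.append((pair, "same"))
        elif p_yes <= reject:
            auto.append((pair, "different"))
        else:
            review.append(pair)
    return auto, review

pairs = [
    ("apple iphone 13 128gb blue", "apple iphone 13 128gb blue"),
    ("apple iphone 13 128gb blue", "samsung galaxy s22 256gb"),
    ("apple iphone 13 128gb blue", "apple iphone 13 256gb blue"),
]
auto, review = route(pairs)
```

Only the ambiguous third pair (same phone, different storage) reaches a human; the clear‑cut cases are resolved automatically, which is where the 50%+ reduction in manual effort comes from.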
To address the heavy computational cost of large language models, the classifier was distilled, achieving more than a six‑fold size reduction and roughly a four‑fold inference speedup with negligible loss in accuracy. The solution has been deployed across multiple business scenarios, saving substantial annotation costs.
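The article does not spell out the compression recipe, but a six‑fold size reduction with negligible accuracy loss is typically achieved with standard knowledge distillation (Hinton et al.): the small student is trained to match the teacher’s temperature‑softened output distribution. A minimal sketch of that loss, offered only as an assumed illustration:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among wrong answers ("dark knowledge").
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

# A student that matches the teacher's logits incurs the minimum possible
# loss (the teacher's own entropy); a contradicting student is penalized.
aligned = distill_loss([4.0, 1.0], [4.0, 1.0])
misaligned = distill_loss([1.0, 4.0], [4.0, 1.0])
```

In practice this term is usually mixed with the ordinary cross‑entropy on the hard labels, and for a yes/no classifier the distributions have just two entries, which keeps the distilled model very small and fast.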
Figure: System diagram of automatic pre‑screening for same‑product identification.
02. AI Research Accepted at ICLR – Dynamic Margin Selection (DynaMS)
His research goal is to explore more efficient model training paradigms for low‑resource settings. Inspired by support vector machines, he proposes selecting training samples based on their distance to the decision boundary, termed Margin Selection (MS). Samples near the boundary carry more information and should be retained.
Since the decision boundary shifts during training, he introduces a dynamic data selection strategy: every N epochs the core sample set is refreshed, forming the Dynamic Margin Selection (DynaMS) method. This approach maintains model convergence to the original loss while reducing data volume.
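The two ideas together can be sketched in a few lines. This is an illustrative simplification, assuming a binary classifier whose signed score magnitude approximates distance to the decision boundary; the paper’s exact margin definition and refresh schedule may differ.

```python
def select_by_margin(samples, score_fn, keep_ratio=0.5):
    """Margin Selection: keep the fraction of samples closest to the
    decision boundary (smallest |score|), as they carry the most information."""
    ranked = sorted(samples, key=lambda x: abs(score_fn(x)))
    k = max(1, int(len(ranked) * keep_ratio))
    return ranked[:k]

def train_dynams(samples, score_fn, train_fn, epochs=12, refresh_every=4):
    """Dynamic Margin Selection: since the boundary moves as training runs,
    re-rank the full pool with the current model every `refresh_every`
    epochs and rebuild the core set before continuing to train on it."""
    core = select_by_margin(samples, score_fn)
    for epoch in range(epochs):
        if epoch > 0 and epoch % refresh_every == 0:
            core = select_by_margin(samples, score_fn)  # refresh core set
        train_fn(core)
    return core
```

Training thus always touches only the core subset, reducing data volume per epoch, while the periodic refresh keeps the subset aligned with the current decision boundary.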
Figure: Illustration of the Dynamic Margin Selection method based on classification margin.
The work was accepted at the top‑tier conference ICLR, showing that a carefully chosen, information‑rich subset can approach full‑data performance with far less data, improving on the power‑law scaling of model performance with dataset size.
03. Professional Growth – Self‑Discipline for Reliable Results
The author reflects on a diverse academic background—from electric‑vehicle wireless charging to robotics and finally AI—emphasizing that true innovation comes from deep, sustained focus on core problems rather than superficial breadth.
He advocates continuous learning in the fast‑moving AI field, active participation in academic activities, and translating research breakthroughs into real‑world product impact.
His four‑year journey at JD Retail illustrates a model for rapid technical growth within a large e‑commerce company.
Paper: “DynaMS: Dynamic Margin Selection for Efficient Deep Learning”
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.