How JD’s NR‑Rino Model Cracked the DROP Benchmark with 90% Accuracy
The JD Intelligent Customer Service team's NR‑Rino model topped the DROP leaderboard at 90.26% accuracy by improving the multi‑head predictor architecture and the training strategy, showcasing advanced discrete reasoning for machine reading comprehension and promising broader AI applications in finance, logistics, and health.
Recently, the JD Intelligent Customer Service (JDAI) research team announced that their NR‑Rino model achieved a 90.26% accuracy on the DROP leaderboard, the leading benchmark for discrete‑reasoning reading comprehension.
Background on DROP
Machine reading comprehension requires understanding natural language text, and the DROP dataset pushes this further by demanding numerical reasoning such as addition, subtraction, counting, and sorting. Many top institutions, including Google Research, Ping An, and Tencent, have competed on this challenging benchmark.
Challenges of Discrete Reasoning
While transformer‑based models like BERT have surpassed human performance on simpler QA datasets (e.g., SQuAD), they struggle when questions involve multiple constraints, logical steps, or numeric calculations.
Existing Approaches
Two main families of methods have been explored:
Semantic parsing: converting unstructured text into structured tables or programs (e.g., NeRd) to enable interpretable reasoning, though it requires costly rule design or annotation.
Multi‑head predictors: models such as NAQANet, NumNet, and QDGAT decompose the task into multiple prediction heads that handle span extraction, counting, and arithmetic expressions, often using graph neural networks to model relationships among numbers.
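The multi‑head idea can be sketched in a few lines: a shared encoding feeds several specialized heads, and a type classifier routes each question to the head that produces the final answer. This is a minimal illustrative sketch, not the NR‑Rino (or NAQANet) implementation; the function names and the toy scoring are assumptions.

```python
# Minimal sketch of the multi-head predictor paradigm: one head per
# answer type (span, count, arithmetic expression), plus a dispatcher
# that selects a head based on the predicted answer type.

def span_head(tokens, scores):
    """Pick the highest-scoring contiguous span (length <= 2 here)."""
    best, best_score = None, float("-inf")
    for i in range(len(tokens)):
        for j in range(i, min(i + 2, len(tokens))):
            s = sum(scores[i:j + 1])
            if s > best_score:
                best, best_score = " ".join(tokens[i:j + 1]), s
    return best

def count_head(count_scores):
    """Classify the answer as one of a small set of counts (0, 1, 2, ...)."""
    return max(range(len(count_scores)), key=count_scores.__getitem__)

def arithmetic_head(numbers, signs):
    """Combine passage numbers with predicted signs (+1, -1, or 0)."""
    return sum(n * s for n, s in zip(numbers, signs))

def predict(answer_type, **inputs):
    """Dispatch to the head matching the predicted answer type."""
    heads = {"span": span_head, "count": count_head,
             "arithmetic": arithmetic_head}
    return heads[answer_type](**inputs)
```

For example, a "how many more points" question might route to the arithmetic head with `predict("arithmetic", numbers=[23, 7], signs=[1, -1])`, which evaluates the expression 23 − 7.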
NR‑Rino Innovations
NR‑Rino builds on the multi‑head predictor paradigm and introduces two key improvements:
Model architecture: a three‑stage design comprising an encoding layer (ALBERT‑xxlarge), a numeric reasoning layer (a multi‑layer Transformer that explicitly models digit positions), and a prediction layer. The numeric reasoning layer replaces previous graph‑based modules, allowing richer context‑aware digit representations.
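To make "explicitly models digit positions" concrete, here is a simplified, assumed sketch of the underlying idea: splitting each number token into digits paired with their place values, which lets a sequence model reason about magnitude digit by digit. This is an illustration of the concept, not the actual NR‑Rino layer.

```python
# Illustrative sketch: represent a number as (digit, place-value) pairs
# so that digit positions are explicit to a downstream model.

def digits_with_places(number_token):
    """Split '2048' into (digit, place-value-exponent) pairs:
    [('2', 3), ('0', 2), ('4', 1), ('8', 0)]."""
    n = len(number_token)
    return [(d, n - 1 - i) for i, d in enumerate(number_token)]

def compare_by_digits(a, b):
    """Compare two non-negative integer strings via digit/place pairs --
    the kind of magnitude relation a numeric reasoning layer must capture.
    Returns 1 if a > b, -1 if a < b, 0 if equal."""
    da, db = digits_with_places(a), digits_with_places(b)
    if len(da) != len(db):                 # more digits => larger number
        return 1 if len(da) > len(db) else -1
    for (x, _), (y, _) in zip(da, db):     # leftmost differing digit wins
        if x != y:
            return 1 if x > y else -1
    return 0
```

Surface string comparison would wrongly rank "97" above "102"; pairing digits with place values recovers the correct ordering, which is the intuition behind position‑aware digit representations.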
Training strategies: regularization that preserves pretrained language‑model knowledge (a parameter‑wise penalty pulling weights toward the original ALBERT weights) and a dropout‑based consistency loss that feeds each sample through the model twice and penalizes divergent outputs.
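The two regularizers can be sketched in framework‑free Python. This is a hedged illustration of the ideas described above, not JD's training code: parameters are plain lists of floats rather than tensors, and the coefficient values are illustrative assumptions.

```python
import math

def pretrained_anchor_penalty(params, pretrained_params, coeff=0.01):
    """Parameter-wise L2 penalty pulling fine-tuned weights back toward
    the original (pretrained) weights, preserving pretrained knowledge."""
    return coeff * sum((p - p0) ** 2
                       for p, p0 in zip(params, pretrained_params))

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two probability distributions."""
    kl = lambda a, b: sum(ai * math.log((ai + eps) / (bi + eps))
                          for ai, bi in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

def dropout_consistency_loss(pass1_probs, pass2_probs):
    """Feed each sample twice (different dropout masks apply in a real
    model) and penalize divergence between the two output distributions."""
    return symmetric_kl(pass1_probs, pass2_probs)
```

When the two dropout passes agree, the consistency loss is near zero; when they diverge, the loss grows, pushing the model toward predictions that are stable under dropout noise.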
The model encodes the passage and question, enriches digit tokens with positional information, fuses representations at various granularities (number, passage, question), and finally predicts the answer type (span, count, arithmetic expression, etc.) before generating the final answer.
Impact and Future Directions
By topping the DROP leaderboard, NR‑Rino demonstrates that enhanced numeric reasoning can significantly close the gap between machines and humans on complex reading tasks. The underlying capabilities are expected to benefit JD’s retail, logistics, health services, as well as external domains such as financial report analysis, sports data analytics, and intelligent RPA.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is JD Technology Group's platform for technical sharing and communication among AI, cloud computing, IoT, and related developers. It publishes JD product technical information, industry content, and tech‑event news. Embrace technology, partner with developers, and envision the future together.