How JD’s NR‑Rino Model Cracked the DROP Benchmark with 90% Accuracy
The JD Intelligent Customer Service team's NR‑Rino model topped the DROP leaderboard at 90.26% accuracy by improving the multi‑head predictor architecture and the training strategy, showcasing advanced discrete reasoning for machine reading comprehension and promising broader AI applications in finance, logistics, and health.
Recently, the JD Intelligent Customer Service (JDAI) research team announced that their NR‑Rino model achieved a 90.26% accuracy on the DROP leaderboard, the leading benchmark for discrete‑reasoning reading comprehension.
Background on DROP
Machine reading comprehension requires understanding natural language text, and the DROP dataset pushes this further by demanding numerical reasoning such as addition, subtraction, counting, and sorting. Many top institutions, including Google Research, Ping An, and Tencent, have competed on this challenging benchmark.
Challenges of Discrete Reasoning
While transformer‑based models like BERT have surpassed human performance on simpler QA datasets (e.g., SQuAD), they struggle when questions involve multiple constraints, logical steps, or numeric calculations.
Existing Approaches
Two main families of methods have been explored:
Semantic parsing: converting unstructured text into structured tables or programs (e.g., NeRd) to enable interpretable reasoning, though it requires costly rule design or annotation.
Multi‑head predictors: models such as NAQANet, NumNet, and QDGAT decompose the task into multiple prediction heads that handle span extraction, counting, and arithmetic expressions, often using graph neural networks to model relationships among numbers.
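The multi‑head idea can be sketched in a few lines: a shared encoding feeds several specialized heads, and a type classifier routes each question to the head that produces the final answer. This is a minimal illustrative sketch, not the NR‑Rino (or NAQANet) implementation; the function names and the toy scoring are assumptions.

```python
# Minimal sketch of the multi-head predictor paradigm: one head per
# answer type (span, count, arithmetic expression), plus a dispatcher
# that selects a head based on the predicted answer type.

def span_head(tokens, scores):
    """Pick the highest-scoring contiguous span (length <= 2 here)."""
    best, best_score = None, float("-inf")
    for i in range(len(tokens)):
        for j in range(i, min(i + 2, len(tokens))):
            s = sum(scores[i:j + 1])
            if s > best_score:
                best, best_score = " ".join(tokens[i:j + 1]), s
    return best

def count_head(count_scores):
    """Classify the answer as one of a small set of counts (0, 1, 2, ...)."""
    return max(range(len(count_scores)), key=count_scores.__getitem__)

def arithmetic_head(numbers, signs):
    """Combine passage numbers with predicted signs (+1, -1, or 0)."""
    return sum(n * s for n, s in zip(numbers, signs))

def predict(answer_type, **inputs):
    """Dispatch to the head matching the predicted answer type."""
    heads = {"span": span_head, "count": count_head,
             "arithmetic": arithmetic_head}
    return heads[answer_type](**inputs)
```

For example, a "how many more points" question might route to the arithmetic head with `predict("arithmetic", numbers=[23, 7], signs=[1, -1])`, which evaluates the expression 23 − 7.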
NR‑Rino Innovations
NR‑Rino builds on the multi‑head predictor paradigm and introduces two key improvements:
Model architecture: a three‑stage design comprising an encoding layer (ALBERT‑xxlarge), a numeric reasoning layer (a multi‑layer Transformer that explicitly models digit positions), and a prediction layer. The numeric reasoning layer replaces previous graph‑based modules, allowing richer context‑aware digit representations.
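To make "explicitly models digit positions" concrete, here is a simplified, assumed sketch of the underlying idea: splitting each number token into digits paired with their place values, which lets a sequence model reason about magnitude digit by digit. This is an illustration of the concept, not the actual NR‑Rino layer.

```python
# Illustrative sketch: represent a number as (digit, place-value) pairs
# so that digit positions are explicit to a downstream model.

def digits_with_places(number_token):
    """Split '2048' into (digit, place-value-exponent) pairs:
    [('2', 3), ('0', 2), ('4', 1), ('8', 0)]."""
    n = len(number_token)
    return [(d, n - 1 - i) for i, d in enumerate(number_token)]

def compare_by_digits(a, b):
    """Compare two non-negative integer strings via digit/place pairs --
    the kind of magnitude relation a numeric reasoning layer must capture.
    Returns 1 if a > b, -1 if a < b, 0 if equal."""
    da, db = digits_with_places(a), digits_with_places(b)
    if len(da) != len(db):                 # more digits => larger number
        return 1 if len(da) > len(db) else -1
    for (x, _), (y, _) in zip(da, db):     # leftmost differing digit wins
        if x != y:
            return 1 if x > y else -1
    return 0
```

Surface string comparison would wrongly rank "97" above "102"; pairing digits with place values recovers the correct ordering, which is the intuition behind position‑aware digit representations.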
Training strategies: regularization that preserves pretrained language‑model knowledge (a parameter‑wise penalty pulling weights toward the original ALBERT weights) and a dropout‑based consistency loss that feeds each sample through the model twice and penalizes divergent outputs.
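The two regularizers can be sketched in framework‑free Python. This is a hedged illustration of the ideas described above, not JD's training code: parameters are plain lists of floats rather than tensors, and the coefficient values are illustrative assumptions.

```python
import math

def pretrained_anchor_penalty(params, pretrained_params, coeff=0.01):
    """Parameter-wise L2 penalty pulling fine-tuned weights back toward
    the original (pretrained) weights, preserving pretrained knowledge."""
    return coeff * sum((p - p0) ** 2
                       for p, p0 in zip(params, pretrained_params))

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two probability distributions."""
    kl = lambda a, b: sum(ai * math.log((ai + eps) / (bi + eps))
                          for ai, bi in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

def dropout_consistency_loss(pass1_probs, pass2_probs):
    """Feed each sample twice (different dropout masks apply in a real
    model) and penalize divergence between the two output distributions."""
    return symmetric_kl(pass1_probs, pass2_probs)
```

When the two dropout passes agree, the consistency loss is near zero; when they diverge, the loss grows, pushing the model toward predictions that are stable under dropout noise.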
The model encodes the passage and question, enriches digit tokens with positional information, fuses representations at various granularities (number, passage, question), and finally predicts the answer type (span, count, arithmetic expression, etc.) before generating the final answer.
Impact and Future Directions
By topping the DROP leaderboard, NR‑Rino demonstrates that enhanced numeric reasoning can significantly close the gap between machines and humans on complex reading tasks. The underlying capabilities are expected to benefit JD’s retail, logistics, health services, as well as external domains such as financial report analysis, sports data analytics, and intelligent RPA.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is JD Technology Group's platform for technical sharing and communication among AI, cloud computing, IoT, and related developers. It publishes JD product technical information, industry content, and tech‑event news. Embrace technology, partner with developers, and envision the future together.