
Kuaishou Tech Team Highlights Multiple ICML 2026 Papers Across AI Domains

The Kuaishou technology team reports that several of its papers were accepted at ICML 2026, including a spotlight paper on metaphorical video understanding. The other papers cover causal discovery for irregularly sampled time series, image super‑resolution, large‑scale notification dispatch, full‑order ranking, phase‑aware MoE for agentic RL, end‑to‑end e‑commerce search, spatial‑reasoning rewards for image editing, a unified SWE benchmark, video temporal grounding, and interpretable transformers. The team also invites attendees to visit its booth, B101, in Seoul.

Kuaishou Tech

Jun 18, 2026

Selected Papers

MetaphorVU: Towards Metaphorical Video Understanding (Spotlight)

Paper: https://openreview.net/forum?id=yKcBAJMPXZ

Code: https://github.com/icip-cas/MetaphorVU

Abstract: Metaphorical videos convey complex ideas but are difficult for current multimodal large language models (MLLMs) to interpret. The authors introduce MetaphorVU‑Bench, the first systematic benchmark for this task, and demonstrate a large performance gap between MLLMs and humans caused by poor cross‑domain mapping. To address this, they construct a metaphor knowledge graph and propose MetaphorBoost, a reasoning‑time framework that consistently improves performance.

Causal Discovery for Irregularly Sampled Time Series with Consistency Guarantees

Paper: https://openreview.net/forum?id=y5GiPedJPV

Abstract: The work tackles causal discovery on irregularly sampled time series, a challenge in finance, healthcare, and climate science. Existing two‑stage pipelines (impute then discover) or joint neural approaches lack explicit mechanisms to ensure consistency between imputation and structure learning. The authors propose ReTimeCausal, an EM‑based framework that alternates between data imputation (kernel‑sparse regression) and causal graph estimation with structural constraints. The method provides theoretical consistency guarantees even under high missing rates. Experiments on challenging irregular‑sampling scenarios show ReTimeCausal outperforms prior methods.

Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super‑Resolution

Paper: https://arxiv.org/abs/2605.23264

Code: https://github.com/wafer-bob/ASASR

Abstract: Existing isotropic loss functions misalign with the natural image manifold, leading to spectral mismatch and artifacts in super‑resolution. The authors propose ASASR, a Sobolev‑induced framework that colors noise to match natural image spectra. A Riesz‑based adversarial module generates worst‑case Sobolev gradients as negative samples, guiding optimization toward spectrally consistent solutions. Extensive experiments show ASASR achieves superior fidelity and artifact suppression on major generative baselines.
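For readers unfamiliar with the term, the standard Sobolev norm of order s shows what "spectral weighting" can mean here. This is the textbook definition for a 2‑D image f with Fourier transform f̂, not ASASR's actual loss; how the paper chooses or learns its weighting is not stated in this summary.

```latex
\|f\|_{H^{s}}^{2} \;=\; \int_{\mathbb{R}^{2}} \left(1 + |\xi|^{2}\right)^{s} \, \bigl|\hat{f}(\xi)\bigr|^{2} \, d\xi
```

With s = 0 the weight is constant, so every frequency counts equally, as in a plain L2 loss (up to the normalization constant of the Fourier convention). A nonzero s re‑weights frequencies by magnitude, which is one way a loss can be made to follow a non‑flat ("colored") spectrum.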

Large‑Scale Notification Dispatch with Bundle Treatments and Multi‑Outcome Uplift Optimization

Paper: https://icml.cc/virtual/2026/poster/65977

Abstract: Push‑notification allocation is modeled as a constrained optimization over bundle‑level interventions. BUOPLR decouples uplift estimation from constrained decision making via a two‑stage pipeline: (1) a cross‑treatment network learns multi‑outcome uplift for each bundle; (2) Lagrangian relaxation and decision‑space pruning solve the large‑scale constrained problem. Offline and online experiments report superior key metrics and full deployment in a billion‑user system.
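The second stage can be illustrated with a generic Lagrangian‑relaxation sketch. It is not BUOPLR's implementation: the uplift and cost tables, the single budget constraint, and the function names are all hypothetical, and decision‑space pruning is not shown. The sketch assumes bundle 0 means "send nothing" with zero uplift and zero cost, and that the budget is non‑negative.

```python
from typing import List, Tuple


def pick_bundles(uplift: List[List[float]], cost: List[List[float]], lam: float) -> List[int]:
    """Per user, pick the bundle with the best penalized score uplift - lam * cost."""
    choices = []
    for u_row, c_row in zip(uplift, cost):
        scores = [u - lam * c for u, c in zip(u_row, c_row)]
        # Ties resolve to the lowest index (bundle 0 is "send nothing").
        choices.append(max(range(len(scores)), key=scores.__getitem__))
    return choices


def total(values: List[List[float]], choices: List[int]) -> float:
    """Sum the chosen entry of each row."""
    return sum(row[j] for row, j in zip(values, choices))


def solve_with_budget(uplift, cost, budget, iters: int = 50) -> Tuple[float, List[int]]:
    """Bisect on the multiplier until the relaxed solution fits the budget."""
    unconstrained = pick_bundles(uplift, cost, 0.0)
    if total(cost, unconstrained) <= budget:
        return 0.0, unconstrained
    lo, hi = 0.0, 1.0
    # Grow hi until the penalty is strong enough to satisfy the budget.
    while total(cost, pick_bundles(uplift, cost, hi)) > budget:
        hi *= 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if total(cost, pick_bundles(uplift, cost, mid)) > budget:
            lo = mid   # still over budget: penalize cost more
        else:
            hi = mid   # feasible: try a weaker penalty
    return hi, pick_bundles(uplift, cost, hi)


# Hypothetical predicted uplift and cost for 3 users x 3 bundles.
uplift = [[0.0, 0.30, 0.50], [0.0, 0.10, 0.12], [0.0, 0.40, 0.45]]
cost = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
lam, choices = solve_with_budget(uplift, cost, budget=3)
print(lam, choices, total(cost, choices), total(uplift, choices))
```

Because the penalized objective separates per user, each evaluation is a single pass over users, which is what makes this style of relaxation attractive at large scale.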

Learning to Rank by Directly Optimizing Full‑Order Probabilities

Paper: https://openreview.net/forum?id=fch6yT64ZH

Code: https://github.com/tyxaaron/FOB

Abstract: Ranking is formulated as estimating the likelihood of a total ordering, where the number of possible orderings grows factorially with the number of items. The authors introduce the Full‑Order Bound (FOB), a tractable lower bound on the full‑order likelihood constructed from per‑item ranking constraints. Under a log‑concave density assumption, FOB yields a convex optimization problem solved by Safe‑Region Gradient Ascent (SRGA). Experiments on synthetic and large‑scale benchmarks show FOB improves NDCG and overall ranking performance.
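To make "likelihood of a total ordering" concrete, the sketch below uses the classic Plackett‑Luce model as a stand‑in. It is not the paper's FOB or SRGA, and the scores are made up. It also shows why enumeration does not scale: four items already give 24 orderings.

```python
import math
from itertools import permutations


def plackett_luce_log_prob(scores, order):
    """Log-probability of `order` (best item first) under Plackett-Luce."""
    remaining = list(order)
    logp = 0.0
    for item in order:
        # Probability of picking `item` next among the items not yet placed.
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        logp += scores[item] - log_norm
        remaining.remove(item)
    return logp


scores = [2.0, 0.5, 1.0, -1.0]            # hypothetical item scores
orders = list(permutations(range(len(scores))))
best = max(orders, key=lambda o: plackett_luce_log_prob(scores, o))
print(len(orders), best)
```

The probabilities over all orderings sum to one, and the most likely ordering sorts items by descending score.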

Phase‑Aware Mixture of Experts for Agentic Reinforcement Learning

Paper: https://arxiv.org/pdf/2602.17038

Code: https://github.com/YsTvT/PA-MoE

Abstract: Token‑level MoE routing fragments temporal coherence in RL agents. PA‑MoE introduces a lightweight phase router that groups tokens belonging to the same temporal phase and assigns them to a shared expert, preserving phase‑specific expertise. Experiments demonstrate improved performance on RL benchmarks compared to standard token‑level MoE.
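The contrast between the two routing granularities can be shown with a toy example. This is only an illustration: the phase labels are given by hand and the per‑phase expert is chosen by summing router logits, both of which are assumptions. PA‑MoE's router is learned, and its details are not described in this summary.

```python
def token_level_route(logits):
    """Top-1 routing: every token independently picks its best expert."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def phase_level_route(logits, phase_ids):
    """Toy phase routing: sum logits within a phase, then share one expert."""
    sums = {}
    for row, phase in zip(logits, phase_ids):
        acc = sums.setdefault(phase, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
    expert_of_phase = {
        phase: max(range(len(acc)), key=acc.__getitem__)
        for phase, acc in sums.items()
    }
    return [expert_of_phase[phase] for phase in phase_ids]


# Hypothetical router logits for 6 tokens and 2 experts, in 2 phases.
logits = [[2.0, 1.0], [0.9, 1.0], [2.0, 0.5], [0.1, 3.0], [1.0, 0.8], [0.2, 2.0]]
phase_ids = [0, 0, 0, 1, 1, 1]
print(token_level_route(logits))             # expert flips token by token
print(phase_level_route(logits, phase_ids))  # one expert per phase
```

In this toy input, token‑level routing alternates between experts inside each phase, while phase‑level routing keeps each phase on a single expert.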

OneSearch: A Unified End‑to‑End Generative Framework for E‑commerce Search

Paper: https://icml.cc/virtual/2026/poster/64836

Code: https://github.com/benchen4395/onesearch-family

Abstract: Traditional multi‑stage cascade search suffers from fragmented computation and conflicting objectives. OneSearch replaces it with a generative model that jointly encodes hierarchical query representations, multi‑view user behavior sequences, and a preference‑aware reward system. Offline evaluation and online A/B tests report statistically significant lifts: CTR +1.67%, buyer count +2.40%, order volume +3.22%, and a 75.4% reduction in operational cost.

SpatialReward: Bridging the Perception Gap in Online RL for Image Editing

Paper: https://arxiv.org/pdf/2602.07458

Code: https://lorangan-ddup.github.io/SpatialReward

Abstract: Online RL for image editing suffers from “attention collapse”, where models ignore cross‑image details. SpatialReward injects explicit spatial reasoning via a “Think‑with‑Boxes” mechanism and a two‑stage SFT + GRPO training pipeline. A curated 260k dataset of spatial reasoning trajectories supports training. Experiments show state‑of‑the‑art gains on EditReward‑Bench (+11.3%) and MMRB2 (+9.1%).
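The GRPO stage mentioned above relies on group‑relative advantages: rewards for several samples of the same prompt are normalized against their own group. The sketch below shows only that normalization step with made‑up rewards. Using the population standard deviation and the `eps` value are assumptions here, and nothing in it is specific to SpatialReward.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical reward-model scores for 4 edits sampled from one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print(advantages)
```

Samples that beat the group average get a positive advantage and the rest a negative one, so no separate value network is needed to form a baseline.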

SWE‑Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models

Paper: https://arxiv.org/abs/2511.05459

Dataset: https://huggingface.co/datasets/Kwaipilot/SWE-Compass

Abstract: Existing coding benchmarks are narrow and language‑biased. SWE‑Compass introduces a three‑dimensional evaluation matrix covering eight task types, eight scenarios, and ten languages, built from 2,000 carefully curated GitHub PRs. Experiments with ten LLMs reveal sharp performance drops on functional implementation and optimization tasks, a strong framework‑model interaction effect, and significant multilingual robustness gaps.

VideoTemp‑o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking‑with‑Videos

Paper: https://arxiv.org/pdf/2602.07801

Code: https://liuwq-bit.github.io/VideoTemp-o3

Abstract: Video QA models that sample frames uniformly can miss key evidence. VideoTemp‑o3 unifies video QA and temporal grounding in a single agentic model. It employs masked‑SFT warm‑start, IoU‑based RL rewards, and a multi‑round data pipeline. The model sets new records on VideoMME (+2.4%), LVBench (+1.7%), Charades‑STA (mIoU 57.8%), and NextGQA (mIoU 33.4%).
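The IoU‑based reward and the mIoU metric both rest on temporal intersection‑over‑union between a predicted and a ground‑truth segment. A minimal sketch of that quantity follows. The segment values are made up, and how VideoTemp‑o3 shapes or combines the reward is not described in this summary.

```python
def temporal_iou(pred, gt):
    """IoU of two time segments given as (start, end) in seconds."""
    (p_start, p_end), (g_start, g_end) = pred, gt
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds, gts):
    """mIoU: average temporal IoU over paired predictions and ground truths."""
    return sum(temporal_iou(p, g) for p, g in zip(preds, gts)) / len(preds)


# Hypothetical predicted vs. annotated segments.
print(temporal_iou((2.0, 6.0), (4.0, 8.0)))
print(mean_iou([(0.0, 5.0), (2.0, 6.0)], [(0.0, 5.0), (10.0, 12.0)]))
```

Used as a reward, this value is 1 for a perfect match, 0 for disjoint segments, and graded in between, so partially correct groundings still receive signal.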

Weights to Code: Extracting Interpretable Algorithms from the Discrete Transformer

Paper: https://arxiv.org/abs/2601.05770

Abstract: The Discrete Transformer separates routing from arithmetic, enabling extraction of Python‑style algorithms via temperature annealing, hypothesis testing, and symbolic regression. The method matches RNN‑based MIPS performance on algorithmic reasoning tasks and supports continuous‑variable dynamics, offering a more controllable and transparent framework for transformer interpretability.
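Temperature annealing, in its generic form, lowers a softmax temperature so that soft weights approach a one‑hot (discrete) choice. The sketch below shows that effect on made‑up logits. Which quantities the Discrete Transformer anneals, and on what schedule, is not stated in this summary, so both are assumptions here.

```python
import math


def softmax(logits, temperature):
    """Numerically stable softmax over logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


logits = [1.0, 2.0, 0.5]                  # hypothetical routing logits
for temperature in (1.0, 0.5, 0.1, 0.01):
    weights = softmax(logits, temperature)
    print(temperature, [round(w, 4) for w in weights])
```

As the temperature falls, the largest logit takes almost all the weight, which is what makes the learned routing readable as a discrete rule.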

