Tagged articles
82 articles
Machine Heart
May 11, 2026 · Artificial Intelligence

Why Visual Perception Limits STEM Large Models and How CodePercept Breaks the Barrier

The authors demonstrate that visual perception, not reasoning, is the primary bottleneck for STEM multimodal large language models, introduce the CodePercept paradigm and the ICC-1M dataset, and show that code‑driven perception dramatically improves performance, surpassing much larger models on new benchmarks.

Benchmark · CVPR2026 · CodePercept
0 likes · 9 min read
Machine Learning Algorithms & Natural Language Processing
May 4, 2026 · Artificial Intelligence

SignThought: A New Gloss‑Free Sign Language Translation Framework for the Deaf Community

The paper introduces SignThought, a gloss‑free sign language translation model that inserts an ordered latent‑thought chain between video encoding and text generation, uses a plan‑then‑ground decoding strategy, and is evaluated on five benchmarks and a newly built 1,311‑hour LC‑HKSLT dataset, achieving state‑of‑the‑art BLEU‑4 and ROUGE scores.

ACL2026 · Dataset · Gloss-Free
0 likes · 11 min read
Machine Heart
Apr 17, 2026 · Artificial Intelligence

Can LLMs Truly Mimic Human Shopping Behavior? The OPeRA Dataset and Evaluation

The paper introduces OPeRA, a step‑wise online‑shopping dataset capturing observations, personas, rationales, and actions from real users, and uses it to benchmark LLMs on next‑action prediction, revealing that even top models like GPT‑4.1 achieve only about 20% accuracy on fine‑grained actions, with persona information offering limited benefit while rationales prove crucial.

AI · Dataset · LLM
0 likes · 9 min read
SuanNi
Apr 6, 2026 · Artificial Intelligence

How OmniLottie Turns Text and Images into High‑Quality Vector Animations

OmniLottie, a collaborative framework from Fudan, HKU, and Queensland University, uses a specialized tokenizer and a large multimodal model to compress Lottie files, generate vector animations from text, images or video, and achieves state‑of‑the‑art performance on custom benchmarks and extensive evaluations.

AI · Dataset · Lottie
0 likes · 11 min read
AI Frontier Lectures
Mar 16, 2026 · Artificial Intelligence

Can Multimodal LLMs Truly Understand Human Emotions? Introducing the MME-Emotion Benchmark

This article presents MME-Emotion, a large‑scale multimodal benchmark that evaluates both emotion recognition and reasoning abilities of multimodal large language models across 27 real‑world scenarios, revealing current models’ significant gaps in emotional intelligence and outlining future research directions.

AI · Benchmark · Dataset
0 likes · 9 min read
SuanNi
Mar 9, 2026 · Artificial Intelligence

How UniScientist Beats GPT‑5.4 on FrontierScience Benchmarks

UniScientist, a 30B‑parameter AI model co‑developed by UniPat AI and Peking University, leverages a meticulously curated scientific dataset and a powerful code interpreter to achieve 33.3% success on the FrontierScience‑Research benchmark, surpassing the newly released GPT‑5.4 and demonstrating superior multi‑disciplinary research capabilities.

AI · Dataset · large language model
0 likes · 12 min read
AIWalker
Mar 5, 2026 · Artificial Intelligence

How ViDA-UGC Leverages Large Multimodal Models for Fine-Grained Visual Quality Assessment

The article introduces ViDA-UGC, a large‑scale UGC visual‑quality dataset and its companion benchmark ViDA‑Bench, explains the MILP‑driven sampling, expert annotation pipeline, and CoT‑based evaluation framework, and shows how fine‑tuning popular multimodal LLMs on this data markedly improves low‑level quality perception, grounding, and description capabilities.

Benchmark · Dataset · chain-of-thought
0 likes · 12 min read
Amap Tech
Feb 13, 2026 · Artificial Intelligence

How ABot‑M0 Achieves Generalist Robot Intelligence with Action Manifold Learning

ABot‑M0 tackles three long‑standing "Tower of Babel" challenges of embodied AI (data fragmentation, inconsistent representations, and training mismatches) by releasing the massive UniACT dataset, introducing Action Manifold Learning for direct action prediction, and designing a plug‑and‑play dual‑path perception architecture that outperforms prior models on multiple robot benchmarks.

Dataset · Embodied AI · Robotics
0 likes · 14 min read
HyperAI Super Neural
Feb 5, 2026 · Artificial Intelligence

16 Embodied AI Datasets Covering Grasping, QA, Logical and Trajectory Reasoning

This article compiles sixteen high‑quality embodied AI datasets—including simulation assets, robot motion retargeting, indoor scenes, multimodal benchmarks, grasping, question answering, trajectory reasoning and large‑scale robot learning collections—detailing their scope, size, and download links to support research on agents that perceive, decide, and act in the physical world.

Dataset · Embodied AI · Robotics
0 likes · 15 min read
Bighead's Algorithm Notes
Jan 11, 2026 · Artificial Intelligence

FinRpt: A Multi‑Agent Framework for Automatic Generation and Evaluation of Stock Research Reports

FinRpt introduces a novel multi‑agent pipeline that builds a high‑quality equity research report (ERR) dataset from six financial data sources, defines a comprehensive 11‑metric evaluation suite, and demonstrates that supervised‑fine‑tuned and reinforcement‑learned LLM agents significantly outperform single LLM baselines in both accuracy and efficiency.

Dataset · FinRpt · LLM
0 likes · 14 min read
Kuaishou Tech
Dec 4, 2025 · Artificial Intelligence

Can a Tree‑Reasoned Model Master Video Emotion Understanding?

The paper introduces VidEmo, a multimodal video foundation model that uses a two‑stage emotion‑clue‑guided reasoning framework and a large emotion‑centric dataset (Emo‑CFG) to achieve state‑of‑the‑art performance on facial attribute, expression, and fine‑grained emotion tasks, surpassing Gemini 2.0.

AI · Computer Vision · Dataset
0 likes · 15 min read
Aikesheng Open Source Community
Oct 29, 2025 · Artificial Intelligence

What Makes BiomedSQL and LogicCat the Toughest Text‑to‑SQL Benchmarks for LLMs?

BiomedSQL and LogicCat are two newly released Text‑to‑SQL datasets that challenge large language models with complex biomedical reasoning, multi‑step logical inference, and domain‑specific knowledge, offering detailed analyses of query types, scientific reasoning categories, and performance gaps that highlight current LLM limitations.

Biomedical · Dataset · LLM
0 likes · 9 min read
Bighead's Algorithm Notes
Oct 12, 2025 · Artificial Intelligence

Trading-R1: Open-Source LLM Framework for Explainable Financial Trading

This article reviews Trading‑R1, an open‑source LLM reasoning framework that integrates multimodal financial data, three‑stage supervised fine‑tuning, and reinforcement learning to generate structured investment arguments and risk‑adjusted trade decisions, achieving superior Sharpe ratio and drawdown performance on real‑world stock and ETF tests.

Dataset · Financial Trading · LLM
0 likes · 11 min read
Data Party THU
Oct 9, 2025 · Artificial Intelligence

Can One Model Master All Audio‑Visual Tasks? Introducing Crab’s Unified Approach

This article presents Crab, a unified audio‑visual scene understanding model that leverages a novel explicit‑cooperation learning paradigm, introduces the AV‑UIE dataset with explicit reasoning steps, and demonstrates superior performance across temporal, spatial, pixel‑level, and spatio‑temporal tasks through extensive experiments and ablations.

Benchmark · Dataset · LoRA
0 likes · 12 min read
Fun with Large Models
Sep 2, 2025 · Artificial Intelligence

How to Improve Agent Performance with Fine‑Tuning: Key Strategies for AI Interviews

This article explains how to boost large‑model agent performance for interview questions by using efficient fine‑tuning—building multi‑tool parallel and chain‑call datasets—and reinforcement‑learning fine‑tuning with reward functions that target tool accuracy, task completion, and call efficiency, illustrated with concrete JSON examples and open‑source references.

Agent · Dataset · Fine-tuning
0 likes · 9 min read
Data Party THU
Jul 29, 2025 · Artificial Intelligence

How High‑Low UAV Collaboration Beats Solo Drones in Complex Navigation

Researchers from Beihang University present a high‑low UAV collaborative navigation paradigm, introducing the HaL‑13k dataset and AeroDuo framework, detailing high‑altitude planning with Pilot‑LLM, low‑altitude three‑stage search, and demonstrating superior target finding in complex environments.

AeroDuo · Dataset · Pilot-LLM
0 likes · 7 min read
Amap Tech
Jul 24, 2025 · Artificial Intelligence

FingER: Fine-Grained Evaluation and Reasoning for AI-Generated Videos

The paper introduces FingER, an entity-level evaluation framework and the FingER-Instruct-60k dataset for assessing AI-generated video quality with fine-grained reasoning, and demonstrates state-of-the-art zero-shot performance on multiple benchmarks using novel training strategies.

AI-generated video · Dataset · fine-grained evaluation
0 likes · 9 min read
AI Frontier Lectures
Jul 17, 2025 · Artificial Intelligence

Top 8 Tencent Youtu Papers Accepted at ICCV 2025: Innovations in AI and Vision

The 20th ICCV conference announced 8 papers from Tencent Youtu Lab covering stylized face recognition, AI‑generated image detection, heterogeneous knowledge distillation, multi‑conditional diffusion, multimodal LLM distillation, palmprint recognition, low‑light vision, and oracle bone script decipherment, each pushing the frontier of computer vision and AI research.

Computer Vision · Dataset · ICCV 2025
0 likes · 17 min read
Kuaishou Tech
Jul 16, 2025 · Artificial Intelligence

How KuaiMM Conversation Revolutionizes Multimodal Dialogue on Short‑Video Platforms

The KuaiMM Conversation project introduces a multimodal large‑model‑driven dialogue system for Kuaishou, featuring the world's first short‑video mixed‑dialogue dataset, a Chain‑of‑Thought interaction framework, and large‑scale industrial deployments that dramatically improve live‑stream comments and intelligent customer service.

Dataset · Kuaishou · chain-of-thought
0 likes · 11 min read
AIWalker
Jun 30, 2025 · Artificial Intelligence

ICCV 2025 MIPI Workshop Launches ViDA-UGC: A New UGC Image Quality Assessment Challenge

The ICCV MIPI workshop introduces the ViDA-UGC competition, presenting a richly annotated UGC image quality dataset, a benchmark suite covering degradation detection, region perception, and quality description, detailed evaluation metrics, submission formats, prize information, and open participation for researchers worldwide.

Benchmark · Dataset · ICCV
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Jun 30, 2025 · Artificial Intelligence

Unlocking Small LLM Power: Variable‑Length Chain Distillation with DistilQwen‑ThoughtY

This article introduces a variable‑length chain‑of‑thought distillation technique built on Alibaba Cloud PAI’s EasyDistill toolkit, presents the high‑quality OmniThought‑0528 dataset, details the training of the DistilQwen‑ThoughtY 4B/8B/32B models, and provides code and usage examples for researchers and practitioners.

Dataset · Distillation · LLM
0 likes · 15 min read
AIWalker
May 29, 2025 · Artificial Intelligence

ImgEdit-Bench Exposes Weak Image Editing Models – A ‘Death Test’ Reveals Who’s Struggling

ImgEdit introduces a large‑scale, high‑quality editing dataset and the ImgEdit‑Bench benchmark, detailing a robust data‑generation pipeline, multi‑round editing tasks, and a specialized evaluation model, and demonstrates through extensive experiments that its ImgEdit‑E1 model outperforms existing open‑source editors and narrows the gap with closed‑source systems.

AI · Benchmark · Dataset
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
May 29, 2025 · Artificial Intelligence

How OmniThought Enables Adaptive Reasoning Chains for Better LLM Performance

This article introduces the OmniThought dataset, which annotates over two million chain‑of‑thought reasoning steps with Reasoning Verbosity and Cognitive Difficulty scores, and explains how these metrics guide the training of DistilQwen‑ThoughtX models that adapt chain length to task difficulty, achieving superior performance compared to existing distilled LLMs.

CoT · Dataset · Distillation
0 likes · 16 min read
Bilibili Tech
May 16, 2025 · Artificial Intelligence

How FineVQ Sets New Standards for Fine‑Grained UGC Video Quality Assessment

The article introduces FineVD, the first large‑scale multi‑dimensional UGC video quality dataset, and presents FineVQ, a unified model that predicts quality scores, attributes, and distortion types across six dimensions, achieving state‑of‑the‑art performance on multiple benchmarks and cross‑dataset evaluations.

Computer Vision · Dataset · Deep Learning
0 likes · 9 min read
Volcano Engine Developer Services
Apr 14, 2025 · Artificial Intelligence

Introducing Multi‑SWE‑bench: The First Multilingual Code‑Fix Benchmark for LLMs

ByteDance’s Doubao model team has open‑sourced Multi‑SWE‑bench, a multilingual benchmark covering seven major programming languages with 1,632 real‑world bug‑fix tasks, complete Docker environments, difficulty grading, and strict human validation, aiming to evaluate and advance large‑language‑model code‑repair capabilities beyond Python.

Dataset · LLM Benchmark · Software Engineering
0 likes · 11 min read
JD Tech
Apr 7, 2025 · Artificial Intelligence

Embodied Intelligence: From Data Scarcity to Real-World Robotic Manipulation – JD Explore Academy’s System Architecture and Research Advances

The article outlines JD Explore Academy’s recent embodied‑intelligence research, describing the challenges of data scarcity and precise manipulation, their ROS‑based high‑extensibility system architecture, dual‑arm teleoperation technology, a data‑efficient end‑effector imitation method, and the open JD ManiData dataset that together push robots from lab demos to practical tasks such as coffee‑making.

AI · Dataset · Embodied Intelligence
0 likes · 7 min read
Meituan Technology Team
Mar 27, 2025 · Artificial Intelligence

Q-Eval-100K Dataset and Q-Eval-Score Evaluation Framework for Text-to-Visual Generation

The Q‑Eval‑100K dataset, comprising 100K AIGC images and videos with separate visual‑quality and textual‑consistency annotations, powers the open‑source Q‑Eval‑Score framework, which fine‑tunes multimodal models to deliver state‑of‑the‑art, scalable, and objective evaluation, including a “vague‑to‑specific” strategy for long prompts, surpassing existing benchmarks.

AIGC · Dataset · evaluation
0 likes · 9 min read
Amap Tech
Mar 19, 2025 · Artificial Intelligence

Driving by the Rules: Integrating Lane-Level Traffic Regulations into Online HD Maps

Gaode Map and Xi'an Jiaotong University introduce the “Driving by the Rules” task, releasing the MapDR benchmark that integrates lane‑level traffic‑sign regulations into online‑constructed HD maps, and provide modular (VLE‑MEE) and end‑to‑end (RuleVLM) baselines to evaluate rule extraction and lane association.

AI · Benchmark · Dataset
0 likes · 8 min read
Kuaishou Tech
Feb 20, 2025 · Artificial Intelligence

Second Short-Form Video Quality Assessment and Enhancement Challenge (CVPR NTIRE 2025)

The second short-form video quality assessment and enhancement challenge, co‑organized by Kuaishou's audio‑video team and the Intelligent Media Computing Lab, invites global researchers to develop efficient quality assessment models and diffusion‑based super‑resolution methods using the new KwaiSR dataset, with prize money and potential CVPR workshop paper invitations.

AI competition · CVPR NTIRE · Dataset
0 likes · 9 min read
DataFunTalk
Feb 18, 2025 · Artificial Intelligence

CODEI/O: Leveraging Code to Train Large Language Models for Enhanced Reasoning

The DeepSeek team introduced CODEI/O, a massive dataset that converts code into natural‑language reasoning chains, and demonstrated that training large language models on this data with a two‑stage strategy markedly improves their performance on diverse reasoning tasks, including non‑code domains.

CODEI/O · Dataset · code reasoning
0 likes · 8 min read
Meituan Technology Team
Feb 9, 2025 · Artificial Intelligence

NTIRE 2025 XGC AI-Generated Video Quality Assessment Challenge

The NTIRE 2025 XGC AI‑Generated Video Quality Assessment Challenge, hosted at the CVPR workshop, invites participants to build VQA models that predict mean opinion scores for 34,029 AI‑generated videos created from 4,689 prompts using 14 generation models. Training, validation, and test splits are provided as JSON, and submissions are scored by the average of PLCC and SROCC. Key dates run from February 5 to June 15, 2025, and prize money of up to $1,200 is offered.

AI video · CVPR · Challenge
0 likes · 6 min read
DataFunSummit
Jan 1, 2025 · Artificial Intelligence

Challenges and Evaluation Strategies for LLM Agents in 2024

The article outlines the rapid progress of LLM agents in 2024 while highlighting key difficulties in planning capabilities, evaluation methods, dataset generation, and metric design, and suggests practical combinations and product‑level enhancements to improve efficiency, accuracy, and usability.

AI · Agent · Dataset
0 likes · 3 min read
Alibaba Cloud Big Data AI Platform
Nov 7, 2024 · Artificial Intelligence

How VideoCLIP‑XL Boosts Long‑Description Understanding in Video CLIP Models

VideoCLIP‑XL, a new video CLIP model introduced by Alibaba Cloud AI Platform and Sun Yat‑sen University, enhances long‑text description comprehension through a large‑scale VILD dataset, a text‑similarity guided principal component matching method, and novel DDR and HDR ranking tasks, achieving superior performance on multiple video‑text benchmarks.

Benchmark · Dataset · Long Description
0 likes · 13 min read
Alimama Tech
Nov 6, 2024 · Artificial Intelligence

How AI Generates Synchronized Video Narrations for E‑Commerce

This article presents the research behind Synchronized Video Storytelling, introducing the E‑SyncVidStory dataset, the VideoNarrator multimodal architecture, and extensive experiments that demonstrate high‑quality, product‑aware video narration generation for e‑commerce applications.

Dataset · LLM · Multimodal AI
0 likes · 12 min read
AntTech
Sep 3, 2024 · Artificial Intelligence

2024 Inclusion Bund Conference AI Innovation Competition and Deepfake Challenge Results

The 2024 Inclusion Bund Conference in Shanghai announced the winners of its newly added AI Innovation Competition, including the AFAC Financial Intelligence Contest and the Global Deepfake Attack‑Defense Challenge, highlighting participation from over 7,000 teams across more than 20 countries and showcasing cutting‑edge deepfake detection achievements.

AI · Computer Vision · Dataset
0 likes · 7 min read
Kuaishou Tech
Jul 1, 2024 · Artificial Intelligence

Short-Form Video Quality Assessment Competition at CVPR NTIRE 2024: Dataset, Challenge Overview, and Top Winning Solutions

The CVPR NTIRE 2024 short-form video quality assessment competition introduced the KVQ dataset, attracted over 200 teams, evaluated submissions using SROCC and PLCC metrics, and highlighted the winning approaches of SJTU MMLab, IH‑VQA, and TVQE, showcasing advances in AI‑driven video quality evaluation.

AI competition · Computer Vision · Dataset
0 likes · 9 min read
Baobao Algorithm Notes
Mar 18, 2024 · Industry Insights

Inside the 2024 KDD Cup ShopBench Challenge: Tasks, Data, and Evaluation Metrics

The 2024 KDD Cup introduces the ShopBench benchmark, a large‑scale LLM competition that simulates real‑world online shopping with 57 tasks, over 20,000 questions, and multiple tracks covering concept understanding, knowledge reasoning, user‑behavior alignment, multilingual ability, and an all‑round track, all evaluated with task‑specific metrics and a hidden test set.

Benchmark · Dataset · Evaluation Metrics
0 likes · 11 min read
Kuaishou Tech
Mar 6, 2024 · Artificial Intelligence

Short Video Quality Assessment Competition (KVQ) at CVPR NTIRE 2024

The CVPR NTIRE 2024 workshop hosts the first short‑video quality assessment competition, introducing the KVQ dataset of 4,200 videos across nine scenes, providing training/validation data, a baseline 3D Swin‑Transformer model, detailed competition rules, rewards, and organizer contacts.

AI · Computer Vision · Dataset
0 likes · 7 min read
Rare Earth Juejin Tech Community
Dec 29, 2023 · Artificial Intelligence

Overview of Major Benchmark Datasets for Evaluating Large Language Models

This article provides a comprehensive overview of major benchmark datasets—including CMMLU, MMLU, C‑Eval, GSM8K, Gaokao‑Bench, AGIEval, MATH, BBH, HumanEval, and MBPP—used to evaluate large language models' knowledge, reasoning, and coding abilities, and summarizes related leaderboards and evaluation tools.

Dataset · LLM · artificial intelligence
0 likes · 14 min read
AntTech
Dec 19, 2023 · Artificial Intelligence

RJUA‑QA: A Comprehensive Urology QA Dataset for Large Language Model Evaluation

RJUA‑QA is a newly released, large‑scale urology question‑answer dataset constructed from virtual patient records based on clinical experience, featuring 2,132 QA pairs with extensive context, designed to benchmark and improve large language models’ medical reasoning, diagnosis, and treatment recommendation capabilities.

Dataset · QA dataset · Urology
0 likes · 12 min read
Kuaishou Tech
Oct 16, 2023 · Artificial Intelligence

Top 5 CIKM 2023 Papers on Recommender Systems, Search & Datasets

The article highlights five CIKM 2023 papers covering a lightweight model‑compression framework for recommender systems, a query‑dominant user‑interest network for large‑scale search ranking, a causal watch‑time labeling approach for short‑video recommendation, implicit negative‑feedback optimization for short‑video feeds, and the KuaiSAR unified search‑and‑recommendation dataset, each with download links, author lists, and key findings.

Dataset · Kuaishou · model compression
0 likes · 12 min read
Kuaishou Tech
Sep 26, 2023 · Artificial Intelligence

Cross-Domain Product Representation (COPE): A Large-Scale Dataset and Baseline Model for Rich‑Content E‑Commerce

The paper introduces ROPE, the first large‑scale cross‑domain product recognition dataset covering detail pages, short videos and live streams, and proposes COPE, a dual‑tower multimodal model that learns unified product embeddings using contrastive and classification losses, achieving superior retrieval and few‑shot classification performance across domains.

Dataset · Deep Learning · contrastive learning
0 likes · 13 min read
AntTech
Sep 21, 2023 · Artificial Intelligence

AFAC2023 Financial Intelligence Challenge Highlights and the Release of the Fin‑Eval Dataset

The inaugural AFAC2023 Financial Intelligence Challenge, co‑organized by the China Computer Federation and Ant Group, attracted over 4,700 teams, showcased cutting‑edge AI solutions for finance such as market opinion generation, compliance detection, and pet‑age recognition, and culminated in the public launch of the Fin‑Eval benchmark dataset for financial large‑model evaluation.

AI · Dataset · Fin-Eval
0 likes · 12 min read
DataFunTalk
Sep 21, 2023 · Artificial Intelligence

2023 Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) Overview

The 2023 Chinese Continuous Visual Speech Recognition Challenge (CNVSRC), organized by Tsinghua University and partners, introduces the large-scale CN-CVS dataset, defines single- and multi-speaker lip‑reading tasks, provides baseline Conformer models, and outlines registration, data access, evaluation metrics, and the competition schedule.

AI · Challenge · Conformer
0 likes · 7 min read
Kuaishou Large Model
Jul 7, 2023 · Artificial Intelligence

How HairStep Revolutionizes Single-View 3D Hair Reconstruction

This paper introduces HairStep, a novel intermediate representation combining Strand Maps and Depth Maps, and demonstrates how it reduces domain gap and improves single‑view 3D hair reconstruction accuracy across multiple algorithms, supported by new annotated datasets (HiSa, HiDa) and fair evaluation metrics.

3D hair reconstruction · Computer Vision · Dataset
0 likes · 11 min read
DataFunTalk
Mar 1, 2023 · Artificial Intelligence

ACL 2023 Multi‑lingual Document‑grounded Dialogue Competition Overview

The ACL 2023 Multi‑lingual Document‑grounded Dialogue Competition, hosted by Alibaba DAMO Academy and Nanjing University, introduces the first multilingual document‑dialogue dataset, provides a baseline system, offers a $7,000 prize pool, and invites participants to submit papers to the Doc2dial Workshop for Best Paper awards.

ACL2023 · Dataset · NLP
0 likes · 6 min read
Meituan Technology Team
Feb 23, 2023 · Artificial Intelligence

Food2K: A Large-Scale Food Image Dataset and Progressive Region Enhancement Network

This article reviews the Food2K dataset and the proposed Progressive Region Enhancement Network for large‑scale food image recognition, detailing dataset construction, method design, extensive experiments, ablation studies, visualizations, and future research directions, based on the paper published in IEEE T‑PAMI in 2023.

Computer Vision · Dataset · Fine-Grained Classification
0 likes · 31 min read
DataFunTalk
Feb 10, 2023 · Artificial Intelligence

ICDAR 2023 BDVT-QA Competition: Born Digital Video Text Question Answering

The ICDAR 2023 BDVT-QA competition, organized by Alibaba DAMO Academy, introduces a novel dataset of 1,000 born‑digital video clips for end‑to‑end video text recognition and video text question answering, offering cash prizes, detailed dataset access, and a lineup of leading academic and industry experts.

AI · Dataset · ICDAR
0 likes · 5 min read
DataFunTalk
Feb 2, 2023 · Artificial Intelligence

ICDAR 2023 Competition on Detecting Tampered Text in Images (Jointly Organized by Alibaba Security)

The ICDAR 2023 Competition, co‑hosted by Alibaba Security and leading Chinese universities, offers a large‑scale e‑commerce text‑tampering dataset of 19,000 images, two challenge tracks (detection and localization), generous prize money up to 100,000 RMB, and a detailed schedule for registration, submissions, and final rankings.

AI · Dataset · ICDAR 2023
0 likes · 6 min read
Alimama Tech
Feb 1, 2023 · Artificial Intelligence

CapOnImage: Context-driven Dense Captioning on Images

The paper presents CapOnImage, a novel image‑on‑image captioning task that generates location‑specific decorative text for product images, introduces the 2.1‑million‑image CapOnImage2M dataset, and proposes a mixed‑modality transformer with position‑aware pre‑training and progressive training, achieving superior accuracy and diversity and already deployed in Alibaba’s advertising platforms for measurable business impact.

Context-Aware · Dataset · Deep Learning
0 likes · 9 min read
Alimama Tech
Feb 1, 2023 · Artificial Intelligence

Video Object of Interest Segmentation (VOIS): Task, Dataset, and Dual-Path Transformer Approach

The paper presents Video Object of Interest Segmentation (VOIS), a new e‑commerce task that locates and segments video instances matching a given product image, introduces the LiveVideos dataset of 2,418 Taobao live‑stream clips, and proposes a dual‑path Swin‑Transformer with cross‑fusion modules that outperforms existing VOS/VIS baselines.

Dataset · Transformer · instance segmentation
0 likes · 11 min read
Kuaishou Tech
Dec 26, 2022 · Artificial Intelligence

ICDAR 2023-DSText Video Text Reading Competition Overview

The ICDAR 2023-DSText competition, launching on February 15, 2023, focuses on dense and small text detection and recognition in video, providing a YouTube‑sourced dataset of 100 videos, two challenge tasks, a detailed timeline, eligibility rules, and a list of international sponsoring institutions.

Computer Vision · Dataset · ICDAR
0 likes · 6 min read
DataFunTalk
Jul 8, 2022 · Artificial Intelligence

Civil Aviation QA Competition (CCL2022‑DQAB): Task Description, Data, Evaluation Metrics, and Prizes

The CCL2022‑DQAB competition, organized by Beihang University and AVIC Mobile Technology, invites participants to develop reading‑comprehension models for extracting accurate question‑answer pairs from civil aviation texts, offering detailed task definitions, evaluation criteria, dataset statistics, a prize structure, and a competition schedule.

AI · Civil Aviation · Dataset
0 likes · 5 min read
ITPUB
Jun 25, 2022 · Big Data

How Spark SQL’s Catalyst Optimizer Accelerates Big Data Queries

This article explains Apache Spark’s role in large‑scale data processing, traces the evolution from Shark to Spark SQL’s DataFrame and Dataset APIs, and details the internal Catalyst optimizer—including its rule‑based and cost‑based strategies—through step‑by‑step examples and code snippets.

Catalyst · Dataset · SQL
0 likes · 11 min read
Xiaohongshu Tech REDtech
Jun 20, 2022 · Artificial Intelligence

Action Sequence Verification in Videos with CosAlignment Transformer (CAT)

The paper introduces Action Sequence Verification (ASV), a task that determines whether two videos follow the same ordered actions, and provides the Chemical Sequence Verification dataset along with re‑annotated COIN‑SV and Diving48‑SV sets. It proposes the CosAlignment Transformer (CAT), which combines intra‑step feature extraction, a Transformer‑based inter‑step encoder, and a sequence‑alignment loss; CAT outperforms prior baselines and serves as a pre‑training model for video retrieval and classification.

Action Verification · Computer Vision · Dataset
0 likes · 7 min read
Youku Technology
May 18, 2022 · Artificial Intelligence

Subjective and Objective Quality of Experience of Free Viewpoint Videos – Paper Overview

This IEEE TIP paper presents a large‑scale subjective‑objective study of Free Viewpoint Video quality, introducing a cost‑saving two‑stage labeling workflow, a sparse‑frame benchmark model, and publicly releasing the dataset and code, with contributions from Alibaba’s Moku Lab and Jiangxi University researchers.

Computer Vision · Dataset · Free Viewpoint Video
0 likes · 5 min read
Big Data Technology & Architecture
Dec 28, 2021 · Big Data

Comprehensive Guide to Spark SQL: Concepts, DataSet/DataFrame, Functions, Optimization and Common Pitfalls

This article provides an in‑depth overview of Spark SQL, covering its architecture, DataSet/DataFrame creation, DSL and SQL usage, integration with Hive, custom UDF/UDAF/Aggregator implementations, handling of small files, Cartesian product detection, and a catalog of useful built‑in functions and window operations.

Big Data · Dataset · Hive
0 likes · 29 min read
Meituan Technology Team
Oct 14, 2021 · Artificial Intelligence

LargeFineFoodAI: ICCV 2021 Food Vision Seminar and Challenge Overview

The ICCV 2021 LargeFineFoodAI online seminar, co‑hosted by Meituan Vision Intelligence, CAS Computing Institute, Beijing Zhiyuan and the University of Barcelona, featured talks on personalized food models, multimedia dietary guidance, and uncertainty‑aware recognition, and introduced a 1,000‑category, 500k‑image dataset. Its challenge drew 143 teams competing in fine‑grained food recognition and retrieval, with top‑ranked entries from Joyy‑cv, NJUST‑PCALab, OPPO, DeepBlueAI and others.

AI · Challenge · Dataset
0 likes · 6 min read
DataFunTalk
Sep 6, 2021 · Artificial Intelligence

Medical NLP at Alibaba: Data, Algorithms, and Knowledge for Smart Healthcare

This article reviews Alibaba Cloud senior algorithm expert Chen Mosha's presentation on medical NLP, covering Alibaba's healthcare business, data types, electronic medical record quality inspection, span‑based and nested NER models, term normalization, clinical trial outcome prediction, knowledge‑enhanced language models, and the CBLUE benchmark dataset.

Alibaba Cloud · Clinical Trial Prediction · Dataset
0 likes · 22 min read
Big Data Technology & Architecture
Aug 15, 2021 · Big Data

Spark SQL Interview Guide: Concepts, APIs, Optimization and Common Pitfalls

This article provides a comprehensive overview of Spark SQL, covering its architecture, DataSet/DataFrame APIs, code examples for creating and querying datasets, join strategy selection, handling Hive tables, small‑file issues, inefficient NOT‑IN subqueries, Cartesian products, and a catalog of useful built‑in functions.

Dataset · Hive Integration · Performance Optimization
0 likes · 40 min read
Meituan Technology Team
Jun 24, 2021 · Artificial Intelligence

CCKS 2021 Life Service Domain Knowledge Graph Question Answering Competition

The CCKS 2021 Life Service Domain Knowledge Graph QA competition challenges participants to build Chinese question‑answering systems that retrieve factual answers from a combined OpenKG and Meituan life‑service graph, covering tasks such as entity recognition, relation extraction, and semantic parsing. Registration runs from May to July, with cash prizes of up to ¥20,000 and internship offers for top student teams.

Dataset · KBQA · artificial intelligence
0 likes · 6 min read
Meituan Technology Team
Jun 3, 2021 · Artificial Intelligence

LargeFineFoodAI Workshop and Challenge at ICCV 2021

At ICCV 2021 in Montreal, the LargeFineFoodAI workshop—co‑organized by Meituan Vision Intelligence Center, the Chinese Academy of Sciences, Beijing Zhiyuan and the University of Barcelona—will showcase state‑of‑the‑art fine‑grained food image research, feature invited speakers Jain, Aizawa and Radeva, and host a $12,000 prize challenge on Food2K across recognition and retrieval tracks.

Challenge · Computer Vision · Dataset
0 likes · 7 min read
iQIYI Technical Product Team
Oct 16, 2020 · Artificial Intelligence

Cartoon Face Recognition: Introducing the iCartoonFace Benchmark Dataset

iQIYI’s ACM Multimedia‑accepted paper unveils iCartoonFace, the world’s largest manually annotated cartoon‑face dataset—over 5,000 characters and 400,000 real‑scene images—accompanied by a semi‑automatic collection pipeline and multi‑person training framework, now powering AI services, large‑scale contests and accelerating cartoon‑character recognition research.

Cartoon Face Recognition · Computer Vision · Dataset
0 likes · 4 min read
58 Tech
Aug 3, 2020 · Artificial Intelligence

Intelligent Customer Service Competition: Leveraging AI for Text Matching and Classification

This announcement describes the rise of AI‑driven intelligent customer service, highlights 58.com’s long‑standing system, and introduces a competition that provides real‑world data for participants to develop advanced text‑matching and classification models using state‑of‑the‑art NLP techniques.

Dataset · NLP · artificial intelligence
0 likes · 3 min read
58 Tech
Jul 22, 2020 · Artificial Intelligence

Intelligent Customer Service Competition: Leveraging AI for Text Matching and Classification

The announcement introduces an AI‑driven intelligent customer service competition, highlighting the importance of text matching and classification in NLP, describing 58.com’s existing system, providing a real‑world dataset, and inviting participants to develop precise models using the latest deep‑learning techniques.

AI · Dataset · NLP
0 likes · 4 min read
Alibaba Cloud Developer
Jul 13, 2020 · Artificial Intelligence

Master Dynamic Road Condition Analysis with Car Video – AMAP-TECH Competition Overview

The AMAP-TECH algorithm competition invites participants to develop AI models that analyze in-vehicle video sequences to determine dynamic road conditions, offering detailed dataset specifications, evaluation metrics, expert judges, schedule, and prize information for researchers in computer vision and traffic analytics.

AI · Computer Vision · Dataset
0 likes · 9 min read
Amap Tech
Jul 9, 2020 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road‑Condition Analysis from In‑Vehicle Video Images

Alibaba Amap’s AMAP‑TECH competition invites participants to develop AI computer‑vision models that classify real‑time road conditions—smooth, slow, or congested—from short sequences of dash‑cam images, using a labeled dataset of 1,500 training sequences and a weighted F1‑score evaluation, with cash prizes up to ¥60,000.

AI · Computer Vision · Dataset
0 likes · 8 min read
DataFunTalk
Feb 14, 2020 · Artificial Intelligence

OpenKG COVID‑19 Knowledge Graphs: Datasets, Schemas, and Applications

The OpenKG initiative, together with dozens of university and industry partners, has released a series of open‑source COVID‑19 knowledge graphs—including encyclopedia, research, clinical, hero, hotspot‑event, and upcoming prevention and resource graphs—detailing their data sources, scale, schema designs, and potential AI‑driven applications such as semantic search and intelligent question answering.

AI · COVID-19 · Dataset
0 likes · 11 min read
iQIYI Technical Product Team
Jul 5, 2019 · Artificial Intelligence

iQIYI Multimodal Person Recognition Competition: 91.14% Accuracy Achieved by BUPT Team

After a three‑month contest co‑hosted by iQIYI and ACM MM, 255 teams competed on the challenging iQIYI‑VID‑2019 multimodal dataset, and the BUPT Automation School team won with a 91.14% person‑recognition accuracy, advancing the field and enhancing iQIYI’s video recommendation and AI services.

AI competition · Computer Vision · Dataset
0 likes · 6 min read
Didi Tech
Jun 22, 2019 · Artificial Intelligence

Didi’s Achievements and Innovations at CVPR 2019 AI City Challenge

At CVPR 2019, Didi’s technology team co‑hosted an autonomous‑driving workshop, showcased the D²‑City dataset, and secured second place in the AI City Challenge by introducing a modular multi‑camera tracking framework, a CNN‑based single‑camera tracker, and a staged aggregation strategy, while outlining its hybrid dispatch commercial plan.

AI City Challenge · CVPR · Dataset
0 likes · 6 min read
Youku Technology
Apr 11, 2019 · Artificial Intelligence

YOUKU-VSRE 2019 Video Enhancement and Super-Resolution Challenge Announcement

The YOUKU‑VSRE 2019 challenge invites researchers to develop state‑of‑the‑art video enhancement and super‑resolution models using the largest, most diverse simulated‑noise dataset, with three competition stages (preliminary, semi‑final, final), cash prizes up to ¥100,000, certificates, and fast‑track recruitment opportunities at Alibaba (Youku).

AI challenge · Computer Vision · Dataset
0 likes · 3 min read
Alibaba Cloud Developer
Dec 20, 2018 · Big Data

Unlocking Alibaba’s Massive Cluster Data V2018: A Treasure Trove for Big‑Data Research

Alibaba has released the comprehensive Cluster Data V2018 dataset, detailing eight days of operation for 4,000 servers and their mixed online and offline workloads, including DAG information, enabling researchers to study large‑scale data‑center performance, resource utilization, scheduling algorithms, and derive new insights.

Big Data · DAG · Dataset
0 likes · 7 min read
Meituan Technology Team
Sep 6, 2018 · Industry Insights

Inside the 2018 AI Challenger: Datasets, Tracks, and Real‑World Impact

The 2018 AI Challenger, co‑hosted by Meituan, Innovation Works, Sogou and Meitu, launched with over 3 million RMB in prizes, featured two flagship tracks—fine‑grained restaurant review sentiment analysis and autonomous‑driving visual perception—offering massive new datasets, multi‑task learning challenges, and concrete applications that illustrate how AI can reshape everyday services.

AI competition · Dataset · Meituan
0 likes · 12 min read
Meitu Technology
Aug 30, 2018 · Artificial Intelligence

Meitu Introduces a Multi‑Label Short‑Video Classification Dataset for the 2018 AI Challenger

In the 2018 AI Challenger, Meitu co‑organized a new “Real‑Time Short‑Video Classification” track and released the industry’s first multi‑label short‑video dataset of 200,000 mobile‑captured, vertically oriented videos spanning 63 categories and detailed tags for subjects, scenes, actions, and other dimensions, advancing video semantic understanding and AI research.

AI challenge · Dataset · Meitu
0 likes · 5 min read
High Availability Architecture
Jul 14, 2017 · Artificial Intelligence

Facial Emotion Recognition Using Convolutional Neural Networks: Dataset, Model Architecture, and Evaluation

This article presents a deep‑learning approach for recognizing seven basic human facial expressions using a balanced FER2013 dataset, describes a CNN built with Keras and OpenCV‑based preprocessing, reports training on an AWS GPU, and analyzes validation results and visualizations.

AWS GPU · CNN · Computer Vision
0 likes · 11 min read