Tagged articles

zero-shot learning

17 articles

Jun 17, 2026 · Artificial Intelligence

Deterministic Video Depth (DVD): Open‑Source Framework Achieves Zero‑Shot SOTA

The DVD framework converts a pretrained video diffusion model into a deterministic, single‑pass video depth estimator, eliminating random sampling artifacts, preserving geometric and semantic priors, and reaching zero‑shot state‑of‑the‑art performance with 163× less training data.

HKUST · computer vision · deterministic inference

0 likes · 5 min read


Data Party THU

May 9, 2026 · Artificial Intelligence

NOSE: Enabling AI to Smell with a Unified Molecule‑Receptor‑Semantic Tri‑modal Representation

NOSE introduces a neural olfactory‑semantic embedding that unifies molecular structure, receptor sequences, and natural‑language odor descriptions into a continuous space, achieving state‑of‑the‑art results on eleven tasks and strong zero‑shot generalization for odor and receptor retrieval.

Deep Learning · contrastive learning · molecular design

0 likes · 8 min read


Machine Heart

Apr 8, 2026 · Artificial Intelligence

Retrieval-Augmented Affordance Prediction Enables Zero-Shot Fine-Grained Robot Manipulation

The RAAP framework decouples the static contact point from the dynamic action direction and uses retrieval‑augmented inference to achieve zero‑shot, cross‑category fine‑grained robot manipulation with only a few training examples per task.

DROID dataset · Franka robot · HOI4D

0 likes · 10 min read


Data Party THU

Mar 25, 2026 · Artificial Intelligence

How Knowledge‑Guided Context Optimization Boosts Zero‑Shot Vision‑Language Models

The article analyzes the Base‑to‑New generalization problem of CLIP‑based vision‑language models, explains why standard prompt tuning (CoOp) forgets base knowledge, and presents the KgCoOp framework, which adds a knowledge‑guided loss that keeps learned prompts close to hand‑crafted ones, improving unseen‑class performance while preserving efficiency.

CLIP · Knowledge-guided Optimization · Prompt Tuning

0 likes · 12 min read
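
As a rough illustration of the knowledge‑guided loss summarized above, the Python sketch below adds a penalty that pulls each learned class text embedding toward its hand‑crafted counterpart. The function names, the squared Euclidean distance, and the weight `lam` are assumptions made for this example; they are not details taken from the article.

```python
import math


def cross_entropy(logits, label):
    """Standard softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def knowledge_guided_penalty(learned, handcrafted):
    """Mean squared Euclidean distance between learned and hand-crafted
    class text embeddings (distance choice is an assumption of this sketch)."""
    per_class = [
        sum((a - b) ** 2 for a, b in zip(w, w0))
        for w, w0 in zip(learned, handcrafted)
    ]
    return sum(per_class) / len(per_class)


def kgcoop_style_loss(logits, label, learned, handcrafted, lam=1.0):
    """Task loss plus a weighted pull toward the hand-crafted prompts.
    `lam` is a hypothetical weight; larger values keep prompts closer."""
    penalty = knowledge_guided_penalty(learned, handcrafted)
    return cross_entropy(logits, label) + lam * penalty
```

With `lam=0.0` the loss reduces to plain prompt tuning; raising `lam` trades some base‑class fit for prompts that stay closer to the hand‑crafted ones.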


DeepHub IMBA

Mar 23, 2026 · Artificial Intelligence

How KgCoOp Uses Knowledge‑Guided Context Optimization to Prevent Prompt Tuning Forgetting

The article analyzes why standard prompt tuning (CoOp) causes catastrophic forgetting in vision‑language models, introduces the KgCoOp framework that adds a knowledge‑guided loss to regularize prompts, and shows through extensive experiments on 11 benchmarks that KgCoOp improves unseen‑class accuracy, harmonic mean, and efficiency. It also discusses trade‑offs and limitations.

Catastrophic Forgetting · Knowledge-guided Optimization · Prompt Tuning

0 likes · 11 min read


AI Algorithm Path

Feb 17, 2026 · Artificial Intelligence

Why Contrastive Learning Is the Core Foundation of Visual Language Models

The article explains how contrastive learning replaces fixed‑category visual training with a relationship‑based approach, detailing the dual‑encoder architecture, cosine similarity loss, batch scaling, temperature control, zero‑shot capabilities, scalability from web data, and the method's strengths and limitations in modern multimodal AI.

CLIP · Multimodal AI · contrastive learning

0 likes · 25 min read
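
The dual‑encoder objective described in this summary can be sketched in a few lines of plain Python. This is a minimal illustration of a symmetric contrastive loss over cosine similarities, assuming the image and text embeddings have already been produced by some encoders. The function names and the default temperature of 0.07 are illustrative choices, not values taken from the article.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_logits(image_embs, text_embs, temperature):
    """Pairwise cosine similarities divided by the temperature."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row k's correct column is k (the matched pair)."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image losses over one batch."""
    logits = cosine_logits(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(transposed))


def zero_shot_classify(image_emb, class_text_embs):
    """Pick the class whose text embedding is most similar to the image."""
    scores = cosine_logits([image_emb], class_text_embs, 1.0)[0]
    return max(range(len(scores)), key=scores.__getitem__)
```

Every other item in the batch acts as a negative, which is why larger batches give the loss more to discriminate against. Zero‑shot classification then reuses the same similarity computation, with text embeddings of class descriptions standing in for a trained classifier head.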


Xiaomi Tech

Feb 5, 2026 · Artificial Intelligence

TacRefineNet: A Tactile‑Driven Model for Millimeter‑Precision Robotic Grasp Refinement

TacRefineNet leverages high‑resolution tactile sensors, multimodal fusion of fingertip touch and proprioception, and a goal‑conditioned refinement network to achieve millimeter‑level grasp adjustments without vision or 3D models, demonstrating zero‑shot deployment and robust generalization across diverse automotive‑factory parts in both simulation and real‑world tests.

grasp refinement · multimodal fusion · robotic manipulation

0 likes · 8 min read


KooFE Frontend Team

Dec 13, 2025 · Artificial Intelligence

Unlocking LLM Reasoning: Advanced Chain‑of‑Thought Prompting Techniques Explained

This article explains how Chain‑of‑Thought prompting and its variants—zero‑shot CoT, Thread of Thought, Tabular CoT, Analogical Prompting, and Step‑back Prompting—enable large language models to perform multi‑step reasoning by breaking problems into intermediate steps, with practical prompts, examples, and implementation details.

Chain-of-Thought · reasoning · zero-shot learning

0 likes · 12 min read


Data Party THU

Aug 10, 2025 · Artificial Intelligence

Can Evolutionary Algorithms Auto-Design Training-Free Vision-Language Model Adaptations?

This study introduces EvoVLMA, an evolutionary vision-language model adaptation framework that automatically searches for training-free VLM adaptation algorithms using a two-stage LLM-guided evolution. It reports superior performance, including a 1.91% accuracy gain on 8-shot image classification, and the code is publicly released.

Evolutionary Algorithms · LLM · Model Adaptation

0 likes · 5 min read


DataFunTalk

Jul 16, 2025 · Artificial Intelligence

How Jason Wei’s Breakthroughs Are Shaping the Future of Large Language Models

Jason Wei, a former Google Brain and OpenAI researcher now at Meta, has driven key advances in large language models—including chain‑of‑thought prompting, instruction tuning, emergent abilities, zero‑shot learning, and data augmentation—shaping both AI research paradigms and real‑world applications.

Chain-of-Thought · Instruction Tuning · Large Language Models

0 likes · 7 min read


AIWalker

Jun 18, 2025 · Artificial Intelligence

SeNaTra: Nvidia’s Spatial Grouping Layer Pushes Semantic Segmentation Past Swin Transformer

Nvidia introduces SeNaTra, a native‑segmentation vision transformer that replaces uniform down‑sampling with a content‑aware spatial grouping layer, delivering superior zero‑shot and supervised segmentation performance while cutting parameters and FLOPs compared with Swin Transformer and other backbones.

NVIDIA · Semantic Segmentation · Vision Transformer

0 likes · 29 min read


DataFunSummit

May 23, 2024 · Artificial Intelligence

GraphGPT: Enabling Large Language Models as Zero‑Shot Graph Learners

GraphGPT integrates large language models with graph neural networks by introducing graph tokens and instruction tuning, enabling zero‑shot graph learning for tasks such as node classification and link prediction, and demonstrates superior performance and generalization across supervised and zero‑shot benchmarks.

GraphGPT · Instruction Tuning · zero-shot learning

0 likes · 15 min read


NewBeeNLP

Mar 26, 2024 · Artificial Intelligence

How OpenGraph Enables Zero‑Shot Graph Learning Across Datasets

OpenGraph introduces a zero‑shot graph learning framework that unifies graph tokenization, a scalable transformer with efficient sampling, and LLM‑driven data augmentation, achieving superior cross‑dataset generalization on node classification and link prediction tasks, as demonstrated by extensive experiments.

Graph Neural Networks · LLM data augmentation · graph tokenization

0 likes · 20 min read


DataFunTalk

Nov 24, 2023 · Artificial Intelligence

Open Vocabulary Detection Contest 2023: Summary of Winning Teams' Technical Solutions

The article reviews the Open Vocabulary Detection Contest organized by the Chinese Society of Image and Graphics and 360 AI Institute, describing the competition setup, dataset characteristics, and detailed winning approaches that combine Detic, CLIP, prompt learning, and multi‑stage pipelines to achieve strong few‑shot and zero‑shot object detection performance.

CLIP · Open-Vocabulary Detection · competition

0 likes · 17 min read


DataFunTalk

Feb 16, 2023 · Artificial Intelligence

Fine‑Grained Entity Recognition in Tencent TexSmart: System Overview and Key Techniques

This article presents an in‑depth overview of Tencent's TexSmart natural‑language understanding system, highlighting its fine‑grained NER capabilities, knowledge‑base combination methods, distant supervision via similar entities, multi‑source zero‑shot fusion, experimental results, and practical insights from a recent NLP summit.

Entity Typing · Fine-grained NER · TexSmart

0 likes · 12 min read


DataFunSummit

May 9, 2022 · Artificial Intelligence

TextToKnowledge (解语): Zero‑Shot Chinese Text Knowledge Annotation and Mining Framework

The article introduces TextToKnowledge, an open‑source Baidu platform that provides a unified Chinese term taxonomy (TermTree) and two annotation tools (WordTag and NPTag) to enable zero‑shot text labeling, term linking, and downstream knowledge‑mining applications for various NLP tasks.

Chinese NLP · Knowledge Graph · PaddleNLP

0 likes · 25 min read


NetEase Cloud Music Tech Team

Apr 27, 2022 · Artificial Intelligence

How Model-Agnostic Interest Learning (MAIL) Solves Cold‑Start in Recommender Systems

This paper introduces MAIL, a model‑agnostic dual‑tower framework that pairs a zero‑shot learning tower, which generates virtual user behaviors for new users, with an embedding‑based ranking tower, achieving a 13–15% CTR lift in large‑scale live‑stream recommendation at NetEase Cloud Music.

Embedding · cold-start · dual-tower

0 likes · 33 min read
