Tagged articles

self-supervised learning

77 articles · Page 1 of 1

Jun 23, 2026 · Artificial Intelligence

Can VLA‑JEPA Achieve Robust Vision‑Language‑Action with Few Robot Trajectories and Lots of Human Video?

The article analyzes VLA‑JEPA, a JEPA‑style pre‑training framework that combines limited robot trajectories with abundant human video to build a latent world model for Vision‑Language‑Action tasks, showing improved robustness and high success rates across simulated and real‑robot benchmarks.

VLA-JEPA · benchmark · latent world modeling

0 likes · 12 min read

HyperAI Super Neural

Jun 10, 2026 · Artificial Intelligence

Pixel‑Level Foundation Model for Earth Observation Sets New SOTA Across Tasks, Excelling with Sparse Labels

A joint team from Cambridge, Aalto and Bristol introduces TESSERA, a pixel‑level remote‑sensing foundation model that leverages a Barlow‑Twins self‑supervised scheme and a novel d‑pixel data organization to achieve state‑of‑the‑art accuracy on classification, segmentation and regression tasks, especially when annotations are scarce.

Sentinel-1 · Sentinel-2 · d-pixel

0 likes · 12 min read

Huolala Tech

Jun 3, 2026 · Artificial Intelligence

Three Breakthroughs Driving the Rapid Rise of Computer Vision

The article reviews three major recent breakthroughs in computer vision—self‑supervised visual foundation models, feed‑forward 3D reconstruction, and unified multimodal models—detailing their underlying methods, key papers, performance characteristics, and practical implications for real‑world AI applications.

3D reconstruction · computer vision · multimodal models

0 likes · 22 min read

Machine Heart

May 27, 2026 · Artificial Intelligence

CVPR 2026: Learning Camera Pose from 10M Unlabeled Driving Videos

LA‑Pose shows that a model can acquire accurate camera pose estimation for autonomous driving by self‑supervised pretraining on roughly ten million unlabeled driving video clips and fine‑tuning with only a small amount of high‑quality 3D annotations, achieving over 10% accuracy gains while drastically reducing labeling cost.

CVPR 2026 · LA-Pose · autonomous driving

0 likes · 8 min read

Weekly Large Model Application

May 5, 2026 · Artificial Intelligence

What Pretraining Actually Teaches: Listening to All Sounds

The article explains that pretraining for speech models functions like a broad liberal‑arts education, teaching universal acoustic and linguistic patterns through next‑token prediction, joint audio‑text training, and masked‑prediction or contrastive objectives, while clarifying common misconceptions and highlighting data bias and the need for clean, task‑specific fine‑tuning.

audio-text alignment · data bias · fine-tuning

0 likes · 6 min read

Weekly Large Model Application

May 5, 2026 · Artificial Intelligence

How Audio Waveforms Are Turned Into Model‑Readable Tokens

The article explains why raw audio cannot be fed directly to language models, outlines the two essential compression steps, compares three common tokenization approaches—neural codecs, self‑supervised clustering, and continuous vectors—and warns of typical pitfalls for newcomers.

Large Language Models · audio tokenization · neural codecs

0 likes · 6 min read

Weekly Large Model Application

May 1, 2026 · Artificial Intelligence

How Speech Models Turn Waveforms into Computable Tokens

The article explains why speech tokenization is essential for large audio models, outlines three core challenges, compares five major tokenization paradigms—including neural codecs with vector quantization, self‑supervised learning with clustering, continuous embeddings, ASR‑derived text tokens, and hierarchical multi‑codebook tokens—and provides practical guidance for selecting the right approach based on task requirements and trade‑offs.

audio codec · hierarchical tokens · self-supervised learning

0 likes · 11 min read

AgentGuide

Apr 19, 2026 · Artificial Intelligence

Understanding the Key Differences Between Large Model Pretraining and Fine‑Tuning

The article explains how pretraining on massive generic data creates a reusable base model, while fine‑tuning uses smaller, high‑quality task‑specific data to adapt the model, covering objectives, data scale, cost, methods, and why most projects prefer fine‑tuning.

Large Language Model · LoRA · PEFT

0 likes · 6 min read

Bighead's Algorithm Notes

Apr 14, 2026 · Artificial Intelligence

How Self‑Supervised HINTS Extracts Human Insights from Time Series to Boost Forecast Accuracy

The paper introduces HINTS, a two‑stage self‑supervised framework that leverages Friedkin‑Johnsen opinion dynamics to mine latent human‑driven factors from time‑series residuals, integrates them via attention into state‑of‑the‑art predictors, and demonstrates consistent accuracy gains and interpretability across nine benchmark and real‑world datasets.

Attention Mechanism · Friedkin-Johnsen model · Time Series Forecasting

0 likes · 17 min read

AIWalker

Mar 21, 2026 · Artificial Intelligence

Re‑annotating ImageNet: 1.28 M Images Gain Multi‑Labels, Boosting COCO mAP by 4 Points

A Rochester research team automatically relabeled the entire 1.28 M‑image ImageNet training set with multi‑labels using self‑supervised object discovery and a lightweight region classifier, resulting in a pretrained model that raises COCO mAP by 4.2 points and VOC mAP by 2.3 points.

ImageNet · dataset relabeling · model performance

0 likes · 6 min read

PaperAgent

Mar 1, 2026 · Artificial Intelligence

How On-Policy Context Distillation Enables LLMs to Retain Experience Forever

On-Policy Context Distillation (OPCD) compresses transient in‑context knowledge into LLM parameters, allowing models to permanently retain problem‑solving experience without ground‑truth labels; the article details the OPCD framework, training steps, teacher‑student configurations, and experimental results on math, games, and system‑prompt tasks, highlighting its advantages over traditional context distillation.

Artificial Intelligence · Knowledge Distillation · LLM

0 likes · 8 min read

Machine Learning Algorithms & Natural Language Processing

Feb 10, 2026 · Artificial Intelligence

LeCun Team’s Triple Breakthrough: Sparse Representations, Gradient Planning, and Lightweight JEPA for World Models

LeCun’s three new papers—Rectified LpJEPA, GRASP, and EB‑JEPA—address dense feature bottlenecks, inefficient gradient‑free planning, and heavyweight codebases by introducing sparsity‑preserving regularization, a parallel gradient‑based planner, and a lightweight modular library, delivering high‑performance world‑model representations that run on a single GPU.

AI research · JEPA · gradient planning

0 likes · 11 min read

AI Cyberspace

Jan 18, 2026 · Artificial Intelligence

Understanding Supervised, Unsupervised, Self‑Supervised, Semi‑Supervised, and Reinforcement Learning for Large Language Model Training

The article explains various learning paradigms (supervised, unsupervised, self‑supervised, semi‑supervised, and reinforcement), describes dataset types and quality considerations, outlines preprocessing steps like filtering, deduplication, and tokenization, and discusses scaling laws linking model size, data volume, and compute resources, with concrete examples and code.

Data preprocessing · Model Training · machine learning

0 likes · 26 min read

Bighead's Algorithm Notes

Sep 1, 2025 · Artificial Intelligence

How MERA’s Retrieval‑Augmented MoE Boosts Stock Selection Performance by 11%

The article introduces MERA, a Retrieval‑Augmented Mixture‑of‑Experts module that addresses the inability of single‑branch deep‑learning models to capture diverse stock market patterns, describes its self‑supervised pretraining, gating and expert mechanisms, and shows that it improves stock‑selection metrics by up to 11% on major Chinese indices.

MERA · Mixture of Experts · Retrieval Augmented Representation

0 likes · 14 min read

AI Algorithm Path

Aug 16, 2025 · Artificial Intelligence

Meta Unveils DINOv3: A Universal Self‑Supervised Visual AI for All Image Tasks

Meta's DINOv3 is a 7‑billion‑parameter self‑supervised visual foundation model trained without any labels on data curated from a pool of about 17 billion Instagram images, introducing dense feature extraction, Gram‑Anchoring to prevent feature collapse, high‑resolution adaptation, and multi‑student distillation that together enable out‑of‑the‑box performance on segmentation, depth estimation, 3D matching, and tracking while surpassing prior models such as DINOv2, CLIP, and SAM.

DINOv3 · Gram Anchoring · Large‑Scale Training

0 likes · 8 min read

Amap Tech

Jul 11, 2025 · Artificial Intelligence

Unified Self‑Supervised Pretraining Accelerates Image Generation and Improves Understanding

The USP framework introduces masked latent modeling within a VAE space to pre‑train ViT encoders, enabling seamless weight transfer to image classification, segmentation, and diffusion‑based generation tasks, dramatically speeding up DiT and SiT models while preserving strong visual representations.

Diffusion Models · VAE · ViT³

0 likes · 13 min read

Amap Tech

Jul 11, 2025 · Artificial Intelligence

Unified Self‑Supervised Pretraining Boosts Image Generation and Understanding

The USP framework introduces masked latent modeling within a VAE space to pretrain ViT encoders, enabling seamless weight transfer to both image classification and diffusion‑based generation tasks, dramatically accelerating training while preserving strong performance across multiple benchmarks.

Diffusion Models · Vision Transformer · image generation

0 likes · 10 min read

AI Frontier Lectures

Jul 10, 2025 · Artificial Intelligence

Can Dispersive Loss Supercharge Diffusion Models Without Extra Pre‑training?

Dispersive Loss is a plug‑and‑play regularization technique that enhances diffusion‑based generative models by encouraging dispersed internal representations. It requires no additional pre‑training, parameters, or data, and extensive experiments show consistent improvements across model sizes and configurations.

Dispersive Loss · Regularization · contrastive learning

0 likes · 18 min read

AI Algorithm Path

Jun 23, 2025 · Artificial Intelligence

Visual Language Model Beginner’s Guide Day 4: Major Contrastive Learning Frameworks

This article surveys six leading contrastive learning frameworks—SimCLR, MoCo, BYOL, SwAV, Barlow Twins, and NNCLR—detailing their loss functions, data‑augmentation pipelines, encoder architectures, and unique mechanisms such as momentum queues, twin networks, clustering swaps, and redundancy reduction, while highlighting their advantages and impact on self‑supervised vision research.

BYOL · Barlow Twins · MoCo

0 likes · 14 min read

AI Algorithm Path

Jun 22, 2025 · Artificial Intelligence

Beginner’s Guide to Visual Language Models – Day 3: Contrastive Learning Loss Functions

This article systematically introduces the most common contrastive learning loss functions—including Contrastive Loss, Triplet Loss, N‑pair Loss, InfoNCE, and Cross‑Entropy—explaining their mathematical formulations, advantages, challenges, and typical applications in visual, textual, and multimodal representation learning.

InfoNCE · Loss Functions · contrastive learning

0 likes · 10 min read

AI Algorithm Path

Jun 20, 2025 · Artificial Intelligence

Beginner’s Guide to Visual Language Models – Day 2: Understanding Contrastive Learning

This article explains contrastive learning for visual language models, covering its definition, four‑step workflow, how to choose positive and negative pairs, the difference between supervised and self‑supervised variants, and why the technique is essential for zero‑shot and cross‑modal capabilities.

contrastive learning · data augmentation · representation learning

0 likes · 6 min read

DataFunTalk

Jun 12, 2025 · Artificial Intelligence

How Meta’s V‑JEPA 2 Is Pushing AI Toward Human‑Like Physical Understanding

Meta’s newly released V‑JEPA 2 introduces a video‑trained world model that can understand, predict, and plan physical actions, enabling zero‑shot robot control and outperforming existing models on benchmarks like IntPhys 2, MVPBench, and CausalVQA, while outlining future directions for hierarchical and multimodal JEPA architectures.

V-JEPA 2 · Video AI · benchmark

0 likes · 8 min read

Data Thinking Notes

Jun 2, 2025 · Artificial Intelligence

Why Pre‑Training Powers Modern AI: From Theory to Real‑World Applications

Pre‑training enables AI models to first acquire a universal knowledge map from massive unlabelled text, then quickly adapt to specific tasks with minimal labelled data, offering superior generalization, reduced annotation costs, and versatile applications across chatbots, content creation, retrieval, coding assistance, and more.

AI Applications · Large Language Models · Transformer

0 likes · 14 min read

DataFunSummit

Jan 13, 2025 · Artificial Intelligence

Deep Learning Approaches for Solving Graph Optimization Problems

This article reviews the use of deep learning, including supervised, reinforcement, and self‑supervised paradigms, to address graph optimization problems such as facility location and balanced graph partitioning, discusses existing research challenges, presents a three‑stage self‑supervised model with graph contrastive pre‑training, and evaluates its performance on synthetic and real‑world datasets.

Deep Learning · Graph Neural Networks · combinatorial optimization

0 likes · 14 min read

DevOps

Dec 19, 2024 · Artificial Intelligence

Yann LeCun Discusses AI, Self‑Supervised Learning, and the Future of AGI

Yann LeCun, in a half‑hour interview with Indian entrepreneur Nikhil Kamath, explains the fundamentals of artificial intelligence, critiques current transformer models, describes self‑supervised learning, outlines his joint‑embedding predictive architecture, and shares his vision for AGI, open‑source ecosystems, and the role of PhDs for AI entrepreneurs.

AGI · Artificial Intelligence · Open Source

0 likes · 16 min read

AntTech

Aug 6, 2024 · Artificial Intelligence

Self‑Supervised Video Copy Localization with Regional Token Representation

The article presents a self‑supervised framework that uses a regional token structure within a Vision Transformer to accurately locate video plagiarism segments, dramatically reducing annotation costs and achieving state‑of‑the‑art performance without manual labeling, while also highlighting its real‑world deployment for copyright protection.

AI · copyright protection · self-supervised learning

0 likes · 5 min read

DataFunSummit

Jul 18, 2024 · Artificial Intelligence

Tencent Music Tianqin Lab’s Practice and Applications of Audio Representation Large Models

This article reviews Tencent Music Tianqin Lab’s research on audio representation large models, covering background, the evolution of audio features, self‑supervised methods such as SimCLR, BYOL, MAE, and MLM, benchmark results, multimodal extensions, and real‑world applications like song authenticity detection and search ranking.

Multimodal AI · Tencent Music · audio representation

0 likes · 20 min read

Ops Development & AI Practice

Jul 8, 2024 · Artificial Intelligence

Essential Denoising Techniques for Training Large AI Models

This article outlines key denoising methods—including data cleaning, augmentation, regularization, adversarial training, and self‑supervised learning—that improve the performance, generalization, and robustness of large neural network and transformer models.

Denoising · Regularization · adversarial training

0 likes · 5 min read

DataFunSummit

Mar 23, 2024 · Artificial Intelligence

Graph Neural Networks for Real-World Complex Scenarios

This article presents a comprehensive overview of recent graph neural network research, covering adversarial representation learning for network embedding, block‑model guided GCN, enhanced class‑discriminative GNNs, self‑supervised contrastive GNNs, experimental results, and conclusions, highlighting their significance in real‑world applications.

GCN · Graph Neural Networks · adversarial learning

0 likes · 13 min read

Rare Earth Juejin Tech Community

Dec 20, 2023 · Artificial Intelligence

BERT Model Overview: Inputs, Encoder, Fine‑tuning, and Variants

This article explains BERT's WordPiece tokenization, input embeddings (token, segment, and position embeddings), encoder architecture for Base and Large models, fine‑tuning strategies for various NLP tasks, and introduces popular variants such as RoBERTa and ALBERT.

BERT · NLP · Transformer

0 likes · 12 min read

DataFunSummit

Dec 5, 2023 · Artificial Intelligence

Scenario-Adaptive and Self-Supervised Multi-Scenario Personalized Recommendation (SASS)

This article presents a comprehensive study of a scenario‑adaptive and self‑supervised multi‑scenario recommendation model (SASS) for Taobao, detailing its motivation, adaptive multi‑scenario architecture, two‑stage pre‑training and fine‑tuning, experimental validation, deployment in the recall stage, and practical challenges addressed through Q&A.

Alibaba · Recommendation Systems · multi‑scenario modeling

0 likes · 36 min read

AntTech

Nov 7, 2023 · Artificial Intelligence

Multi‑Scale Stochastic Distribution Prediction for User Behavior Representation Learning

The paper proposes a multi‑scale stochastic distribution prediction (MSDP) framework that learns robust user behavior representations by predicting behavior distributions over random time windows, incorporates contrastive regularization, and demonstrates superior performance on both proprietary financial risk data and a public e‑commerce dataset compared with existing masked and next‑behavior pre‑training methods.

AI · Multi-Scale · distribution prediction

0 likes · 13 min read

Alimama Tech

Sep 12, 2023 · Artificial Intelligence

Content Collaborative Graph Neural Network for Large‑Scale E‑commerce Search

CC‑GNN addresses three drawbacks of existing graph‑neural retrieval for e‑commerce by adding content phrase nodes, scalable meta‑path message passing, and difficulty‑aware noisy contrastive learning with counterfactual augmentation, achieving up to 16% recall improvement and notably larger gains on long‑tail queries and cold‑start items.

E-commerce Search · Graph Neural Networks · Long Tail

0 likes · 19 min read

Meituan Technology Team

Jul 13, 2023 · Artificial Intelligence

Intelligent Companion Search Guidance for Meituan Waimai: Challenges, Solutions, and Future Directions

Meituan’s delivery team created an intelligent, real‑time companion‑type search guidance system for its waimai platform—combining a smart in‑box word refresh triggered by edge‑intelligence intent signals and a unified multi‑scenario query‑recommendation model with self‑supervised pre‑training and multi‑objective optimization—delivering over 1% DAU growth, 1.7% UV_RPM increase, and up to 187% CTR lifts while outlining future extensions to more pages and large‑model embeddings.

Meituan · food delivery · intelligent companion

0 likes · 31 min read

DataFunSummit

Jun 23, 2023 · Artificial Intelligence

Frontiers of Video Action Recognition: Concepts, Algorithms, and Applications

This article introduces video action recognition, covering its basic definition, downstream tasks, major algorithmic families—including CNN‑based, Vision‑Transformer, self‑supervised, and multimodal approaches—and discusses practical deployment scenarios and open challenges in the field.

CNN · Vision Transformer · multimodal models

0 likes · 16 min read

Meituan Technology Team

Jun 15, 2023 · Artificial Intelligence

Meituan Technical Team's 8 CVPR 2023 Papers: Overview and Insights

This article reviews eight CVPR 2023 papers selected by Meituan’s technology team, covering self‑supervised learning, domain adaptation, federated learning, object detection, 3D reconstruction, GAN‑based pre‑training, RGB‑T tracking, vision‑language navigation, and visual‑textual layout generation, highlighting each work’s methodology, experiments, and reported performance gains.

3D Object Detection · CVPR 2023 · Domain Adaptation

0 likes · 15 min read

DataFunTalk

Apr 10, 2023 · Artificial Intelligence

Scenario-Adaptive and Self-Supervised Multi-Scenario Personalized Recommendation (SASS): Design, Training, and Deployment

This article presents a comprehensive study of multi‑scenario personalized recommendation, introducing a scenario‑adaptive and self‑supervised model (SASS) that jointly addresses data sparsity, domain adaptation, and recall‑stage deployment through a two‑stage training pipeline and extensive experiments on Alibaba’s Taobao platform.

Alibaba · Recommendation Systems · multi‑scenario modeling

0 likes · 36 min read

DataFunTalk

Mar 16, 2023 · Artificial Intelligence

Review of Deep Learning Model Evolution and Future Trends

The article reviews the past six years of deep learning model development, highlighting scaling limits, universality of Transformers, challenges in interpretability and control, and predicts future trends such as efficient architectures, multimodal capabilities, reinforcement learning in virtual worlds, and novel AI hardware, while also promoting a new deep‑learning practice ebook.

AI hardware · AI trends · self-supervised learning

0 likes · 6 min read

DataFunTalk

Feb 25, 2023 · Artificial Intelligence

Review of Deep Learning Model Evolution and Future Trends

The article reviews the historical development of deep learning models, highlights current limitations such as scaling inefficiencies, interpretability, and planning, and outlines future directions including efficient architectures, self‑supervised training, cross‑modal transformers, and the impact of AI on fields like life sciences and finance.

AI trends · Future AI · Transformer

0 likes · 6 min read

DataFunTalk

Feb 20, 2023 · Artificial Intelligence

Review of Deep Learning Model Evolution and Future Trends

The article reviews the historical development of deep learning models, highlighting patterns such as scaling limits, increasing generality, interpretability challenges, planning deficiencies, and hardware constraints, and then outlines future directions including efficient architectures, enhanced capabilities, interdisciplinary applications, virtual agents, and novel AI hardware.

AI trends · Transformer · self-supervised learning

0 likes · 6 min read

DaTaobao Tech

Dec 28, 2022 · Artificial Intelligence

Adaptive Multi-Scenario Modeling for Taobao Personalized Recommendation

On January 9 at 7 p.m., Alibaba senior algorithm engineer Zhang Yuanliang will present a scenario‑adaptive, self‑supervised model for multi‑scenario personalized recommendation, discussing its background, technical details, experimental results, and real‑world deployment within Taobao’s recommendation system.

AI · Alibaba · multi‑scenario modeling

0 likes · 1 min read

DataFunSummit

Dec 23, 2022 · Artificial Intelligence

Data‑Centric AI Practices for Content Moderation at NetEase Yidun

The article presents NetEase Yidun’s data‑centric AI approach to content moderation, covering the background of Data‑Centric AI, the specific business and data challenges of content safety, comprehensive data pipelines—including collection, labeling, augmentation, selection, cleaning, iteration and testing—and the role of self‑, semi‑ and weak‑supervised learning in enhancing algorithm performance.

Algorithm Innovation · Data Management · Data-centric AI

0 likes · 19 min read

Alimama Tech

Dec 7, 2022 · Artificial Intelligence

Adaptive Domain Interest Network for Multi-domain Recommendation

The Adaptive Domain Interest Network (ADIN) introduces a shared backbone with scenario‑specific subnetworks, domain‑specific batch normalization and SE‑Block attention to capture both commonalities and divergences across recommendation scenarios, and, combined with self‑supervised training, consistently outperforms baselines, delivering a 1.8% revenue lift in Alibaba’s display‑ad platform and now runs in production.

Deep Learning · Domain Adaptation · recommendation

0 likes · 12 min read

DataFunTalk

Oct 28, 2022 · Artificial Intelligence

Geometric Graph Neural Networks for Drug Discovery: 3D Structure‑Based Binding Affinity Prediction and Molecular Property Learning

This article presents a comprehensive overview of using geometric graph neural networks on the Baidu PaddleHelix platform to address challenges in drug discovery, including 3D‑structure‑aware protein‑ligand binding affinity prediction, molecular property prediction, and self‑supervised pre‑training, with experimental results showing significant improvements over existing baselines.

Graph Neural Networks · drug discovery · geometric-deep-learning

0 likes · 16 min read

Zhuanzhuan Tech

Oct 28, 2022 · Artificial Intelligence

Contrastive Learning: Definitions, Principles, Classic Algorithms, and Applications in Recommendation Systems

This article introduces contrastive learning, explains its definition, principles, and classic algorithms such as SimCLR and MoCo, and details its practical applications in recommendation systems, including a case study of its deployment at Zhuanzhuan that boosted order rates by over 10%.

AI · contrastive learning · self-supervised learning

0 likes · 12 min read

Alimama Tech

Oct 12, 2022 · Artificial Intelligence

Decoupled Graph Neural Networks for Large-Scale E-commerce Retrieval

Decoupled Graph Neural Networks (DC‑GNN) improve large‑scale e-commerce ad recall by separating graph processing from CTR prediction, using multi‑task pretraining (edge prediction + contrastive learning), efficient deep linear aggregation, and a dual‑tower CTR model, achieving higher efficiency and performance on billion‑scale data.

CTR Prediction · Decoupled Architecture · Graph Neural Networks

0 likes · 15 min read

DataFunTalk

Sep 19, 2022 · Artificial Intelligence

Pretraining Models and Graph Neural Networks for Recommendation Systems

This talk explores the evolution, objectives, and core challenges of pretraining models, their application in recommendation scenarios, service modes, and detailed case studies of graph neural network pretraining, illustrating how self‑supervised learning and multi‑domain data integration enhance user and item embeddings for improved recommendation performance.

Graph Neural Networks · Multi-domain · Recommendation Systems

0 likes · 16 min read

Laiye Technology Team

Sep 9, 2022 · Artificial Intelligence

Graph Convolutional Networks for Intelligent Document Processing: Principles, Feature Engineering, and Applications

This article presents a comprehensive overview of using graph convolutional networks in intelligent document processing, covering basic GCN theory, adjacency matrix construction, feature engineering—including text, image, and handcrafted features—model architecture, self-supervised training, and real-world applications such as semantic entity recognition and relation extraction.

Intelligent Document Processing · graph convolutional networks · relation extraction

0 likes · 14 min read

DataFunTalk

Jun 9, 2022 · Artificial Intelligence

Understanding and Reproducing MAE (Masked AutoEncoder) for Self‑Supervised Vision Learning with EasyCV

This article introduces the MAE (Masked AutoEncoder) self‑supervised learning method, explains its asymmetric encoder‑decoder design and high masking ratio, evaluates its performance, and provides a step‑by‑step guide to reproduce MAE using Alibaba’s EasyCV framework, including code snippets, training tips, and troubleshooting.

EasyCV · MAE · PyTorch

0 likes · 15 min read

DataFunSummit

May 26, 2022 · Artificial Intelligence

Exploring Contrastive Learning in Kuaishou Recommendation Systems

This article presents a comprehensive overview of how contrastive learning can alleviate data sparsity and distribution bias in recommendation systems, detailing its theoretical advantages, recent research progress in computer vision and NLP, and a multi‑task self‑supervised framework applied to Kuaishou's short‑video ranking pipeline with significant offline and online performance gains.

AI · Kuaishou · Recommendation Systems

0 likes · 21 min read

DaTaobao Tech

May 17, 2022 · Artificial Intelligence

Self-Supervised Learning for Image Embeddings in Recommendation Systems: SwAV and M6 Applications at Meiping Meiwu

The paper demonstrates how self‑supervised models SwAV and M6 generate high‑quality image and multimodal embeddings for Meiping Meiwu’s recommendation system, delivering notable gains in scene/style consistency, ranking AUC, classification and retrieval performance, especially for cold‑start items, and achieving measurable production lifts.

A/B testing · M6 multimodal · Recommendation Systems

0 likes · 15 min read

Alibaba Cloud Developer

Apr 26, 2022 · Artificial Intelligence

Unlocking Vision AI: Inside Alibaba’s EasyCV All‑in‑One Self‑Supervised & Transformer Framework

EasyCV is Alibaba’s open‑source, PyTorch‑based visual modeling platform that unifies self‑supervised learning and Transformer techniques, offering a comprehensive algorithm suite, pre‑trained models, high‑performance training/inference optimizations, extensible architecture, and seamless cloud deployment for a wide range of computer‑vision tasks.

AI Framework · Alibaba · Deep Learning

0 likes · 16 min read

Meituan Technology Team

Apr 14, 2022 · Artificial Intelligence

Short Video Content Understanding and Generation Practices at Meituan

Meituan leverages computer‑vision techniques to tag, analyze, and automatically generate short videos across consumer and merchant scenarios, detailing hierarchical tag design, self‑supervised representation learning, fine‑grained food recognition, intelligent cover creation, and pixel‑level editing to enhance content discovery and presentation.

AI content generation · Semantic Segmentation · computer vision

0 likes · 20 min read

DaTaobao Tech

Apr 12, 2022 · Artificial Intelligence

ArcCSE: Angular Margin Contrastive Learning for Self‑Supervised Text Representation

ArcCSE introduces an angular‑margin contrastive loss and both pairwise (dropout‑augmented) and triple‑wise (span‑masked) relationship modeling to self‑supervise text embeddings, yielding tighter decision boundaries, higher alignment and uniformity, and superior performance on unsupervised STS, SentEval, and Alibaba’s retrieval and recommendation systems.

NLP · angular margin · contrastive learning

0 likes · 8 min read

DataFunSummit

Feb 22, 2022 · Artificial Intelligence

Graph Pretraining Techniques for Molecular Representation and Their Applications in Drug Discovery

This article reviews the motivation, methods, and results of graph-based self‑supervised pretraining for molecular data, introduces the ChemRL‑GEM model that incorporates 3‑D structural information, and demonstrates its superior performance on ADMET, affinity prediction, and benchmark competitions using the PaddleHelix platform.

AI · Chemistry · Graph Neural Networks

0 likes · 18 min read

Baobao Algorithm Notes

Jan 28, 2022 · Artificial Intelligence

How Masked Autoencoders Revolutionize Vision Pre‑Training: A Deep Dive

This article provides a detailed technical walkthrough of Masked Autoencoders (MAE) for computer vision, covering its BERT‑inspired masking strategy, asymmetric encoder‑decoder design, implementation specifics, experimental findings on mask ratios and decoder depth, and the resulting performance gains over supervised ViT models.

MAE · Masked Modeling · PyTorch

0 likes · 11 min read

DataFunTalk

Jan 19, 2022 · Artificial Intelligence

ZEUS: A Self‑Supervised Multi‑Scenario Query Ranking Model for E‑commerce Search

The article presents ZEUS, a self‑supervised multi‑scenario ranking model that leverages user‑initiated behavior pre‑training to break feedback loops and improve query recommendation efficiency across diverse e‑commerce search scenarios, achieving significant gains in CTR, CVR, and GMV.

CTR Prediction · multi-scenario ranking · query recommendation

0 likes · 19 min read

DataFunTalk

Jan 7, 2022 · Artificial Intelligence

Group-Theoretic Self-Supervised Representation Learning (Lecture)

On Jan 7, 2022, BIT’s “Hundred Lectures” series will feature Assistant Professor Hanwang Zhang presenting his work on group‑theoretic self‑supervised representation learning, including the IP‑IRM method, which iteratively partitions data and applies invariant risk minimization to learn fully disentangled visual features. The session will be streamed via Tencent Meeting.

AI · group theory · machine learning

0 likes · 4 min read

Code DAO

Dec 22, 2021 · Artificial Intelligence

Understanding SimCLR: A Simple Contrastive Learning Framework for Visual Representations

This article explains SimCLR, the 2020 Google Research framework that advances self‑supervised visual pre‑training by using extensive data augmentations, a ResNet encoder, a projection‑head MLP, and the NT‑Xent loss to learn robust image representations that outperform many prior methods on ImageNet and other benchmarks.

NT-Xent loss · ResNet · SimCLR

0 likes · 7 min read

AntTech

Dec 16, 2021 · Artificial Intelligence

Robust AI: Ant Group’s Self‑Supervised Feature‑Compatible Model Wins NeurIPS ISC2021 Image Representation Competition

Ant Group’s TitanShield Team won the image representation track at NeurIPS ISC2021 with a self‑supervised, feature‑compatible pre‑training model that dramatically cuts labeling effort, speeds up training, and lowers image adversarial risk by 80%, highlighting AI robustness as a critical challenge for content‑security applications.

AI robustness · Ant Group · NeurIPS

0 likes · 5 min read

DataFunSummit

Dec 11, 2021 · Artificial Intelligence

Survey of User Representation Learning and Transfer Learning in Recommendation Systems

This article reviews recent advances in user representation learning for recommender systems, covering self‑supervised pre‑training, lifelong learning, multi‑task modeling, and large‑scale contrastive methods, and provides code and dataset links for key papers such as PeterRec, Conure, DUPN, ShopperBERT, PTUM, UPRec, and LURM.

Recommendation Systems · pretraining · self-supervised learning

0 likes · 11 min read

DataFunTalk

Nov 1, 2021 · Artificial Intelligence

Reflections on Working as an Algorithm Engineer at Meituan and the Rise of Contrastive Learning

The author shares personal experiences as a Meituan algorithm engineer, emphasizing the critical role of labeled data, the emergence of contrastive (self‑supervised) learning across computer vision, NLP, and recommendation systems, and offers practical advice for algorithm engineers to stay competitive.

AI research · Meituan · algorithm engineering

0 likes · 8 min read

DataFunTalk

Oct 26, 2021 · Artificial Intelligence

Contrastive Learning Perspective on Retrieval and Reranking Models in Recommendation Systems

This article explains how contrastive learning, originally popular in computer‑vision, can be interpreted and applied to recommendation‑system recall and coarse‑ranking models, covering its theoretical roots, typical architectures like SimCLR, MoCo and SwAV, and practical tricks such as in‑batch negatives, embedding normalization, temperature scaling, and graph‑based extensions.

Graph Neural Networks · Recommendation Systems · contrastive learning

0 likes · 40 min read

DataFunTalk

Oct 19, 2021 · Artificial Intelligence

Graph Contrastive Learning: Foundations, Methods, and Recent Advances (GRACE & GCA)

This article reviews recent research on graph self‑supervised learning, focusing on contrastive learning fundamentals, the SimCLR‑style framework, representative models such as GRACE and its adaptive augmentation extension GCA, experimental evaluations, and future directions for graph contrastive methods.

GCA · Grace · Graph Neural Networks

0 likes · 16 min read

DataFunSummit

Oct 19, 2021 · Artificial Intelligence

Deep Graph Contrastive Learning: GRACE and GCA

This article reviews recent advances in graph contrastive learning, introducing foundational concepts, the SimCLR framework, and representative models such as GRACE and its adaptive augmentation variant GCA, followed by experimental results, analysis, and future research directions.

GCA · Grace · Graph Representation

0 likes · 16 min read

DataFunTalk

Sep 29, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Learning for Computer Vision and OCR Applications

This article reviews self‑supervised learning techniques, common computer‑vision pretext tasks, contrastive loss functions, popular frameworks such as SimCLR, MoCo and SimSiam, and demonstrates their application to OCR captcha recognition with detailed implementation and experimental results.

Deep Learning · OCR · PyTorch

0 likes · 22 min read

DataFunTalk

Sep 28, 2021 · Artificial Intelligence

Graph Modeling and GCN Exploration at GeeTest (极验): Evolution, Offline and Real‑time Solutions

The talk gives an overview of graph neural network development, explains GeeTest's graph modeling research and its evolution, and details the company's offline and real‑time GCN solutions, including self‑supervised training, large‑scale graph handling, and performance comparisons, with practical applications in fraud detection and risk control.

Anomaly Detection · GCN · Graph Modeling

0 likes · 26 min read

DataFunSummit

Sep 26, 2021 · Artificial Intelligence

Contrastive Learning and Its Applications in Weibo Content Representation

This article explains the fundamentals of contrastive learning, reviews typical models such as SimCLR, MoCo, SwAV, BYOL, SimSiam and Barlow Twins, and demonstrates how these methods are applied to Weibo text and multimodal (text‑image) representation tasks like hashtag generation and image‑text matching.

Multimodal · NLP · Weibo

0 likes · 18 min read

Laiye Technology Team

Sep 24, 2021 · Artificial Intelligence

Self‑Supervised Learning and Contrastive Methods for Computer Vision and OCR Applications

This article surveys self‑supervised learning techniques for computer‑vision tasks, explains common pretext tasks and contrastive loss designs, reviews representative models such as SimCLR, MoCo, SwAV and SimSiam, and demonstrates their practical impact on a captcha‑OCR system with measurable accuracy gains.

Deep Learning · OCR · SimCLR

0 likes · 23 min read

DataFunTalk

Aug 30, 2021 · Artificial Intelligence

Contrastive Learning: Foundations, Typical Models, and Applications to Weibo Content Representation

This article explains the concept of contrastive learning, its relationship to self‑supervised and metric learning, describes key system components and loss functions, reviews major image, NLP and multimodal models such as SimCLR, MoCo, SwAV, BYOL, and demonstrates how contrastive learning is applied to Weibo hashtag generation, similar‑post retrieval, and text‑image matching using CD‑TOM and W‑CLIP models.

AI · Multimodal · Weibo

0 likes · 19 min read

DataFunTalk

Jul 1, 2021 · Artificial Intelligence

Pre‑Trained Models: Past, Present, and Future – A Comprehensive Survey

This article surveys the evolution of pre‑trained models, covering the origins of transfer and self‑supervised learning, the rise of transformer‑based PTMs such as BERT and GPT, efficient architecture designs, multimodal and multilingual extensions, theoretical analyses, and future research directions for scalable and robust AI systems.

AI research · Efficient Training · Large Language Models

0 likes · 27 min read

Alibaba Cloud Developer

Jun 28, 2021 · Artificial Intelligence

How Alibaba Cloud’s MMAI Team Dominated CVPR2021 Video Action Challenges

Alibaba Cloud’s Multimedia AI team won five first‑place titles and one runner‑up across six major video‑action challenges at CVPR2021, showcasing advanced transformer‑CNN hybrids, self‑supervised initialization, and spatio‑temporal relation modeling that now power their multimedia AI cloud products.

Alibaba Cloud · CVPR2021 · multimedia AI

0 likes · 14 min read

Huawei Cloud Developer Alliance

May 8, 2021 · Artificial Intelligence

How Huawei’s Pangu Pre‑trained Models Slash Development Costs and Boost Vision AI

In a detailed interview, Huawei Cloud experts explain how the ultra‑large Pangu CV and NLP models, which have billions of parameters and are trained on terabytes of data, achieve top benchmark scores, simplify developer workflows, and support industry‑wide deployments that dramatically cut labeling effort and iteration time.

Huawei Cloud · Pretrained Models · ai-development

0 likes · 9 min read

DataFunTalk

Apr 10, 2021 · Artificial Intelligence

2020 Computer Vision Breakthroughs: Self‑Supervised Learning, Transformer Attention Modeling, and Neural Radiance Fields

The talk reviews three major 2020 advances in computer vision—self‑supervised learning surpassing supervised pre‑training, the successful adoption of Transformer‑based attention models for detection and classification, and the emergence of Neural Radiance Fields for view synthesis—while highlighting related research from Microsoft Research Asia and the broader community.

2020 · AI breakthroughs · Transformer

0 likes · 19 min read

JD Cloud Developers

Feb 10, 2021 · Artificial Intelligence

Three JD Tech AI Papers Shine at ICASSP 2021

At ICASSP 2021, JD Tech presented three AI research papers—introducing a Neural Kalman Filtering framework for speech enhancement, a cross‑utterance BERT‑based prosody modeling method for end‑to‑end speech synthesis, and a self‑supervised conversational query rewriting approach—each demonstrating superior performance over existing baselines on benchmark datasets.

AI research · ICASSP 2021 · prosody modeling

0 likes · 9 min read

Meituan Technology Team

Dec 24, 2020 · Artificial Intelligence

Meituan Unmanned Delivery Technical Salon – AI Research on Instance Segmentation, Visual Localization, Trajectory Prediction, and Depth‑Pose Learning

On January 9, 2021, Meituan hosted an unmanned‑delivery technical salon in Beijing where experts presented cutting‑edge AI research—including the CenterMask instance‑segmentation method, 3D geometry‑aware camera localization, multi‑agent trajectory prediction with attention‑based spatio‑temporal graphs, real‑time stereo visual‑inertial odometry calibration, and self‑supervised depth‑pose learning for dynamic scenes.

AI · autonomous driving · computer vision

0 likes · 7 min read

Didi Tech

Oct 22, 2020 · Artificial Intelligence

Multi-turn Response Triggering Model (MRTM) for Intelligent Customer Service Chatbots

The article reviews Didi’s research on a Multi‑turn Response Triggering Model (MRTM) that uses self‑supervised learning and asymmetric self‑attention to decide when a customer‑service chatbot should reply, achieving higher accuracy and recall than rule‑based and supervised baselines while remaining efficient enough for production deployment.

AI · Chatbot · customer service

0 likes · 12 min read
