Tagged articles
30 articles
Data Party THU
Apr 11, 2026 · Artificial Intelligence

How LLMs Are Uncovering Ultra‑Hard Carbon Allotropes in Minutes

Researchers at Xi'an Jiaotong University built a closed‑loop AI framework centered on a large language model that generates and evaluates thousands of carbon structures, rapidly discovering ultra‑hard, highly anisotropic and novel carbon allotropes such as C16_3, C12 and C8 within minutes.

AI-driven research · LLM · Materials Discovery
0 likes · 7 min read
HyperAI Super Neural
Feb 5, 2026 · Artificial Intelligence

Scanning 100 Million Hubble Images in 3 Days: ESA’s AnomalyMatch Finds Over 1,000 Rare Objects

ESA’s ESAC team introduced AnomalyMatch, a semi‑supervised active‑learning framework that, with fewer than ten labeled anomalies, processed roughly 100 million Hubble cutouts in just 2–3 days, uncovering 1,339 distinct anomalous astrophysical objects such as merging galaxies, gravitational lenses, and jellyfish galaxies.

AnomalyMatch · EfficientNet · Hubble Legacy Archive
0 likes · 16 min read
Ops Development & AI Practice
Sep 10, 2025 · Fundamentals

Why We Forget Fast and How to Turn Learning Into Lasting Knowledge

In the age of information overload, this article explains the science behind rapid forgetting, introduces Ebbinghaus’s Forgetting Curve, describes how memory consolidates from short‑term to long‑term storage, and outlines evidence‑based strategies such as active learning, spaced repetition, the testing effect, and contextual association to build durable knowledge.

active learning · cognitive science · knowledge retention
0 likes · 7 min read
AI Algorithm Path
Jun 19, 2025 · Artificial Intelligence

Training Neural Networks with Minimal Labeled Data Using Active Learning

This article explains how active learning can dramatically reduce the amount of labeled data required for training deep neural networks by selecting the most informative and representative samples, and provides a complete Python implementation of a hybrid query strategy (DBAL) with ResNet‑18.

DBAL · Deep Learning · Python
0 likes · 14 min read
Tencent Advertising Technology
Nov 8, 2024 · Artificial Intelligence

Optimizing Real-Time Bidding: Machine Learning Approaches for Bid Shading and Winning Price Prediction

This article explores advanced machine learning techniques for optimizing bid shading in real-time advertising auctions, introducing a mixed censorship multi-task learning framework and a cost-effective active learning strategy to accurately predict winning price distributions and overcome sample selection bias.

Auction Mechanisms · Bid Shading · Winning Price Prediction
0 likes · 16 min read
DataFunTalk
Jun 20, 2024 · Artificial Intelligence

User Profiling Algorithms: From Ontology‑Based Methods to Deep Learning and Large Model Integration

This article provides a comprehensive overview of user profiling algorithms, covering the evolution from ontology‑based traditional methods to modern deep‑learning approaches, including structured label prediction, representation learning, active learning, and large‑model integration, while discussing challenges, practical applications, and future research directions.

Deep Learning · Ontology · active learning
0 likes · 26 min read
DataFunTalk
Apr 2, 2024 · Artificial Intelligence

User Portrait Algorithms: From Ontology‑Based Methods to Deep Learning and Future Directions

This article provides a comprehensive overview of user portrait algorithms, covering their historical development, ontology‑based traditional approaches, deep‑learning enhancements, representation‑learning techniques such as lookalike, active‑learning driven iteration, and the integration of large‑model world knowledge, while also discussing current challenges and future research directions.

Deep Learning · Ontology · Recommendation Systems
0 likes · 26 min read
Model Perspective
Mar 16, 2024 · Artificial Intelligence

What Watching a TV Drama Reveals About AI Model Training and Learning Strategies

The article draws parallels between expert viewers dissecting the drama "The Legend of Zhen Huan," efficient paper‑reading techniques, and the active‑prediction plus contrastive‑learning approach that underpins modern AI model training, highlighting how proactive thinking boosts both personal and machine learning outcomes.

AI training · Prediction · active learning
0 likes · 8 min read
Baobao Algorithm Notes
Dec 11, 2023 · Artificial Intelligence

Boost Large‑Model Fine‑Tuning with Low‑Cost Data Selection and Construction

The article explains practical techniques for choosing and constructing fine‑tuning data for large language models, covering data diversity through similarity‑based clustering, semi‑supervised filtering with binary classifiers, and uncertainty‑driven sampling using perplexity or reward models to build an efficient, low‑cost pipeline.

Large Model · Reward model · active learning
0 likes · 9 min read
DataFunSummit
Nov 17, 2023 · Artificial Intelligence

Semantic‑Aware Active Learning on Graph Data for Risk Control: Tackling Sample Imbalance

This presentation discusses the challenges of label scarcity and class imbalance in graph‑based risk‑control scenarios and proposes a semantic‑aware active‑learning framework that combines uncertainty, graph structure, prototype diversity, and double‑channel information alignment to improve node classification performance.

active learning · graph data · graph neural networks
0 likes · 18 min read
DataFunTalk
Sep 21, 2023 · Artificial Intelligence

Active Learning and Sample Imbalance in Graph Data for Risk Control

This presentation explores the challenges of label scarcity and class imbalance in graph‑based risk‑control scenarios, proposing semantic‑aware active learning and prototype‑driven sampling strategies to improve node classification performance on imbalanced graph datasets.

active learning · graph data · graph neural networks
0 likes · 16 min read
Amap Tech
Jun 8, 2023 · Fundamentals

The Essence of Learning: Active vs. Passive Approaches and Their Impacts

Learning enriches knowledge but, when pursued passively or excessively, can crowd out personal thought, stifle creativity, and make thinking rigid, so a healthy approach blends active, experience‑based insight, selective classic sources, and continual self‑reflection to turn knowledge into genuine understanding.

active learning · cognition · critical thinking
0 likes · 36 min read
58 Tech
May 11, 2023 · Artificial Intelligence

Stella Data Annotation Platform: Design, Architecture, and AI‑Assisted Labeling

The article details the design and implementation of the Stella data annotation SaaS platform at 58.com, covering its background, evolution, modular architecture, annotation capabilities across text, image, audio, and video, AI‑assisted labeling, storage solutions, quality and efficiency management, as well as localization and licensing considerations.

AI Platform · System Architecture · active learning
0 likes · 21 min read
NetEase Cloud Music Tech Team
Jan 10, 2023 · Artificial Intelligence

Sentiment Classification and Topic Clustering for NetEase Cloud Music Comments

To boost NetEase Cloud Music’s comment handling, the authors combine active‑learning‑driven relabeling, domain‑specific MLM pretraining, contrastive‑learning‑based sample expansion, and multi‑task BERT sharing to raise sentiment‑classification precision and recall above 90% and double the moderation clean rate. For short‑text topic clustering and scalable, theme‑aware distribution, they employ prompt‑generated story themes, IP‑focused classifiers, and hot‑word aggregation.

NLP · Sentiment Analysis · active learning
0 likes · 10 min read
Zuoyebang Tech Team
Nov 9, 2022 · Artificial Intelligence

Boost Data Annotation Efficiency with VAPAL: Active Learning Meets Virtual Adversarial Perturbation

This article explains how a pool‑based active learning framework that combines uncertainty sampling (using BADGE, ALPS, or virtual adversarial perturbations) with diversity‑driven clustering can dramatically cut labeling costs for Transformer‑based NLP models, and presents experimental results showing VAPAL’s competitive performance and early‑stage advantages.

NLP · active learning · data annotation
0 likes · 10 min read
Zuoyebang Tech Team
Sep 15, 2022 · Artificial Intelligence

How We Replaced BERT with a Lightweight TextCNN to Slash GPU Costs

This article describes the production challenges of using BERT for large‑scale text classification at Zuoyebang, explores lightweight alternatives such as knowledge distillation, pruning and quantization, and details a teacher‑student‑active‑learning pipeline that trains a TextCNN model to match BERT performance while dramatically reducing GPU consumption and improving throughput.

BERT · Model Deployment · NLP
0 likes · 13 min read
Zhuanzhuan Tech
Aug 17, 2022 · Artificial Intelligence

Designing a Scalable Image Classification System for Prohibited Item Detection in a Second‑hand E‑commerce Platform

This article describes how a second‑hand e‑commerce company built a fast, modular image‑classification pipeline using small binary classifiers, EfficientNet‑B0, and active‑learning‑driven data annotation to detect prohibited items while keeping inference latency under 200 ms and sharply reducing labeling costs.

AI · Image Classification · Model architecture
0 likes · 10 min read
DataFunSummit
Jun 23, 2022 · Artificial Intelligence

Unlocking Data Potential: Automatic Data Augmentation, Denoising, Active Learning, and Data Splitting

The talk explains how to maximize the value of training data by exploring background on model generalization, automatic data augmentation techniques, denoising strategies, active learning for selecting unlabeled samples, and robust data splitting methods, offering practical guidelines for AI practitioners.

AI · Data Quality · active learning
0 likes · 16 min read
360 Smart Cloud
Sep 13, 2021 · Artificial Intelligence

Active Learning: Concepts, Workflow, Strategies, and Evaluation Metrics

Active learning addresses the high cost of labeling data by iteratively selecting the most informative unlabeled samples for annotation, thereby reducing labeling effort while achieving target model performance, and the article explains its fundamentals, relationship to supervised and semi‑supervised learning, common selection strategies, hybrid methods, and evaluation metrics.

Labeling Cost Reduction · Query by Committee · active learning
0 likes · 7 min read
Meituan Technology Team
Aug 19, 2021 · Artificial Intelligence

Few-Shot Learning Methods and Applications in Meituan NLP

Meituan’s NLP team leverages few‑shot learning—using data‑augmentation, semi‑supervised, ensemble/self‑training, and domain‑adaptation techniques—to cut annotation costs, achieving 1–2 percentage‑point accuracy gains on internal benchmarks and deploying high‑performing models for tasks such as topic classification, fake‑review detection, and sentiment analysis, while planning broader platform and model extensions.

Few‑Shot Learning · NLP · Semi-supervised Learning
0 likes · 29 min read
58 Tech
Aug 10, 2021 · Artificial Intelligence

Active Learning and Model Enhancements for Semantic Tag Mining in 58.com Voice Data

This article presents a comprehensive study on extracting semantic tags from 58.com voice data, detailing the use of active learning to address cold‑start problems, comparing keyword matching, XGBoost, TextCNN, CRNN, and an improved Wide&Deep model, and demonstrating significant reductions in labeling effort and superior F1 scores across multiple experiments.

CRNN · active learning · model comparison
0 likes · 15 min read
DataFunTalk
Sep 13, 2020 · Artificial Intelligence

Active Learning: Concepts, Query Strategies, and Applications

Active Learning is a machine learning approach that reduces labeling costs by iteratively selecting the most informative samples for human annotation, using various query strategies such as uncertainty sampling, query-by-committee, expected model change, and density-weighted methods, applicable to domains like image classification, security risk control, and anomaly detection.

Labeling Cost Reduction · Query Strategies · active learning
0 likes · 15 min read
Alibaba Cloud Developer
Jul 7, 2020 · Artificial Intelligence

How Active Learning Can Cut Labeling Costs and Boost Model Performance

This article explains active learning techniques that let models select valuable training samples, reducing annotation costs and improving performance, and describes business‑specific adaptations, experiments, and results that demonstrate its effectiveness in content‑safety applications.

active learning · batch sampling · data annotation
0 likes · 14 min read
DataFunTalk
Apr 5, 2020 · Artificial Intelligence

WeChat Hotspot Mining Platform: Architecture, Detection, and Presentation

This article describes a WeChat hotspot mining platform that integrates multiple data sources, builds quality and prediction models, employs advanced clustering and multi‑granular text matching techniques, and uses generative active learning to efficiently discover, predict, and present news hotspots for users.

WeChat · active learning · hotspot detection
0 likes · 17 min read
Alibaba Cloud Developer
Dec 3, 2019 · Artificial Intelligence

How Alibaba Detects ‘Disgusting’ Images on Taobao with AI

This article describes Alibaba's AI system for automatically filtering nauseating product images on Taobao, covering challenges such as cold‑start, class imbalance, and diverse visual features, and detailing solutions like semi‑supervised learning, active learning, OHEM‑cascade, attention mechanisms, and the resulting business impact.

Attention Mechanism · E-commerce AI · Image Classification
0 likes · 15 min read
Model Perspective
Sep 8, 2018 · Fundamentals

How the STEM SOS Model Transforms Classroom Engagement and Learning

The article explains the STEM SOS teaching approach—an interactive, project‑based model that blends concept lectures, hands‑on experiments, student‑led presentations, and multimedia resources—to boost student participation, confidence, and deep understanding of scientific concepts.

Project-Based Learning · STEM Education · active learning
0 likes · 8 min read
Alibaba Cloud Developer
May 7, 2018 · Artificial Intelligence

How Active PU Learning Boosts Cash‑Out Fraud Detection by 3×

This article presents an Active PU Learning framework that combines active learning with two‑step PU semi‑supervised learning to improve cash‑out fraud detection, reducing labeling costs, enhancing model performance, and achieving a three‑fold increase in identified fraudulent transactions compared to traditional unsupervised methods.

AI · Risk Detection · Semi-supervised Learning
0 likes · 15 min read
AntTech
Apr 16, 2018 · Artificial Intelligence

Active PU Learning for Cash‑Out Fraud Detection in Alipay’s AlphaRisk Engine

This article presents an Active PU Learning framework that combines active learning with two‑step positive‑unlabeled learning to improve cash‑out fraud detection in Alipay’s fifth‑generation risk engine, AlphaRisk, achieving three‑fold identification gains over unsupervised methods while reducing labeling costs.

Semi-supervised Learning · active learning · fraud detection
0 likes · 14 min read