Tagged articles
7 articles
Machine Heart
Apr 15, 2026 · Artificial Intelligence

DataFlex: An Industrial‑Grade Dynamic Data Training System for Large Models

DataFlex, built on LLaMA‑Factory, offers a unified, reproducible infrastructure that dynamically selects, mixes, and re‑weights training data, turning data into a controllable optimization object and delivering measurable gains in training efficiency and model performance for large‑scale AI models.

DataFlex · Data‑Centric AI · Dynamic Data Training
14 min read
Sohu Tech Products
Nov 26, 2025 · Artificial Intelligence

How Cleanlab Cut Data Review by 34×: A Real‑World Text Classification Case Study

This article walks through a real text‑classification project in which noisy labels pushed the review workload past 15,000 samples, and shows how cleanlab's confident‑learning framework cut the manual audit set to 438 items, roughly a 34× reduction in review effort, while also improving model performance.

Data Quality · Data‑Centric AI · cleanlab
16 min read
Architects' Tech Alliance
Dec 23, 2024 · Artificial Intelligence

Why High‑Quality, Massive, Diverse Data Fuels AI Breakthroughs

The article explains how breakthroughs in artificial intelligence depend on high‑quality, large‑scale, and diverse training data, outlines the data‑centric AI movement, details a six‑step workflow for building datasets, and surveys the data industry ecosystem supporting large language model development.

AI data · Data Quality · Data‑Centric AI
7 min read
DataFunTalk
Jun 19, 2023 · Artificial Intelligence

Rensselaer Polytechnic Institute (RPI) Computer Science Faculty, Resources, and PhD/Intern Recruitment Overview

The announcement introduces RPI's computer science department, its extensive GPU resources, and its collaborations with IBM Research, then profiles three incoming faculty members whose research covers graph neural networks, trustworthy AI, data‑centric AI, drug‑design generative models, and neural‑symbolic reasoning. It invites applications for PhD and intern positions, which come with full scholarships and funding support.

Data‑Centric AI · LLM · PhD Recruitment
8 min read
Top Architect
Apr 12, 2023 · Artificial Intelligence

Data‑Centric AI Perspective on GPT Models: Training, Inference, and Maintenance

This article examines how large language models from GPT‑1 through GPT‑4 owe much of their success to high‑quality, large‑scale training data. It explains the three parts of the data‑centric AI framework (training data development, inference data development, and data maintenance) and discusses prompt engineering, data‑driven improvements, and future trends in AI.

AI · Data‑Centric AI · GPT
19 min read
DataFunSummit
Dec 23, 2022 · Artificial Intelligence

Data‑Centric AI Practices for Content Moderation at NetEase Yidun

The article presents NetEase Yidun’s data‑centric AI approach to content moderation, covering the background of Data‑Centric AI, the specific business and data challenges of content safety, comprehensive data pipelines—including collection, labeling, augmentation, selection, cleaning, iteration and testing—and the role of self‑, semi‑ and weak‑supervised learning in enhancing algorithm performance.

Algorithm Innovation · Data Management · Data‑Centric AI
19 min read
DataFunTalk
Oct 27, 2022 · Artificial Intelligence

Data‑Centric AI and MLOps: A Case Study of Smart‑Cabin Applications in the Automotive Industry

The talk by Magic Data’s founder Zhang Qingqing outlines the shift from model‑centric to data‑centric AI, introduces Data‑Centric MLOps methodology, and demonstrates its automotive smart‑cabin application, highlighting data quality requirements, collaborative workflow, and performance gains across speech, live‑social and navigation scenarios.

AI applications · Automotive AI · Data‑Centric AI
9 min read