Tag

Training Efficiency

Xiaohongshu Tech REDtech
Jun 6, 2025 · Artificial Intelligence

How dots.llm1 Sets New Benchmarks for Open‑Source MoE Language Models

dots.llm1, an open‑source 142‑billion‑parameter Mixture‑of‑Experts language model from Xiaohongshu's hi lab, matches Qwen2.5‑72B‑level performance after pretraining on 11.2 trillion high‑quality tokens. The release includes the full models, intermediate checkpoints, and detailed training pipelines for the research community.

AI research · Mixture of Experts · Training Efficiency
10 min read
DataFunTalk
Mar 9, 2025 · Artificial Intelligence

Critique Fine-Tuning (CFT): Boosting Large Language Model Reasoning with Minimal Data

The paper introduces Critique Fine-Tuning (CFT), which replaces simple imitation in supervised fine‑tuning with critique‑based learning. Using only 50 K training samples, CFT achieves superior reasoning performance on mathematical benchmarks, outperforming traditional reinforcement‑learning approaches that require millions of examples.

AI reasoning · Critique Fine-Tuning · Mathematical Benchmarks
7 min read
ZhongAn Tech Team
Feb 16, 2025 · Artificial Intelligence

DeepSeek R1 and V3: Model Innovations, Industry Impact, and Future Trends

The article reviews DeepSeek's open‑source R1 and V3 large language models, covering their technical breakthroughs and cost advantages, expert commentary, and industry adoption across chips, cloud services, and applications, before discussing future directions in model scaling, distillation, and AI competition.

AI Industry · AI competition · DeepSeek
13 min read
DataFunTalk
Dec 27, 2022 · Artificial Intelligence

Efficient Training for Very Large‑Scale Face Recognition and the FFC Framework

This article reviews the challenges of ultra‑large‑scale face recognition, surveys existing solutions such as metric learning, PFC, and VFC, and details the proposed FFC framework, including its dual loaders, ID groups, and probe and gallery networks, along with experimental results demonstrating its cost‑effective performance.

AI · Computer Vision · Training Efficiency
7 min read