Tagged articles
9 articles
Fun with Large Models
Mar 20, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Using LLaMAFactory for Full‑Cycle Large‑Model Training (Part 9)

This article walks through the complete workflow of fine‑tuning a Qwen2.5‑0.5B model with LLaMAFactory, covering environment setup, model download, dataset preparation, configuration editing, training execution, LoRA weight merging, and deployment via vLLM, while highlighting the framework’s minimal‑code workflow and broad model support.

AI model training · LLaMAFactory · LoRA
0 likes · 12 min read
Tencent Tech
May 7, 2025 · Artificial Intelligence

How Tencent’s DeepEP Doubles GPU Communication Speed on RoCE Networks

Tencent engineers highlighted a massive speedup in DeepSeek’s open‑source DeepEP communication framework, revealing how their TRMT‑based optimizations—dynamic multi‑QP topology awareness, IBGDA‑driven CPU‑bypass, and atomic signaling—boost RoCE network throughput up to 300% and add another 30% gain when applied to InfiniBand, effectively doubling GPU communication performance for large AI models.

AI model training · DeepEP · GPU communication
0 likes · 8 min read
Sohu Tech Products
Apr 2, 2025 · Artificial Intelligence

How SecretFlow Enables Privacy‑Preserving AI Model Training with Secure Multi‑Party Computation

SecretFlow is an open‑source privacy‑computing framework that lets multiple parties perform encrypted data analysis and AI model training without exposing raw data, offering unified MPC, federated learning and differential privacy features, with step‑by‑step Docker installation, Python examples, and a modular architecture for secure multi‑party computation.

AI model training · Data Protection · Privacy Computing
0 likes · 11 min read
Alibaba Cloud Developer
Feb 28, 2025 · Artificial Intelligence

How DeepSeek’s RL‑Powered Time Scaling Is Redefining AI Model Training

DeepSeek’s rapid rise is examined through its RL‑based Time Scaling paradigm, cost‑effective architecture, innovative training pipeline, open‑source strategy, and security challenges, highlighting how these breakthroughs disrupt traditional AI model development, lower resource demands, and influence industry dynamics.

AI model training · DeepSeek · Model architecture
0 likes · 13 min read
DevOps
Feb 23, 2025 · Artificial Intelligence

Understanding Reinforcement Learning, RLHF, PPO and GRPO for AI Applications

This article explains how DeepSeek‑R1‑Zero uses group‑relative policy optimization (GRPO) to improve reasoning without labeled data, introduces reinforcement learning from human feedback (RLHF) and its components, and compares the PPO and GRPO algorithms, highlighting the engineering scenarios each suits and their practical implications for AI applications.

AI model training · Deep Learning · GRPO
0 likes · 15 min read
Alibaba Cloud Developer
Dec 4, 2024 · Artificial Intelligence

How Alibaba’s GTE‑Multilingual Models Boost RAG with Long‑Doc and Multi‑Language Support

Alibaba's Tongyi Lab introduces the GTE‑Multilingual series, high‑performance encoder‑only models that support 8k‑token texts, 75 languages, elastic and sparse embeddings, and demonstrate superior retrieval‑augmented generation performance across multilingual and long‑document benchmarks.

AI model training · Sparse Embedding · elastic embedding
0 likes · 18 min read
Rare Earth Juejin Tech Community
Jul 30, 2023 · Artificial Intelligence

Understanding Codex: Training Framework, Evaluation Methodology, and Model Performance in ChatGPT’s Code Generation Ability

This article explains how Codex, built on the GPT‑3.5 architecture, is trained and fine‑tuned to give ChatGPT the ability to generate code, detailing the data collection, supervised fine‑tuning, evaluation using HumanEval and the pass@k metric, and presenting performance comparisons with GPT‑3 and Codex‑S.

AI model training · ChatGPT · Codex
0 likes · 11 min read