Tagged articles
43 articles
SuanNi
Apr 29, 2026 · Artificial Intelligence

Why Google’s Split 8th‑Gen TPU Could Out‑Earn General‑Purpose GPUs

Google’s Cloud Next 2026 reveal splits the 8th‑generation TPU into training‑focused Sunfish and inference‑focused Zebrafish, highlighting Ironwood’s record‑breaking performance, a multi‑vendor supply chain, Anthropic’s multi‑gigawatt order, and a broader industry shift toward custom AI chips that promise far higher profit margins than generic GPUs.

AI, Custom ASIC, Google
0 likes · 8 min read
Machine Heart
Apr 23, 2026 · Artificial Intelligence

Google's TPU 8t and 8i: Training Powerhouse vs. Inference Specialist

Google unveiled its eighth‑generation TPU line at Cloud Next 2026, introducing the training‑focused TPU 8t with a 2.7× performance boost and massive scaling, and the inference‑optimized TPU 8i with three times more on‑chip SRAM and an 80% performance uplift for agentic AI workloads, while positioning the chips as a complement to, not a replacement for, Nvidia's offerings.

AI hardware, Agentic AI, Google Cloud
0 likes · 9 min read
Fun with Large Models
Apr 1, 2026 · Artificial Intelligence

A Beginner's Deep Dive into Large‑Model Training Parameters with LLaMAFactory

This article walks readers through the three major training methods—full‑parameter, LoRA, and QLoRA—explaining their memory costs, data requirements, and trade‑offs, then provides a line‑by‑line breakdown of LLaMAFactory configuration files, hyper‑parameter tuning guidelines, and the process for merging LoRA adapters into a deployable model.

LLaMAFactory, LoRA, QLoRA
0 likes · 27 min read
Baidu Intelligent Cloud Tech Hub
Jan 27, 2026 · Artificial Intelligence

Deploying Qwen3 on Kunlun P800: Full‑Parameter DPO Training and Inference Guide

This guide walks through setting up a Kunlun P800 XPU host, preparing Docker containers, deploying Qwen3‑8B/‑32B/‑VL models with vLLM‑Kunlun, benchmarking performance, and running full‑parameter DPO training using LLaMA‑Factory, providing scripts, configuration files, and troubleshooting tips for AI engineers.

DPO, Inference, Kunlun P800
0 likes · 32 min read
PaperAgent
Dec 13, 2025 · Artificial Intelligence

Why Unified Multimodal Models Are the Key to Next‑Gen AGI – A Deep Survey

This article surveys the latest research on Unified Multimodal Foundations (UFM), explaining why integrating understanding and generation across text, image, video, and audio is essential for AGI, and detailing modeling paradigms, encoding/decoding strategies, training pipelines, benchmarks, and real‑world applications.

AI research, Benchmark, Training
0 likes · 10 min read
Baidu Intelligent Cloud Tech Hub
Sep 4, 2025 · Artificial Intelligence

Unlocking MoE Model Power: Baidu’s Baige 5.0 AI Platform’s FP8 and Distributed Innovations

Baidu’s Baige 5.0 AI Computing Platform introduces FP8 mixed‑precision training, MoE‑aware distributed strategies, adaptive parallelism, and a three‑tier KV‑Cache, delivering over 30% training speedup and 50% inference throughput gains while keeping token latency under half a second for large‑scale models.

AI, FP8, Inference
0 likes · 16 min read
Volcano Engine Developer Services
Aug 6, 2025 · Artificial Intelligence

How VeOmni Revolutionizes Multimodal Model Training with 40% Speed Gains

VeOmni, ByteDance’s open‑source unified multimodal training framework, tackles fragmented training pipelines by integrating LoRA fine‑tuning, FSDP, Ulysses, and Expert Parallel, delivering up to 40% higher throughput, up to 55% memory savings, and streamlined one‑click deployment for LLM, VLM, and video models.

AI, Framework, Parallelism
0 likes · 14 min read
Baidu Geek Talk
Apr 2, 2025 · Artificial Intelligence

DeepSeek-VL2 Multimodal Model: Architecture, Training, and Code Walkthrough

DeepSeek‑VL2 is a state‑of‑the‑art multimodal model built on a Mixture‑of‑Experts architecture that combines a SigLIP‑L vision encoder with dynamic tiling, a two‑layer VL adaptor, and a DeepSeek‑MoE language model using Multi‑head Latent Attention, trained in three stages on diverse visual‑language and text data, and achieving strong results on benchmarks such as DocVQA and TextVQA, with full implementation and inference code available in PaddleMIX.

DeepSeek-VL2, Inference, Mixture of Experts
0 likes · 36 min read
JD Cloud Developers
Mar 3, 2025 · Artificial Intelligence

How JD.com Leverages Domestic NPU Chips to Power Large‑Scale AI Models

This article details JD.com's challenges and solutions for deploying domestic NPU chips across heterogeneous GPU‑NPU clusters, covering architecture, scheduling, high‑performance training and inference engines, real‑world case studies, and future plans to scale AI workloads securely and efficiently.

AI, Domestic Chips, Inference
0 likes · 19 min read
Code Mala Tang
Mar 1, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate and How Can We Fix It?

This article explains why large language models produce plausible‑looking but false information, traces the problem to the supervised fine‑tuning stage, and outlines mitigation techniques such as knowledge interrogation, RLHF, and tool‑augmented search to reduce hallucinations.

LLM, RLHF, Training
0 likes · 12 min read
Architect
Feb 21, 2025 · Artificial Intelligence

DeepSeek Model Innovations: Architecture, Training Methods, and Performance Evaluation

This article reviews DeepSeek's recent breakthroughs, including the MLA attention redesign, GRPO alignment algorithm, MoE enhancements, multi‑stage training pipelines (SFT, RL, preference tuning, distillation), and comparative performance against GPT‑4o‑Mini and Llama 3.1, highlighting both strengths and remaining challenges.

DeepSeek, Mixture of Experts, Model Evaluation
0 likes · 16 min read
Practical DevOps Architecture
Feb 20, 2025 · Artificial Intelligence

Training MiniDeepSeek V3+R1 from Scratch: Full-Scale Large Model Technical Practice for 2025

This tutorial series provides a step‑by‑step technical guide to training, deploying, and fine‑tuning the MiniDeepSeek V3+R1 large language model, covering model performance, open‑source details, API usage, parameter explanation, multi‑turn chatbot construction, function calling, integration with Open WebUI, GraphRAG, Swarm, and various deployment and optimization techniques.

AI, MiniDeepSeek, Training
0 likes · 4 min read
Baobao Algorithm Notes
Oct 7, 2024 · Artificial Intelligence

Mastering LLM Supervised Fine‑Tuning: Practical Tips, Data Strategies, and Debugging

This article provides a comprehensive, experience‑driven guide to supervised fine‑tuning (SFT) of large language models, covering special tokens, latency considerations, data diversity and production, training frameworks and hyper‑parameters, over‑/under‑fitting diagnostics, and evaluation metrics such as helpfulness, honesty, and harmlessness.

AI, LLM, SFT
0 likes · 40 min read
JD Retail Technology
Aug 30, 2024 · Artificial Intelligence

GPU Optimization Practices for Training and Inference in JD Advertising Recommendation Systems

The article details JD Advertising's technical challenges and solutions for large‑scale sparse recommendation models, describing GPU‑focused storage, compute and I/O optimizations for both training and low‑latency inference, including distributed pipelines, heterogeneous deployment, batch aggregation, multi‑stream execution, and compiler extensions.

Distributed Systems, GPU Optimization, Inference
0 likes · 13 min read
Baobao Algorithm Notes
Aug 29, 2024 · Artificial Intelligence

Why RLHF Is Essential: The Limits of SFT and the Power of Reward Modeling

The article analyzes why Reinforcement Learning from Human Feedback (RLHF) cannot be replaced by Supervised Fine‑Tuning (SFT), highlighting SFT's lack of negative feedback, its one‑directional attention limitation, and how RLHF's reward models provide crucial safety and performance improvements for large language models.

AI Alignment, RLHF, SFT
0 likes · 9 min read
DataFunTalk
Jul 26, 2024 · Artificial Intelligence

Llama 3: Open‑source Large Language Model Technical Report and Evaluation

This comprehensive technical report details the development, architecture, training methodology, extensive benchmark evaluations, safety measures, and inference optimizations of Meta's open‑source Llama 3 large language model series, covering models up to 405 billion parameters and supporting multilingual, multimodal, and tool‑use capabilities.

AI, LLaMA, Training
0 likes · 115 min read
NewBeeNLP
Jul 24, 2024 · Industry Insights

From Black Iron to Silver: The Evolution of Large Model Infrastructure (2019‑2024)

The article traces the evolution of large‑model training and inference infrastructure from the early “black‑iron” era (2019‑2021) through the “golden” boom (2022‑2023) to the emerging “silver” phase (2024‑), highlighting key research breakthroughs, open‑source frameworks, hardware trends, market dynamics, and practical challenges for engineers entering the field.

AI Infrastructure, Inference, Large Model
0 likes · 22 min read
Architects' Tech Alliance
Jun 22, 2024 · Artificial Intelligence

Rising Compute Demand of Generative AI Models and GPU Accelerator Trends in 2024

The article analyzes how generative AI models from GPT‑1 to the upcoming GPT‑5 are driving exponential growth in compute requirements, prompting massive cloud capital expenditures and intense competition among GPU vendors such as NVIDIA, AMD, Google, and emerging domestic chip makers, while also highlighting interconnect innovations and cost‑effective solutions.

AI, Accelerators, Compute
0 likes · 12 min read
IT Services Circle
May 2, 2024 · Artificial Intelligence

LLM.c: A 1000‑Line C Implementation for Training GPT‑2

Andrej Karpathy’s LLM.c project demonstrates how a compact, pure‑C (and CUDA) codebase of roughly 1000 lines can train a GPT‑2 model, covering data preparation, memory management, layer implementations, compilation, and practical tips for running and testing the model on CPUs and GPUs.

AI, C, CUDA
0 likes · 10 min read
NewBeeNLP
Mar 8, 2024 · Industry Insights

Why Building LLMs Is Like Buying a Hardware Lottery – Lessons from a Startup

The article recounts Yi Tay’s experience founding Reka and building large language models from scratch, highlighting the unpredictable quality of GPU clusters, the challenges of multi‑cluster orchestration, code‑base choices, and how startups must rely on fast, intuition‑driven experimentation to succeed.

Cluster Management, GPU, Hardware
0 likes · 12 min read
DataFunSummit
Feb 11, 2024 · Artificial Intelligence

GPU-Accelerated Model Service and Optimization Practices at Xiaohongshu

This article details Xiaohongshu's end‑to‑end GPU‑based transformation of its recommendation and search models, covering background, model characteristics, training and inference frameworks, system‑level and GPU‑level optimizations, compilation tricks, hardware upgrades, and future directions for large‑scale machine‑learning infrastructure.

GPU, Model Serving, Training
0 likes · 18 min read
Baobao Algorithm Notes
Jan 2, 2024 · Artificial Intelligence

Uncovering Mixtral‑8x7B: How MoE Experts Shape Performance and Training

This article analyses the Mixtral‑8x7B Mixture‑of‑Experts LLM, explains its gate‑driven 8‑expert architecture, presents a simplified PyTorch implementation, and reports a series of experiments that probe top‑2 gating during training, individual expert contributions, task‑specific pre‑training, the impact of expert count, and similarity with Mistral‑7B, ultimately offering hypotheses about its training pipeline.

LLM, Mixtral, Mixture of Experts
0 likes · 14 min read
DataFunTalk
Dec 1, 2023 · Artificial Intelligence

GPU‑Driven Model Service and Optimization Practices in Xiaohongshu's Search Scenario

This article details Xiaohongshu's end‑to‑end GPU‑centric transformation for search‑related machine‑learning models, covering model characteristics, training and inference frameworks, system‑level GPU and CPU optimizations, multi‑card and compilation techniques, and future directions for scaling large sparse and dense models.

GPU Optimization, Inference, Model Serving
0 likes · 16 min read
Kujiale Project Management
Oct 31, 2023 · R&D Management

How a One‑Day Scrum Master Workshop Transforms Agile Performance

This article details a company’s year‑long evolution of a one‑day Scrum Master training program, covering its motivation, survey‑driven redesign, scenario creation, theory delivery techniques, domino‑based sandbox exercises, key agile practices like time‑boxing and retrospectives, and the resulting growth of a sizable certified Scrum Master talent pool.

Scrum Master, Scrum Workshop, Training
0 likes · 11 min read
DataFunTalk
Mar 31, 2023 · Artificial Intelligence

Estimating the Resource and Cost Requirements for Large Language Model Training and Inference

The article analyses the computational resources, hardware costs, and human investment needed to train and serve large language models such as GPT‑3, discusses practical cost calculations, highlights the challenges faced by Chinese AI teams, and argues for sustained, long‑term funding to achieve meaningful breakthroughs.

AI Infrastructure, China AI, Inference
0 likes · 14 min read
Software Development Quality
Aug 19, 2022 · Operations

Comprehensive Quality Management SLA Framework for IT Services

This document outlines a detailed Service Level Agreement (SLA) framework covering quality service standards, management processes, testing capabilities, tool support, resource management, measurement systems, risk handling, and continuous improvement to ensure consistent delivery and customer satisfaction across IT operations.

Operations, SLA, Training
0 likes · 17 min read
Code DAO
Jun 7, 2022 · Artificial Intelligence

How to Implement SRCNN for Image Super‑Resolution in PyTorch

This article walks through a complete PyTorch implementation of the SRCNN model for image super‑resolution, covering dataset preparation, patch extraction, model architecture, training on a GTX 770 GPU for 2500 epochs, PSNR evaluation, and visual comparisons with bicubic up‑sampling.

PSNR, Patchify, PyTorch
0 likes · 22 min read
58UXD
Apr 29, 2022 · Operations

How 58 Home Service Standardized Cleaning: From User Research to SOP Success

This article examines how 58 Home Service identified service gaps through user research, built a detailed user‑experience map, created a comprehensive SOP handbook covering image, etiquette, and behavior, and implemented training, assessment, and incentives to dramatically improve customer satisfaction and reduce complaints.

Operations, Training, User experience
0 likes · 9 min read
DataFunTalk
Nov 2, 2021 · Artificial Intelligence

Optimizing AI Platform Resource Efficiency: Scheduling Strategies for Deep Learning Inference and Training

The article outlines a technical exchange hosted by 58.com AI Lab and Tianjin University that discusses high‑efficiency AI computing, resource‑aware scheduling for both online inference and offline training, and methods to mitigate GPU under‑utilization and gray‑interference in distributed deep‑learning platforms.

AI, GPU utilization, Inference
0 likes · 4 min read
21CTO
May 31, 2021 · Operations

How One Engineer Turned Huawei’s MindSpore Community into a Viral Success

This article recounts how a former programmer became a deep‑learning evangelist, built Huawei’s MindSpore open‑source community, leveraged one‑minute videos, launched a hands‑on training camp and a lightweight tool, and fostered a thriving ecosystem that now serves over 190,000 developers.

AI Framework, MindSpore, Training
0 likes · 16 min read
Aikesheng Open Source Community
Jan 18, 2021 · Databases

How to Build a Professional DBA Operations Team: Infrastructure, Standards, Training, Knowledge Base, and Culture

The article explains how to construct an effective DBA operations team by focusing on reusable infrastructure, clear team standards, a structured training system, a comprehensive knowledge base, and a positive team atmosphere, providing practical tools and methods for each aspect.

DBA, Database operations, Infrastructure
0 likes · 4 min read
JD Tech Talk
Nov 16, 2020 · Artificial Intelligence

Practical Guide to Deploying Federated Learning: Architecture, Deployment, Training, and Inference

This article provides a comprehensive overview of federated learning engineering, covering deployment via Docker containers, the design of training and inference frameworks, key services such as communication, training, model management, and registration, and practical considerations for scaling and reliability in production environments.

AI, Deployment, Docker
0 likes · 11 min read
21CTO
Mar 26, 2020 · R&D Management

How to Build and Manage High‑Performing Tech Teams: Recruitment, Training, Culture

The article shares practical insights on managing technical teams in mature companies, covering recruitment strategies, effective interview evaluation, comprehensive training programs, and fostering a positive team culture through democratic decision‑making, balanced freedom, challenging work, innovation encouragement, and mutual support.

Culture, Leadership, Training
0 likes · 14 min read
Efficient Ops
Nov 7, 2016 · Operations

How to Train New SREs Effectively: Proven Practices and Playbooks

This article outlines a systematic approach to onboarding and training new Site Reliability Engineers, covering trust building, readiness assessment, diverse learning methods, structured curricula, on‑call milestones, project‑focused work, reverse‑engineering skills, statistical thinking, and improvisation techniques to develop high‑performing SRE teams.

On-Call, Operations, SRE
0 likes · 17 min read
Architects Research Society
Sep 19, 2016 · Information Security

Recommended Books, Training, and Conferences for Industrial Control Systems Cybersecurity

This guide curates essential books, professional training courses, and major conferences for industrial control systems cybersecurity, offering insights into historical context, technical security practices, and community engagement to help practitioners deepen their knowledge and connect with the field.

ICS security, Training, conferences
0 likes · 10 min read
Qunar Tech Salon
Aug 19, 2016 · Artificial Intelligence

Deep Learning Anti‑Scam Guide: A Non‑Technical Overview of Neural Networks, Training, and Practical Tips

This article provides a humorous yet informative, non‑mathematical guide to deep learning, covering neural network basics, layer addition, training methods, back‑propagation, unsupervised pre‑training, regularization, ResNet shortcuts, GPU computation, framework choices, and practical advice for applying deep learning to industrial data.

AI, Deep Learning, GPU
0 likes · 26 min read
Tongcheng Travel Technology Center
Nov 13, 2015 · Fundamentals

Internal Training Courses: XML Serialization, .NET Parallel Programming Basics, and SQL Query Fundamentals

The announcement details three internal technical training sessions covering XML serialization fundamentals, .NET parallel programming basics, and essential SQL query concepts and optimization, including instructors, topics, locations, target audiences, and scheduled times for each course.

Parallel Programming, SQL, Training
0 likes · 2 min read
Huawei Cloud Developer Alliance
Sep 2, 2015 · R&D Management

Senior Manager Liang Yonggang Shares eSDK Training Insights and Project Success Tips

In this interview, senior manager Liang Yonggang of Dongfang Hongtai discusses his company's background, recent eSDK training on UC/IVS/TP, the benefits and shortcomings of the program, suggestions for the eSDK WeChat channel, and valuable software development and project management lessons he has gathered over a decade in the industry.

IVS, Training, UC
0 likes · 7 min read