Tagged articles
7 articles
Python Programming Learning Circle
Sep 16, 2025 · Artificial Intelligence

Boost Your Python Coding with DeepSeek‑V3 in PyCharm: A Step‑by‑Step Guide

This tutorial walks you through integrating the 671‑billion‑parameter DeepSeek‑V3 model into PyCharm via the Continue plugin, covering API key creation, plugin installation, configuration of model parameters, and practical code‑explanation and modification demos to enhance your Python development workflow.

AI code assistance · Continue plugin · DeepSeek-V3
5 min read
IT Services Circle
Jul 22, 2025 · Artificial Intelligence

Why Kimi K2 Overtook DeepSeek to Become the Top Open‑Source AI Model

Kimi K2 has surged to the #1 spot among open‑source models worldwide and ranks fifth overall, rivaling top closed‑source models, thanks to strong multi‑turn dialogue, programming, and complex‑prompt abilities, extensive community adoption, and an architecture refined from DeepSeek‑V3.

AI Performance · DeepSeek-V3 · Kimi K2
8 min read
Architect
Feb 16, 2025 · Artificial Intelligence

DeepSeek-V3, DeepSeek-R1, and Janus‑Pro: Architecture, Training Techniques, and Performance Insights

This article provides an in‑depth technical overview of the DeepSeek‑V3, DeepSeek‑R1, and Janus‑Pro models, covering their Mixture‑of‑Experts architecture, novel MLA attention, auxiliary‑loss‑free load balancing, multi‑token prediction, FP8 mixed‑precision training, efficient cross‑node communication, reinforcement‑learning pipelines, multimodal modeling strategies, performance comparisons, cost statistics, and current limitations.

AI Architecture · DeepSeek-V3 · FP8 training
18 min read
Architects' Tech Alliance
Feb 12, 2025 · Artificial Intelligence

DeepSeek‑V3 Training Efficiency, Knowledge Distillation, and the Risks of Synthetic Data

The article examines DeepSeek‑V3’s low‑cost training using 2048 H800 GPUs, explains how knowledge distillation and high‑quality data improve efficiency, discusses expert concerns about training on AI‑generated content, and outlines the limitations and ceiling effect of distillation techniques.

AI Safety · AI Training Efficiency · DeepSeek-V3
7 min read
Alibaba Cloud Big Data AI Platform
Jan 10, 2025 · Artificial Intelligence

Deploy DeepSeek‑V3 LLM on Alibaba Cloud with One‑Click Model Gallery

This article introduces the 671‑billion‑parameter DeepSeek‑V3 Mixture‑of‑Experts LLM, explains the PAI‑Model Gallery platform that aggregates top AI models, and provides a step‑by‑step guide to deploy DeepSeek‑V3 on Alibaba Cloud’s PAI‑EAS service with zero‑code configuration.

AI deployment · Alibaba Cloud · DeepSeek-V3
5 min read
Baobao Algorithm Notes
Jan 3, 2025 · Artificial Intelligence

How DeepSeek-V3 Achieves Massive Scale with FP8, MoE, and System Optimizations

The article examines DeepSeek‑V3’s architecture and training pipeline, highlighting its use of MLA and a highly granular MoE design, pioneering FP8 mixed‑precision training, fine‑grained per‑tile quantization, advanced parallelism strategies, and inference optimizations such as prefill‑decode (PD) separation and NanoFlow to achieve unprecedented efficiency on limited GPU resources.

DeepSeek-V3 · FP8 · Inference Optimization
10 min read