Tagged articles

DeepSeek

623 articles · Page 4 of 7
AI Frontier Lectures
Mar 10, 2025 · Industry Insights

Why DeepSeek’s Rise Is Shaking China’s AGI Landscape

The article analyzes how DeepSeek’s unexpected success has triggered a strategic rethink across Chinese AI firms, prompting shifts from product‑centric growth to foundational model research, reshaping talent structures at Tencent and ByteDance, and questioning where the true barriers to AGI lie.

AGI · China AI · DeepSeek
0 likes · 13 min read
Alibaba Cloud Developer
Mar 10, 2025 · Artificial Intelligence

Seamlessly Switch Between DeepSeek‑R1 and QwQ‑32B with Higress AI Gateway

Learn how to deploy the new QwQ‑32B reasoning model alongside DeepSeek‑R1 using the Higress AI gateway, covering environment setup, model configuration, routing, token‑level rate limiting, content safety, semantic caching, and advanced features like automatic fallback and internet‑search integration.

DeepSeek · Higress · LLM integration
0 likes · 16 min read
Baobao Algorithm Notes
Mar 10, 2025 · Artificial Intelligence

Why DeepSeek V3’s FP8 Training Beats Traditional Schemes: A Deep Dive

This article provides a detailed technical analysis of FP8 training, comparing Nvidia’s TransformerEngine approach with DeepSeek V3’s novel scheme, and examines how block‑wise scaling, high‑precision accumulation, and vector length and correlation affect quantization error and signal‑to‑noise ratio in large‑language‑model training.

DeepSeek · FP8 · LLM
0 likes · 20 min read
CSS Magic
Mar 10, 2025 · Artificial Intelligence

Three Advanced Ways to Harness DeepSeek for Everyone

The article outlines three practical approaches to get the most out of DeepSeek—using it as a conversational assistant, integrating its API to power AI tools such as the Chrome immersive‑translation plugin, and leveraging it for AI‑assisted programming—while comparing the V3 and R1 models and offering concrete configuration steps.

AI programming · AI translation · API integration
0 likes · 8 min read
Java Architect Essentials
Mar 9, 2025 · Backend Development

Building an AI-Powered Chatbot with Spring Boot and DeepSeek

This tutorial demonstrates how to create an AI-driven Spring Boot application by integrating DeepSeek's large language model, covering project setup, dependency configuration, API key management, and implementing a REST controller that provides weather forecasts via a conversational interface.

AI · Chatbot · DeepSeek
0 likes · 8 min read
Data Thinking Notes
Mar 9, 2025 · Artificial Intelligence

How DeepSeek R1 Uses Large‑Scale Reinforcement Learning to Rival OpenAI o1

DeepSeek R1, an open‑source large language model, leverages rule‑based, large‑scale reinforcement learning and mixed supervised‑fine‑tuning data to achieve deep reasoning comparable to OpenAI o1, illustrating China’s rapid AI progress, the importance of efficiency, and the democratizing impact of open AI research.

DeepSeek · Model Efficiency · Open-source AI
0 likes · 11 min read
Architects' Tech Alliance
Mar 9, 2025 · Industry Insights

DeepSeek’s AI Ecosystem: From Core Tech to Market Impact

This article provides a comprehensive analysis of DeepSeek, covering its foundational AI research, technology stack, product offerings, and the broader upstream, midstream, and downstream AI industry landscape, including hardware, server, cloud, and market trends.

AI Infrastructure · DeepSeek · Market Trends
0 likes · 13 min read
DataFunTalk
Mar 8, 2025 · Artificial Intelligence

DeepSeek Reflection Wave and the Shifting Landscape of AGI Development in China

The article analyzes how DeepSeek's rapid rise has triggered a strategic rethink across Chinese AI startups and tech giants, prompting a shift from product‑centric growth to deep‑model research, while examining the real barriers to AGI and the importance of time‑advantage in the large‑model race.

AGI · AI · Chinese tech
0 likes · 12 min read
Fun with Large Models
Mar 8, 2025 · Artificial Intelligence

Make AI Obey: A Detailed Prompt Engineering Guide to Boost Large‑Model Logic

This tutorial explains how to enhance large language models' logical reasoning by using DeepSeek‑R1's deep‑thinking mode, few‑shot prompting, chain‑of‑thought, and zero‑shot chain‑of‑thought techniques, providing concrete examples, comparisons, and a step‑by‑step template for effective prompt design.

AI reasoning · Chain-of-Thought · DeepSeek
0 likes · 10 min read
Java Architect Essentials
Mar 7, 2025 · Artificial Intelligence

Introducing DeepSeek4j 1.4: A Java Spring Boot Integration for DeepSeek AI with Chain‑of‑Thought and Streaming Support

The article introduces DeepSeek4j 1.4, a Java Spring Boot library that overcomes existing framework limitations by preserving DeepSeek's chain‑of‑thought capabilities, adding full reactive streaming, and providing a simple one‑line API along with quick‑start instructions and code examples.

AI integration · Chain-of-Thought · DeepSeek
0 likes · 5 min read
Architects' Tech Alliance
Mar 7, 2025 · Industry Insights

How DeepSeek’s V3 and R1 Are Redefining the Global AI Landscape

The 2025 DeepSeek analysis report examines the V3 and R1 models' novel Transformer‑based technologies, their performance gains, and how they are reshaping global AI competition, boosting domestic AI valuations, and ushering in an open‑source AI breakthrough that could spark the next killer applications.

AI models · DeepSeek · Open-source AI
0 likes · 5 min read
DevOps
Mar 6, 2025 · Artificial Intelligence

Building Multi-Model Chat Agents with Dify: Integrating DeepSeek‑R1 and Gemini

This article explains how to create a high‑performance multi‑model chat agent on the Dify platform by combining DeepSeek‑R1 for reasoning and Gemini for answer generation, covering the underlying principles, configuration steps, API integration, performance benchmarks, and practical deployment guidance.

API integration · Chatbot · DeepSeek
0 likes · 12 min read
Data Thinking Notes
Mar 6, 2025 · Artificial Intelligence

How China’s State‑Owned Giants Are Accelerating AI with DeepSeek

Amid a global digital surge, 45% of China’s central state‑owned enterprises have deployed the DeepSeek large‑model platform, rapidly integrating AI across energy, power, telecom, construction and other sectors to boost intelligent transformation and operational efficiency.

AI adoption · China · DeepSeek
0 likes · 7 min read
Model Perspective
Mar 6, 2025 · Artificial Intelligence

Can AI Boost High School Math Problem Solving? A DeepSeek Case Study

This article explores how the AI model DeepSeek can assist high‑school students in tackling challenging sequence problems from the 2024 Chinese college entrance exam, detailing its reasoning process, strengths, pitfalls, and practical tips for using AI to train mathematical thinking rather than just obtain answers.

AI · DeepSeek · high school
0 likes · 9 min read
Fun with Large Models
Mar 6, 2025 · Artificial Intelligence

Master Prompt Engineering: Make AI Follow Your Commands with Simple, Effective Prompts

Prompt engineering transforms vague queries into precise, reliable AI responses by structuring prompts with clear instructions, context, input, and output specifications, and by using role‑playing and formatting tricks, enabling models like DeepSeek and OpenAI to deliver accurate, consistent results across tasks.

AI Prompt Design · DeepSeek · OpenAI
0 likes · 15 min read
Architects' Tech Alliance
Mar 5, 2025 · Industry Insights

How DeepSeek’s Open‑Source Tools Are Supercharging AI Model Performance

DeepSeek’s Open‑Source Week unveiled five high‑performance projects—FlashMLA, DeepEP, DeepGEMM, DualPipe/EPLB, and 3FS—each delivering novel GPU optimizations, communication kernels, matrix‑multiplication libraries, parallelism strategies, and a distributed file system that together dramatically accelerate large‑scale AI training and inference workloads.

AI acceleration · DeepSeek · GPU Optimization
0 likes · 9 min read
Java Architect Essentials
Mar 5, 2025 · Artificial Intelligence

Step-by-Step Guide to Integrate DeepSeek AI with a WeChat Public Account Using a Cloud Server

This tutorial walks beginners through obtaining a DeepSeek API key, setting up an Alibaba Cloud ECS instance, configuring the WeChat public‑account interface, cloning and configuring the open‑source COW project, and finally deploying a Python service that connects the WeChat bot to the DeepSeek large‑language model.

DeepSeek · Python Tutorial · WeChat
0 likes · 13 min read
Architects' Tech Alliance
Mar 5, 2025 · Industry Insights

DeepSeek R1 & Kimi 1.5: Inside the Development of Near‑Strong Reasoning Models

The article analyzes DeepSeek's recent releases, the V3 dialogue model and the R1 reasoning model, detailing their launch dates, their rapid surge in popularity, and R1's reinforcement‑learning‑based design for code and math tasks, and links to related Peking University technical reports.

AI · DeepSeek · Industry Analysis
0 likes · 3 min read
Model Perspective
Mar 5, 2025 · Artificial Intelligence

Can AI Really Crack NP‑Hard Problems? Inside the DeepSeek‑R1 Breakthrough

Researchers from Nanjing University of Aeronautics, Nanjing University of Technology and Oxford show that high‑instruction prompts dramatically boost large language models' mathematical reasoning, enabling DeepSeek‑R1 and Qwen2.5 to solve complex polynomial tasks and even produce a new counterexample to Hilbert's 17th problem.

AI · DeepSeek · NP-hard
0 likes · 6 min read
Tencent Cloud Developer
Mar 5, 2025 · Artificial Intelligence

DeepSeek Series Overview: Core Technologies, Model Innovations, and Product Highlights

The article delivers a PPT‑style deep dive into the DeepSeek series—from the original LLM through DeepSeek‑MoE, Math, V2, V3 and R1—highlighting core innovations such as Multi‑Head Latent Attention, fine‑grained MoE, GRPO reinforcement learning, Multi‑Token Prediction, DualPipe parallelism and FP8 training that together achieve high performance at a fraction of traditional costs, and notes their integration into Tencent’s OlaChat intelligent assistant.

AI · DeepSeek · FP8 training
0 likes · 21 min read
Open Source Linux
Mar 5, 2025 · Artificial Intelligence

How DeepSeek‑R1 Redefines Prompt Engineering and Real‑World AI Deployment

The article analyzes DeepSeek‑R1’s low‑cost inference architecture, Chinese language optimizations, novel prompt‑engineering techniques, and the practical challenges of deploying large domestic models, offering insights into vertical AI applications and the evolving open‑source ecosystem in China.

AI Deployment · DeepSeek · Large Language Model
0 likes · 8 min read
Data Thinking Notes
Mar 4, 2025 · Artificial Intelligence

Unlock AI-Powered Research: The DeepSeek‑R1 & DeepResearch Guide

Compiled by Tsinghua University experts, this guide systematically analyzes the DeepSeek‑R1 reasoning model and DeepResearch platform, offering multi‑model comparisons, real‑world case studies, and end‑to‑end AI‑driven solutions from data collection to report generation for researchers.

AI research · Data Automation · DeepSeek
0 likes · 6 min read
Big Data Tech Team
Mar 4, 2025 · Industry Insights

100 Real-World DeepSeek Scenarios: How AI Is Reshaping Industries

The article analyzes DeepSeek's open‑source model launch, its rapid user growth, and presents a comprehensive list of 100 practical AI use cases across sectors—grouped by frequency and adoption stage—to illustrate the model's market impact and future potential.

AI Applications · DeepSeek · Market Analysis
0 likes · 16 min read
Smart Era Software Development
Mar 4, 2025 · Artificial Intelligence

How DeepSeek‑R1 Is Redefining AI Applications and the AIGC Landscape

The article analyses DeepSeek‑R1’s low‑cost open‑source strategy, superior inference performance (including GPQA benchmark gains over GPT‑4o), its focus on complex reasoning, math and programming, and how these traits reshape AIGC across industries while highlighting remaining privacy and ethical challenges.

AI Applications · AIGC · DeepSeek
0 likes · 6 min read
JD Tech Talk
Mar 4, 2025 · Artificial Intelligence

Building a Local Personal Knowledge Base with Ollama, DeepSeek‑R1, AnythingLLM and Integrating Continue into VSCode

This guide walks through setting up a local personal knowledge base using Ollama, DeepSeek‑R1, and AnythingLLM, and demonstrates how to integrate the Continue AI code assistant into VSCode, covering installation, configuration, and usage tips for efficient, secure development.

AI integration · AnythingLLM · DeepSeek
0 likes · 10 min read
Java Web Project
Mar 4, 2025 · Artificial Intelligence

How to Seamlessly Integrate DeepSeek AI into IntelliJ IDEA for Java Development

This step‑by‑step guide shows Java developers how to prepare their environment, install the CodeGPT plugin, configure DeepSeek with an API key and model settings, and then use the assistant for code generation, completion, explanation, question answering, and usage monitoring within IntelliJ IDEA.

AI code assistant · CodeGPT · DeepSeek
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Mar 4, 2025 · Artificial Intelligence

Deploy a High‑Performance RAG Service with Hologres, DeepSeek, and PAI‑EAS

This guide walks you through building a Retrieval‑Augmented Generation (RAG) system by integrating Alibaba Cloud's Hologres vector store, the Proxima high‑performance vector engine, and DeepSeek large language models via PAI‑EAS, covering prerequisites, deployment steps, configuration, and inference verification.

AI Deployment · DeepSeek · Hologres
0 likes · 12 min read
Architect
Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use‑cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods—including inference‑time scaling, pure RL, SFT + RL, and distillation—and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling
0 likes · 27 min read
AI Algorithm Path
Mar 3, 2025 · Artificial Intelligence

DeepSeek‑R1 Model Performance: Comparing 32B, 70B, and R1

This article evaluates DeepSeek‑R1’s 32B and 70B distilled models alongside the original R1 on a range of reasoning and coding tasks, detailing hardware setup, test methodology, per‑task results, and a comparative analysis of their strengths and weaknesses.

32B · 70B · DeepSeek
0 likes · 6 min read
DataFunSummit
Mar 3, 2025 · Artificial Intelligence

DeepSeek Open Source Week: Seven Core Technologies Reshaping Large‑Model Training

The DeepSeek open‑source week introduced seven breakthrough technologies—FlashMLA, DeepGEMM, DeepEP, DualPipe, EPLB, 3FS, and Smallpond—that together overhaul data flow, algorithmic complexity, hardware utilization, MoE communication, and resource balancing, dramatically improving large‑model training efficiency and lowering entry barriers for the AI industry.

AI hardware · DeepSeek · data pipelines
0 likes · 17 min read
Alibaba Cloud Developer
Mar 3, 2025 · Mobile Development

Build a WeChat Mini‑Program Without Writing Code Using AI

This article demonstrates how a non‑programmer can use the DeepSeek‑powered “AI Programmer” mode in Tongyi Lingma to generate, modify, and deploy a functional WeChat mini‑program entirely through natural language prompts, complete with screenshots of each step.

AI programming · DeepSeek · Mobile Development
0 likes · 5 min read
macrozheng
Mar 3, 2025 · Artificial Intelligence

Integrate DeepSeek with Spring AI: Step‑by‑Step Spring Boot Guide

This tutorial walks you through integrating DeepSeek via Spring AI into a Spring Boot project, covering Spring AI basics, obtaining an API key, adding dependencies and configuration, implementing controller endpoints, testing with Postman, and accessing the full source code.

AI integration · Chatbot · DeepSeek
0 likes · 7 min read
AI Large Model Application Practice
Mar 3, 2025 · Artificial Intelligence

Can DeepSeek‑R1 Unlock True “Deep Thinking” for Enterprise RAG?

This article examines how swapping in DeepSeek‑R1 enhances Retrieval‑Augmented Generation with deeper reasoning, outlines its benefits and pitfalls—including slower inference, higher compute costs, and hallucinations—provides a simple hallucination test, and proposes an Agentic RAG research assistant to balance accuracy and creativity.

AI reasoning · DeepSeek · LLM
0 likes · 10 min read
Architects' Tech Alliance
Mar 1, 2025 · Artificial Intelligence

Decoding DeepSeek: A Four‑Tier Capability Framework for Multimodal AI

The article outlines DeepSeek's four‑level capability hierarchy—basic multimodal data fusion and dynamic governance, intermediate domain modeling with causal reasoning and multi‑objective optimization, advanced complex system modeling with digital twins and multi‑agent coordination, and ultimate autonomous evolution features such as concept‑space exploration and self‑programming.

DeepSeek · Digital Twin · Model Capability
0 likes · 5 min read
ITPUB
Mar 1, 2025 · Artificial Intelligence

Can DeepSeek AI Replace Your DBA? Real-World Database Scenarios Tested

This article examines DeepSeek, a Chinese AGI‑focused AI model, explains prompt‑engineering techniques, and evaluates its performance across database architecture, development, and operations tasks through concrete Q&A examples, SQL plan analysis, and shell‑script generation, while also discussing its broader impact on professionals, vendors and enterprises.

AI · DeepSeek · Prompt engineering
0 likes · 10 min read
Architects' Tech Alliance
Feb 28, 2025 · Artificial Intelligence

DeepSeek V3 & R1: How Their Training Costs Compare to Llama 3.1

The article analyzes DeepSeek’s latest V3 conversational model and R1 reasoning model, detailing their MoE architecture and V3’s training on H800 GPUs at a reported cost of about $5.58 million, comparing compute expenses with Meta’s Llama 3.1, and showing that their API pricing is roughly one‑tenth of GPT‑4o for dialogue and one‑twentieth of OpenAI o1 for reasoning.

AI model analysis · DeepSeek · Large Language Model
0 likes · 4 min read
Alibaba Cloud Developer
Feb 28, 2025 · Artificial Intelligence

How DeepSeek’s RL‑Powered Time Scaling Is Redefining AI Model Training

DeepSeek’s rapid rise is examined through its RL‑based Time Scaling paradigm, cost‑effective architecture, innovative training pipeline, open‑source strategy, and security challenges, highlighting how these breakthroughs disrupt traditional AI model development, lower resource demands, and influence industry dynamics.

AI model training · DeepSeek · Open-source AI
0 likes · 13 min read
Alibaba Cloud Native
Feb 27, 2025 · Cloud Native

Build AI-Powered Code Review with Alibaba Cloud Flow and DeepSeek

This guide walks you through creating a custom Cloud Native Flow step that calls DeepSeek to automatically review code in Alibaba Cloud Codeup, covering token creation, API key setup, step publishing, pipeline configuration, and viewing AI‑generated review comments.

Alibaba Cloud · DeepSeek · Flow-CLI
0 likes · 7 min read
IT Services Circle
Feb 27, 2025 · Artificial Intelligence

DeepSeek Announces FlashMLA: An Efficient Multi‑Head Latent Attention Decoding Kernel for Hopper GPUs

DeepSeek’s OpenSourceWeek introduced FlashMLA, a GPU‑optimized MLA decoding kernel for Hopper GPUs that leverages FlashAttention and CUTLASS to dramatically improve large‑model inference performance, with early adoption showing up to 30% higher compute utilization and doubled speed in some scenarios.

DeepSeek · FlashMLA · GPU
0 likes · 3 min read
JavaEdge
Feb 27, 2025 · Artificial Intelligence

How to Quickly Build a DeepSeek‑Powered Knowledge Base on Tencent Cloud

This guide walks through deploying the full‑feature DeepSeek V3+R1 model on Tencent Cloud, configuring a smart knowledge‑base application, importing documentation, enabling internet search, tuning retrieval parameters, and publishing the app for public use, all without writing code.

AI · DeepSeek · Knowledge Base
0 likes · 6 min read
Model Perspective
Feb 27, 2025 · Artificial Intelligence

Why AI Model Cost Cuts Trigger a New Wave of Nvidia Demand

The article explains how DeepSeek’s low‑cost large‑language‑model training reduces GPU price pressure, yet paradoxically fuels greater demand for Nvidia hardware by lowering entry barriers, illustrating the modern Jevons paradox and its broader economic and societal implications.

AI hardware · DeepSeek · GPU demand
0 likes · 8 min read
NewBeeNLP
Feb 27, 2025 · Industry Insights

How DeepSeek’s Open‑Source Tools Exploit China‑Specific H800 GPUs to Boost AI Performance

The article analyzes DeepSeek’s three open‑source projects—FlashMLA, DeepEP, and DeepGEMM—showing how they optimize for the China‑only NVIDIA H800 GPU, contrast this with the abundant hardware resources of Western AI firms, and highlight the growing demand for talent that masters both AI models and GPU hardware.

AI hardware · DeepEP · DeepGEMM
0 likes · 7 min read
Tencent Cloud Developer
Feb 27, 2025 · Artificial Intelligence

DeepSeek LLM Series (V1‑V3, R1) Technical Overview and Analysis

The DeepSeek technical overview details the evolution from the dense 67B V1 model through the 236B MoE‑based V2 and the 671B V3 with FP8 training, to the RL‑only R1 series that learns reasoning without supervision, highlighting innovations such as Grouped‑Query Attention, Multi‑Head Latent Attention, auxiliary‑loss‑free load balancing for MoE, Multi‑Token Prediction, and knowledge distillation, and reporting state‑of‑the‑art benchmark results and open‑source reproduction projects.

AI research · DeepSeek · Mixture of Experts
0 likes · 37 min read
Architects' Tech Alliance
Feb 27, 2025 · Artificial Intelligence

How Inspur Metabrain R1 Server Enables 1000+ Concurrent Users for DeepSeek 671B via SGLang Optimization

The Inspur Metabrain R1 inference server, equipped with FP8 acceleration and a 1128 GB HBM3e memory pool, has been tightly integrated with SGLang 0.4.3 to run the 671‑billion‑parameter DeepSeek R1 model, delivering over 1,000 concurrent user sessions and up to 3,976 tokens/s throughput.

AI server · DeepSeek · Inference Optimization
0 likes · 5 min read
IT Architects Alliance
Feb 26, 2025 · Artificial Intelligence

DeepSeek Large Model: Core Architecture, Key Technologies, and Training Strategies

The article provides an in‑depth overview of DeepSeek’s large language model, detailing its mixture‑of‑experts and Transformer foundations, novel attention mechanisms, load‑balancing, multi‑token prediction, FP8 mixed‑precision training, and various training regimes such as knowledge distillation and reinforcement learning.

DeepSeek · FP8 · Large Language Model
0 likes · 18 min read
Tencent Technical Engineering
Feb 26, 2025 · Artificial Intelligence

Engineers' Perspectives on DeepSeek: Technical Innovations and Implications

Thirteen engineers praise DeepSeek’s open‑source, reinforcement‑learning‑driven architecture—using FP8 storage and SFT‑free training—to deliver GPT‑4‑level reasoning at one‑twentieth the cost, enabling single‑GPU deployment, lowering barriers for academia and startups, and prompting notable market reactions that could democratize advanced AI.

AI cost reduction · DeepSeek · FP8
0 likes · 9 min read
58UXD
Feb 26, 2025 · Artificial Intelligence

How AI Tools Like DeepSeek Transform Design Workflow

This article shows designers how to combine AI services such as DeepSeek, JiMeng, Tripo, Tongyi and Jianying to accelerate 3D modeling, PPT creation and short‑video production, turning lengthy manual tasks into fast, creative processes.

3D modeling · AI · DeepSeek
0 likes · 5 min read
Architecture Digest
Feb 26, 2025 · Artificial Intelligence

DeepSeek4j 1.4: A Java Integration Framework for DeepSeek AI Models

DeepSeek4j 1.4 introduces a Java‑native framework that fully preserves DeepSeek's chain‑of‑thought and billing features, adds reactive streaming support, and provides a Spring Boot starter for effortless integration, accompanied by quick‑start code, configuration examples, and a built‑in debugging UI.

AI · API · DeepSeek
0 likes · 5 min read
Architect
Feb 25, 2025 · Artificial Intelligence

DeepSeek R1: Multi‑Stage Reinforcement Learning, Reward Modeling, and Distillation for a High‑Performance LLM

DeepSeek R1 builds on the DeepSeek V3 base model using a multi‑stage reinforcement learning pipeline—including GRPO optimization, rule‑based reward modeling, supervised fine‑tuning, language‑consistency rewards, rejection sampling, and distillation—to produce a high‑performing, aligned LLM capable of accurate reasoning.

DeepSeek · LLM training · Reward Modeling
0 likes · 24 min read
Architects' Tech Alliance
Feb 25, 2025 · Artificial Intelligence

What Makes DeepSeek‑R1 a Game‑Changer in AIGC? Insights from Peking University

This article summarizes a Peking University lecture on DeepSeek‑R1, detailing its core concepts, advantages, and historical significance, then explains the underlying mechanisms of large‑model AI and AIGC tools, and finally offers practical guidance for selecting and efficiently applying AI solutions.

AI model analysis · AIGC · DeepSeek
0 likes · 5 min read
Alibaba Cloud Big Data AI Platform
Feb 25, 2025 · Artificial Intelligence

Accelerate DeepSeek‑V2‑Lite Deployment with FlashMLA: A Step‑by‑Step Guide

This tutorial walks users through installing FlashMLA, integrating it with the vLLM framework, downloading the DeepSeek‑V2‑Lite‑Chat model, benchmarking various MLA implementations, and running a local inference demo that shows FlashMLA’s speed advantage on long‑sequence generation.

DeepSeek · FlashMLA · InferenceOptimization
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Feb 25, 2025 · Artificial Intelligence

Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio

This step‑by‑step guide shows how to assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, the DeepSeek large language model, and PAI LangStudio, covering instance creation, data upload, model deployment, connection setup, flow design, and service invocation.

AI Tutorial · DeepSeek · LLM
0 likes · 9 min read
Architecture Digest
Feb 25, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Overview, Innovations, Architecture, Training, Performance, and Challenges

DeepSeek’s distillation technology combines data and model distillation to transfer knowledge from large teacher models to compact student models, detailing its definitions, principles, key innovations, architecture, training methods, performance gains, and challenges, especially in multimodal contexts.

AI research · DeepSeek · knowledge distillation
0 likes · 16 min read
Java Web Project
Feb 25, 2025 · Artificial Intelligence

How DeepSeek4j 1.4 Solves Spring AI’s Chain‑of‑Thought and Streaming Gaps

The article explains why existing Java AI frameworks struggle with DeepSeek R1’s chain‑of‑thought and streaming features, introduces DeepSeek4j 1.4 as a targeted solution, details its core capabilities, and provides a step‑by‑step guide to integrate it with Spring Boot and Project Reactor.

AI integration · Chain-of-Thought · DeepSeek
0 likes · 5 min read
Efficient Ops
Feb 25, 2025 · Artificial Intelligence

How to Deploy DeepSeek R1 Locally: A Step‑by‑Step Guide for AI Enthusiasts

This guide explains what DeepSeek R1 is, compares its full and distilled versions, details hardware requirements for Linux, Windows, and macOS, and provides step‑by‑step instructions for local deployment using Ollama, LM Studio, Docker, and visual interfaces like Open‑WebUI and Dify.

AI model · DeepSeek · Dify
0 likes · 9 min read
Tencent Cloud Developer
Feb 25, 2025 · Artificial Intelligence

Deploy DeepSeek AI: Cloud, Local, API – Full Step‑by‑Step Guide

This guide walks developers through the full lifecycle of using DeepSeek—choosing the right deployment method (API, local machine, or private cloud), selecting model sizes based on hardware, configuring Tencent Cloud services, building AI applications, and integrating the model into development tools and mini‑programs.

AI application development · AI model deployment · API integration
0 likes · 12 min read
CSS Magic
Feb 25, 2025 · Artificial Intelligence

Two Simple Ways to Access DeepSeek API for Free

This guide shows how to obtain free DeepSeek API access through GitHub Models and SiliconFlow. It details the required API base URL, key, and model name; explains how to register, create keys, and verify usage with a web chat tool; and compares model choices and platform limits.

API · DeepSeek · Free access
0 likes · 7 min read
DevOps
Feb 24, 2025 · Artificial Intelligence

AI‑Powered Full‑Stack Development with DeepSeek and ClinePRO: A 12× Efficiency Case Study

During the Chinese New Year break, the author used DeepSeek and AISE ClinePRO to build a complete full‑stack product in only 20 hours, demonstrating a twelve‑fold productivity boost over traditional development while showcasing AI‑driven code generation, multilingual support, automated documentation, and DevOps integration.

AI coding · ClinePRO · DeepSeek
0 likes · 17 min read
AI Algorithm Path
Feb 24, 2025 · Artificial Intelligence

Flash-MLA: Boosting LLM Inference Speed on Nvidia Hopper GPUs

Flash-MLA is an open‑source decoding kernel for Multi‑head Latent Attention (MLA), optimized for Nvidia Hopper GPUs. MLA compresses the KV cache, which the article says cuts memory usage by up to 93.3%, and the kernel reaches up to 580 TFLOPS of compute, accelerating large‑language‑model inference while lowering cost.

DeepSeek · Flash-MLA · GPU Optimization
0 likes · 8 min read
21CTO
Feb 24, 2025 · Artificial Intelligence

From Transformers to DeepSeek-R1: Evolution of Large Language Models

This article chronicles the rapid development of large language models since the Transformer architecture was introduced in 2017, covering BERT, the GPT series, multimodal systems, and the cost‑effective DeepSeek‑R1. It highlights key innovations, scaling trends, alignment techniques, and their impact across AI research and industry.

AI evolution · DeepSeek · LLM History
0 likes · 23 min read
Architects' Tech Alliance
Feb 24, 2025 · Artificial Intelligence

NSA: Hardware‑Optimized Sparse Attention Mechanism from DeepSeek, Peking University and University of Washington

NSA (Native Sparse Attention) introduces a three‑branch, hardware‑optimized sparse attention architecture (token compression, token selection, and a sliding window) combined with learnable gating to balance global and local context, improving inference speed and efficiency for long‑context large language models.

AI Architecture · DeepSeek · Sparse attention
0 likes · 5 min read
Huawei Cloud Developer Alliance
Feb 24, 2025 · Artificial Intelligence

Generate Game Code Instantly with DeepSeek V3 on Huawei Cloud

This tutorial walks you through configuring a Huawei Cloud host, installing the AutoGen framework, setting up DeepSeek V3 model API keys, and using the model to automatically generate Python code for a graphical two‑player battle game, complete with step‑by‑step instructions and sample commands.

AI code generation · AutoGen · DeepSeek
0 likes · 9 min read
AI Large Model Application Practice
Feb 24, 2025 · Artificial Intelligence

How Web Agents Combine LLMs and Browser Automation to Perform Real‑World Tasks

This article explains what Web Agents are and how their ReAct‑style reasoning loop works, surveys key implementation technologies such as observation parsing, multimodal models, and browser control tools like Selenium and Playwright, and demonstrates building a DeepSeek‑powered Web Agent with the Browser‑use framework, including code samples and performance insights.

DeepSeek · LLM · Playwright
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Feb 24, 2025 · Artificial Intelligence

How to Distill and Fine‑Tune DeepSeek R1 with Qwen on Alibaba Cloud PAI

This guide walks you through the complete workflow of preparing instruction data, deploying the DeepSeek‑R1 teacher model, using Alibaba Cloud PAI to generate teacher responses, distilling a smaller Qwen2.5‑7B‑Instruct student model, fine‑tuning it, and deploying the final service, with performance comparisons on several math‑reasoning benchmarks.

Alibaba Cloud PAI · DeepSeek
0 likes · 17 min read
Java Web Project
Feb 23, 2025 · Artificial Intelligence

Build Your First AI Chatbot with Spring Boot and DeepSeek LLM

This guide walks you through creating a Spring Boot project, configuring DeepSeek's large language model via SiliconFlow, setting up OpenAI‑compatible parameters, and implementing a REST controller that returns weather forecasts using the model, complete with step‑by‑step code snippets, configuration files, and deployment instructions.

AI · Chatbot · DeepSeek
0 likes · 7 min read
Open Source Linux
Feb 23, 2025 · Artificial Intelligence

How Chinese Universities Are Rapidly Deploying DeepSeek AI Models on Campus

After a surge over the winter break, DeepSeek AI models have been swiftly adopted across Chinese universities, which deploy them locally for teaching, research, and campus services. Abroad, the models face bans and security concerns, a contrast between rapid domestic integration and international challenges.

AI models · China · DeepSeek
0 likes · 13 min read
Su San Talks Tech
Feb 23, 2025 · Artificial Intelligence

How DeepSeek’s Distillation Breaks AI Model Limits: Core Principles & Performance

This article explores DeepSeek’s cutting‑edge distillation technology, detailing its definition, underlying principles, innovative data‑model fusion, architecture choices, training strategies, performance gains over large language models, and the remaining challenges in knowledge transfer and multimodal data processing.

DeepSeek · Multimodal Learning · ai-optimization
0 likes · 16 min read
macrozheng
Feb 22, 2025 · Artificial Intelligence

Choosing the Right DeepSeek‑R1 Model: Hardware Needs & Use Cases Explained

This guide compares DeepSeek‑R1’s 1.5B/7B/8B, 14B/32B, and 70B/671B versions, detailing their characteristics, typical applications, and the specific CPU, memory, and GPU specifications required for local deployment, helping you select the optimal model for your resources.

AI model deployment · DeepSeek · Hardware Requirements
0 likes · 7 min read
Rare Earth Juejin Tech Community
Feb 22, 2025 · Artificial Intelligence

Deploying DeepSeek Locally with Ollama, Building Personal and Organizational Knowledge Bases, and Integrating with Spring AI

This guide explains how to locally deploy the DeepSeek large‑language model using Ollama on Windows, macOS, and Linux, configure model storage and CORS, build personal and enterprise RAG knowledge bases with AnythingLLM and Open WebUI, and integrate the model into a Spring AI application via Docker and Docker‑Compose.

DeepSeek · Docker · Knowledge Base
0 likes · 16 min read