Tagged articles

DeepSeek

623 articles · Page 7 of 7
Software Engineering 3.0 Era
Feb 1, 2025 · Artificial Intelligence

DeepSeek Deep Dive: How Its Breakthroughs Could Usher in an Era of Universal AI

The article provides a detailed analysis of DeepSeek’s model performance across language, reasoning, and code generation benchmarks, its cost‑effective training methods, novel architecture innovations, the team’s expertise, and the broader impact these factors may have on accelerating AI innovation and reshaping industry competition.

AI benchmarks · AI industry impact · DeepSeek
0 likes · 18 min read
Alibaba Cloud Big Data AI Platform
Feb 1, 2025 · Artificial Intelligence

Deploy DeepSeek-V3 and R1 Models with One Click on Alibaba Cloud PAI Model Gallery

This article introduces Alibaba Cloud's PAI Model Gallery, detailing the DeepSeek-V3 and DeepSeek‑R1 large language models, their architectures and parameters, and provides a step‑by‑step guide for one‑click deployment of these models and their distilled variants using vLLM or BladeLLM.

AI inference · Alibaba Cloud · DeepSeek
0 likes · 6 min read
Code Mala Tang
Jan 31, 2025 · Artificial Intelligence

Master DeepSeek: 7 Prompt Engineering Tricks to Boost AI Responses

This guide presents seven practical prompt‑engineering techniques—clear goals, structured queries, domain terminology, concrete examples, scoped questions, step‑by‑step breakdowns, and multi‑turn interactions—to help users get more accurate and useful answers from DeepSeek.

AI prompts · DeepSeek · Language Model
0 likes · 6 min read
Architect
Jan 29, 2025 · Artificial Intelligence

How Janus‑Pro Redefines Multimodal AI with Bigger Models and New Training Strategies

DeepSeek’s newly released Janus‑Pro series (1B and 7B) advances multimodal AI by decoupling visual understanding from visual generation, using an optimized three‑stage training process, greatly expanded training data, and larger LLM backbones. The article reports performance that matches or exceeds leading models from Meta, Google, OpenAI, and Stability AI.

DeepSeek · Janus-Pro · Model Scaling
0 likes · 6 min read
Software Engineering 3.0 Era
Jan 28, 2025 · Artificial Intelligence

How DeepSeek’s $5.5 M Training Cost Triggered a $1 T Market Collapse and Redefined AI Innovation

DeepSeek’s low‑cost, open‑source AI model, trained for $5.5 million, caused Nvidia’s market value to plunge by nearly $600 billion, outperformed proprietary rivals on benchmarks, cut token prices to $0.14 per million input tokens, and sparked a global debate on AI democratization and the end of compute‑centric dominance.

AI democratization · DeepSeek · Nvidia market impact
0 likes · 8 min read
Programmer DD
Jan 27, 2025 · Artificial Intelligence

Run DeepSeek‑R1 Locally with Ollama and Call It from Spring Boot

Learn how to deploy the open‑source DeepSeek‑R1 model using Ollama on Linux or macOS, configure various model sizes, and integrate it into a Spring Boot application with Spring AI to build an API‑driven translation service, complete with code examples and testing.

API · DeepSeek · Ollama
0 likes · 9 min read
Software Engineering 3.0 Era
Jan 27, 2025 · Industry Insights

What Capital Currents Hide Behind DeepSeek’s R1 Model Surge?

The article analyzes how DeepSeek’s R1 model, touted as a low‑cost AI breakthrough, sparked Wall Street speculation, prompted a sharp Nvidia stock decline, and may be part of a broader quant‑driven strategy to manipulate market sentiment and capture short‑term capital gains.

AI hardware · DeepSeek · NVIDIA
0 likes · 8 min read
AI Code to Success
Jan 26, 2025 · Industry Insights

How DeepSeek‑R1 Is Challenging OpenAI’s o1 and Shaping the AI Landscape

DeepSeek‑R1 achieved a 1357‑point Arena score, ranking third overall and tying OpenAI o1 for first in StyleCtrl, while its open‑source MIT‑licensed release—including distilled variants—and low‑cost API service aim to democratize advanced AI inference for developers worldwide.

AI competition · Arena benchmark · DeepSeek
0 likes · 5 min read
DevOps
Jan 25, 2025 · Artificial Intelligence

DeepSeek R1: An Open‑Source Large Model Matching OpenAI’s o1 at a Fraction of the Cost

DeepSeek’s newly released R1 model delivers performance comparable to OpenAI’s o1 while cutting inference costs by 90‑95%, leveraging innovative MLA and MoE architectures, low‑cost hardware training, an open‑source strategy, and a youthful, flat‑structured team that challenges the AI industry’s high‑spending model.

AI startup · Cost‑Efficient Training · DeepSeek
0 likes · 12 min read
Alibaba Cloud Native
Jan 22, 2025 · Cloud Native

Seamlessly Migrate from OpenAI to DeepSeek with Higress AI Gateway

This guide explains how to install the Higress AI gateway, configure provider API keys, set up gray‑release routing between OpenAI and DeepSeek, use a Python client to call DeepSeek, and enable content security and observability features for safe, cost‑effective large‑model deployments.

AI Gateway · DeepSeek · Higress
0 likes · 7 min read
Baobao Algorithm Notes
Jan 22, 2025 · Artificial Intelligence

Can RL‑Only Training Make LLMs Beat OpenAI‑o1? Inside DeepSeek‑R1’s Architecture and Results

DeepSeek‑R1’s open‑source series demonstrates that reinforcement‑learning‑only training can match top‑tier models like OpenAI‑o1, while a small amount of SFT further improves readability; the article dissects its technical report, training pipeline, reward design, distillation strategy, benchmark outcomes, and remaining challenges.

DeepSeek · Large Language Model · Supervised Fine‑Tuning
0 likes · 11 min read
Java Architecture Diary
Jan 21, 2025 · Artificial Intelligence

Unlocking DeepSeek R1: How to Leverage the New Reasoning Model with Spring AI

This article introduces DeepSeek R1, a reasoning‑focused large model that shows its chain‑of‑thought process, matches OpenAI o1 performance, and is released as open source. It then gives step‑by‑step Spring AI integration guidance, including dependency setup, configuration, and code examples.

AI integration · DeepSeek · R1
0 likes · 9 min read
Software Engineering 3.0 Era
Jan 20, 2025 · Industry Insights

Can Xiaohongshu + DeepSeek Overturn Baidu? A Hard‑Core Social + AI Showdown

The article analyzes how Xiaohongshu’s youthful, content‑rich community combined with DeepSeek’s real‑time, transparent, conversational AI could challenge Baidu’s search dominance by offering personalized, socially integrated experiences, while also examining the partnership’s business model, technical strengths, and the uncertainties that remain.

AI · Baidu · DeepSeek
0 likes · 12 min read
Baobao Algorithm Notes
Jan 7, 2025 · Artificial Intelligence

How Efficient Is DeepSeek V3? Calculating Its MFU Around 37%

This article derives DeepSeek V3's training Model FLOPs Utilization (MFU) using publicly available data, showing an MFU of roughly 37%—about a 60% improvement over V2—and provides detailed formulas, parameter settings, and a reproducible Python script.

AI performance · DeepSeek · Large Language Model
0 likes · 8 min read
ShiZhen AI
Jan 6, 2025 · Industry Insights

AI Daily Roundup: Altman's Singularity Hint, Microsoft’s $80B AI Investment, DeepSeek DeepThink, and BCI Breakthroughs

The article reviews Sam Altman's cryptic six‑word tweet about approaching the AI singularity, Microsoft’s $80 billion plan to expand AI data‑center infrastructure, DeepSeek’s DeepThink feature for step‑wise reasoning, and NeuroXess’s brain‑computer‑interface advances that let patients control AI and robots with thought.

AI · DeepSeek · Microsoft
0 likes · 8 min read
ZhongAn Tech Team
Jan 5, 2025 · Artificial Intelligence

Weekly AI Roundup Issue 9: OpenAI Vision, LeCun Interview, ByteDance HLLM, and DeepSeek‑V3 Highlights

This issue presents a curated overview of recent AI developments, including Sam Altman's 2025 technology vision poll, LeCun's interview on future AI directions, ByteDance's hierarchical large language model for recommendation, and the performance and cost advantages of the open‑source DeepSeek‑V3 model.

AI · ByteDance · DeepSeek
0 likes · 10 min read
CSS Magic
May 13, 2024 · Artificial Intelligence

DeepSeek: China’s New LLM Dark Horse – First Impressions and Shockingly Low Prices

The article evaluates DeepSeek‑V2, a hundred‑billion‑scale MoE model (236B total parameters), highlighting its near‑GPT‑4 benchmark performance, OpenAI‑compatible API, 32k‑token context, exceptionally low pricing, a custom token‑utilization metric, and the practical drawbacks observed during hands‑on testing.

API compatibility · DeepSeek · Large Language Model
0 likes · 9 min read
Baobao Algorithm Notes
May 9, 2024 · Artificial Intelligence

Inside DeepSeek‑V2: How Multi‑Head Latent Attention Cuts KV‑Cache and Boosts Performance

This article provides an in‑depth technical analysis of DeepSeek‑V2, covering its 236B parameter size, the Multi‑Head Latent Attention optimization that reduces KV‑cache memory, architectural details, training pipelines, infrastructure choices, and results on benchmarks such as MMLU and instruction following.

AI Architecture · DeepSeek · Large Language Model
0 likes · 17 min read