Tagged articles

DeepSeek

623 articles

Feb 1, 2025 · Artificial Intelligence

DeepSeek Deep Dive: How Its Breakthroughs Could Usher in an Era of Universal AI

The article provides a detailed analysis of DeepSeek’s model performance across language, reasoning, and code generation benchmarks, its cost‑effective training methods, novel architecture innovations, the team’s expertise, and the broader impact these factors may have on accelerating AI innovation and reshaping industry competition.

AI benchmarks · AI industry impact · DeepSeek

0 likes · 18 min read

Alibaba Cloud Big Data AI Platform

Feb 1, 2025 · Artificial Intelligence

Deploy DeepSeek-V3 and R1 Models with One Click on Alibaba Cloud PAI Model Gallery

This article introduces Alibaba Cloud's PAI Model Gallery, describes the architectures and parameters of the DeepSeek‑V3 and DeepSeek‑R1 large language models, and provides a step‑by‑step guide for one‑click deployment of these models and their distilled variants using vLLM or BladeLLM.

AI inference · Alibaba Cloud · DeepSeek

0 likes · 6 min read

Code Mala Tang

Jan 31, 2025 · Artificial Intelligence

Master DeepSeek: 7 Prompt Engineering Tricks to Boost AI Responses

This guide presents seven practical prompt‑engineering techniques—clear goals, structured queries, domain terminology, concrete examples, scoped questions, step‑by‑step breakdowns, and multi‑turn interactions—to help users get more accurate and useful answers from DeepSeek.

AI prompts · DeepSeek · Language Model

0 likes · 6 min read

Alibaba Cloud Infrastructure

Jan 31, 2025 · Cloud Computing

How to Deploy DeepSeek‑R1 on Alibaba Cloud Compute Nest in Minutes

This guide walks you through deploying the open‑source DeepSeek‑R1 inference model on Alibaba Cloud's Compute Nest platform, covering service creation, instance configuration, login procedures, and API calls with sample curl commands for text generation and chat.

AI model · Alibaba Cloud · Compute Nest

0 likes · 4 min read

Java Web Project

Jan 29, 2025 · Industry Insights

How DeepSeek’s Low‑Cost AI Model Is Redrawing the Compute Landscape and Salary Benchmarks

DeepSeek’s ability to deliver top‑tier model performance on modest hardware sparked a US‑stock flash crash, challenged the high‑GPU demand narrative, and revealed unusually high salary tiers for AI researchers, prompting a reassessment of compute economics and talent compensation in the industry.

AI compute · DeepSeek · Market Trends

0 likes · 5 min read

Architect

Jan 29, 2025 · Artificial Intelligence

How Janus‑Pro Redefines Multimodal AI with Bigger Models and New Training Strategies

DeepSeek’s newly released Janus‑Pro series (1B and 7B) advances multimodal AI by decoupling visual understanding from visual generation, employing an optimized three‑stage training strategy, massive data expansion, and larger LLM backbones, achieving performance that matches or exceeds leading models from Meta, Google, OpenAI, and Stability AI.

DeepSeek · Janus-Pro · Model Scaling

0 likes · 6 min read

Software Engineering 3.0 Era

Jan 28, 2025 · Artificial Intelligence

How DeepSeek’s $5.5 M Training Cost Triggered a $1 T Market Collapse and Redefined AI Innovation

DeepSeek’s low‑cost, open‑source AI model, trained for $5.5 million, caused Nvidia’s market value to plunge by nearly $600 billion, outperformed proprietary rivals on benchmarks, slashed token costs to $0.14 per million tokens, and sparked a global debate on AI democratization and the end of compute‑centric dominance.

AI democratization · DeepSeek · Nvidia market impact

0 likes · 8 min read

Su San Talks Tech

Jan 28, 2025 · Artificial Intelligence

How DeepSeek Overtook ChatGPT on the App Store: Low‑Cost AI Model Shakes the Industry

DeepSeek, a Chinese AI model, surged to the top of the Apple App Store free‑app charts in both China and the US, outpacing ChatGPT and other major generative AI services, while boasting dramatically lower training costs and an open‑source approach that has attracted worldwide attention.

AI model · App Store · ChatGPT

0 likes · 4 min read

Programmer DD

Jan 27, 2025 · Artificial Intelligence

Run DeepSeek‑R1 Locally with Ollama and Call It from Spring Boot

Learn how to deploy the open‑source DeepSeek‑R1 model using Ollama on Linux or macOS, configure various model sizes, and integrate it into a Spring Boot application with Spring AI to build an API‑driven translation service, complete with code examples and testing.

API · DeepSeek · Ollama

0 likes · 9 min read

Software Engineering 3.0 Era

Jan 27, 2025 · Industry Insights

What Capital Currents Hide Behind DeepSeek’s R1 Model Surge?

The article analyzes how DeepSeek’s R1 model, touted as a low‑cost AI breakthrough, sparked Wall Street speculation, prompted a sharp Nvidia stock decline, and may be part of a broader quant‑driven strategy to manipulate market sentiment and capture short‑term capital gains.

AI hardware · DeepSeek · NVIDIA

0 likes · 8 min read

AI Code to Success

Jan 26, 2025 · Industry Insights

How DeepSeek‑R1 Is Challenging OpenAI’s o1 and Shaping the AI Landscape

DeepSeek‑R1 achieved a 1357‑point Arena score, ranking third overall and tying OpenAI o1 for first in StyleCtrl, while its open‑source MIT‑licensed release—including distilled variants—and low‑cost API service aim to democratize advanced AI inference for developers worldwide.

AI competition · Arena benchmark · DeepSeek

0 likes · 5 min read

DevOps

Jan 25, 2025 · Artificial Intelligence

DeepSeek R1: An Open‑Source Large Model Matching OpenAI’s o1 at a Fraction of the Cost

DeepSeek’s newly released R1 model delivers performance comparable to OpenAI’s o1 while cutting inference costs by 90‑95%, leveraging innovative MLA and MoE architectures, low‑cost hardware training, an open‑source strategy, and a youthful, flat‑structured team that challenges the AI industry’s high‑spending model.

AI startup · Cost‑Efficient Training · DeepSeek

0 likes · 12 min read

Alibaba Cloud Native

Jan 22, 2025 · Cloud Native

Seamlessly Migrate from OpenAI to DeepSeek with Higress AI Gateway

This guide explains how to install the Higress AI gateway, configure provider API keys, set up gray‑release routing between OpenAI and DeepSeek, use a Python client to call DeepSeek, and enable content security and observability features for safe, cost‑effective large‑model deployments.

AI Gateway · DeepSeek · Higress

0 likes · 7 min read

Baobao Algorithm Notes

Jan 22, 2025 · Artificial Intelligence

Can RL‑Only Training Make LLMs Beat OpenAI‑o1? Inside DeepSeek‑R1’s Architecture and Results

DeepSeek‑R1’s open‑source series demonstrates that reinforcement‑learning‑only training can match top‑tier models like OpenAI‑o1, while a small amount of SFT further improves readability; the article dissects its technical report, training pipeline, reward design, distillation strategy, benchmark outcomes, and remaining challenges.

DeepSeek · Large Language Model · Supervised Fine‑Tuning

0 likes · 11 min read

Java Architecture Diary

Jan 21, 2025 · Artificial Intelligence

Unlocking DeepSeek R1: How to Leverage the New Reasoning Model with Spring AI

This article introduces DeepSeek R1, a reasoning‑focused large model that exposes its chain‑of‑thought process, matches OpenAI o1 in performance, and is released as open source. It then gives step‑by‑step guidance for Spring AI integration, including dependency setup, configuration, and code examples.

AI integration · DeepSeek · R1

0 likes · 9 min read

Software Engineering 3.0 Era

Jan 20, 2025 · Industry Insights

Can Xiaohongshu + DeepSeek Overturn Baidu? A Hard‑Core Social + AI Showdown

The article analyzes how Xiaohongshu’s youthful, content‑rich community combined with DeepSeek’s real‑time, transparent, conversational AI could challenge Baidu’s search dominance by offering personalized, socially integrated experiences, while also examining the partnership’s business model, technical strengths, and the uncertainties that remain.

AI · Baidu · DeepSeek

0 likes · 12 min read

Baobao Algorithm Notes

Jan 15, 2025 · Artificial Intelligence

How Multi-Token Prediction Boosts LLM Training and Inference Efficiency

This article reviews the evolution of Multi‑Token Prediction (MTP) techniques—from early blockwise parallel decoding to Meta's and DeepSeek's implementations—explaining their architectures, training and inference workflows, and the speed‑up gains they offer for large language models.

DeepSeek · LLM · MTP

0 likes · 20 min read

Baobao Algorithm Notes

Jan 7, 2025 · Artificial Intelligence

How Efficient Is DeepSeek V3? Calculating Its MFU Around 37%

This article derives DeepSeek V3's training Model FLOPs Utilization (MFU) using publicly available data, showing an MFU of roughly 37%—about a 60% improvement over V2—and provides detailed formulas, parameter settings, and a reproducible Python script.

AI performance · DeepSeek · Large Language Model

0 likes · 8 min read

ShiZhen AI

Jan 6, 2025 · Industry Insights

AI Daily Roundup: Altman's Singularity Hint, Microsoft’s $80B AI Investment, DeepSeek DeepThink, and BCI Breakthroughs

The article reviews Sam Altman's cryptic six‑word tweet about approaching the AI singularity, Microsoft’s $80 billion plan to expand AI data‑center infrastructure, DeepSeek’s DeepThink feature for step‑wise reasoning, and NeuroXess’s brain‑computer‑interface advances that let patients control AI and robots with thought.

AI · DeepSeek · Microsoft

0 likes · 8 min read

ZhongAn Tech Team

Jan 5, 2025 · Artificial Intelligence

Weekly AI Roundup Issue 9: OpenAI Vision, LeCun Interview, ByteDance HLLM, and DeepSeek‑V3 Highlights

This issue presents a curated overview of recent AI developments, including Sam Altman's 2025 technology vision poll, LeCun's interview on future AI directions, ByteDance's hierarchical large language model for recommendation, and the performance and cost advantages of the open‑source DeepSeek‑V3 model.

AI · ByteDance · DeepSeek

0 likes · 10 min read

ZhongAn Tech Team

Dec 1, 2024 · Artificial Intelligence

AI Weekly Digest Issue 4: Market Insights, Industry Solutions, and Emerging Technologies

The fourth AI weekly newsletter reviews recent industry news—including Jensen Huang's robot era vision and Tesla's Optimus plans—introduces Claude's new style‑customization feature, explores AI‑enhanced input methods, and evaluates DeepSeek's R1‑Lite model performance on complex reasoning tasks.

AI · AI Applications · Claude

0 likes · 10 min read

CSS Magic

May 13, 2024 · Artificial Intelligence

DeepSeek: China’s New LLM Dark Horse – First Impressions and Shockingly Low Prices

The article evaluates DeepSeek‑V2, a 236‑billion‑parameter MoE model, highlighting its near‑GPT‑4 benchmark performance, OpenAI‑compatible API, 32k‑token context, exceptionally low pricing, a custom token‑utilization metric, and the practical drawbacks observed during hands‑on testing.

API compatibility · DeepSeek · Large Language Model

0 likes · 9 min read

Baobao Algorithm Notes

May 9, 2024 · Artificial Intelligence

Inside DeepSeek‑V2: How Multi‑Head Latent Attention Cuts KV‑Cache and Boosts Performance

This article provides an in‑depth technical analysis of DeepSeek‑V2, covering its 236B parameter size, the Multi‑Head Latent Attention optimization that reduces KV‑cache memory, architectural details, training pipelines, infrastructure choices, and results on benchmarks such as MMLU and instruction following.

AI Architecture · DeepSeek · Large Language Model

0 likes · 17 min read
