Tagged articles
2 articles
IT Architects Alliance
Feb 15, 2025 · Artificial Intelligence

DeepSeek: Architecture, Core Technologies, Training Strategies, and Comparative Analysis

The article gives an in‑depth overview of DeepSeek's transformer‑based foundation, Mixture‑of‑Experts architecture, novel attention mechanisms, multi‑token prediction, FP8 mixed‑precision training, knowledge distillation, and reinforcement‑learning approaches, and it compares DeepSeek's performance and cost with those of leading models such as GPT and Gemini.

AI model architecture · DeepSeek · FP8 training
29 min read
Architect
Dec 14, 2023 · Artificial Intelligence

How Multi‑Task Multi‑Scene Modeling Powers ZhiZhuan’s Search: Algorithms, Industry Practices, and Lessons

This article analyzes the challenges of multi‑task and multi‑scene recommendation for large‑scale consumer‑facing (C‑end) services, reviews key academic and industry solutions such as Shared‑Bottom, MMoE, PLE, ESMM, LHUC, PEPNet, MTMS and HiNet, and details ZhiZhuan’s end‑to‑end architecture, which achieved improvements of over 6% in click‑through and 2% in conversion.

AI model architecture · Recommendation Systems · ZhiZhuan
15 min read