Tagged articles

AI model optimization

4 articles · Page 1 of 1

May 8, 2026 · Interview Experience

From Rookie to Tech Pro in One Year: How Two New Engineers Mastered Real-World Projects

The article interviews two engineers who joined DeWu a year ago, revealing how structured mentorship, hands‑on project ownership, AI model optimization, and a supportive learning ecosystem accelerated their growth from beginners to contributors capable of leading complex technical initiatives.

AI model optimization · career growth · knowledge sharing

13 min read

Alibaba Cloud Big Data AI Platform

Dec 16, 2025 · Artificial Intelligence

How CosyVoice 2.0 Cuts First‑Chunk Latency for High‑Fidelity Voice Cloning

CosyVoice 2.0, Alibaba DAMO Academy's next‑generation high‑fidelity speech synthesis model, introduces architecture decoupling, streaming generation, reference‑audio caching, and dynamic load balancing to reduce first‑chunk latency and improve the real‑time factor while supporting multi‑language voice cloning.

AI model optimization · Low latency · Streaming Inference

9 min read

DataFunSummit

Sep 1, 2025 · Artificial Intelligence

How We Cut ERNIE Model Resource Use by 75% with Pruning, Structured Slimming, and ONNX Runtime

In this detailed engineering guide we diagnose a heavyweight ERNIE‑Base text‑classification service consuming 128 CPU cores and 96 GB RAM, then apply a three‑step optimization—model selection, structured pruning with PaddleSlim, and engine migration to ONNX Runtime—achieving a 75% reduction in resource usage while keeping recall above 99.5% and boosting inference speed by over 20%.

AI model optimization · Model Pruning · ONNX Runtime

11 min read

Data Thinking Notes

Feb 20, 2025 · Artificial Intelligence

How to Deploy DeepSeek R1 671B Model Locally with Ollama: A Step‑by‑Step Guide

This article provides a comprehensive tutorial on locally deploying the 671‑billion‑parameter DeepSeek R1 model using Ollama, covering model selection, hardware requirements, dynamic quantization, detailed installation steps, performance observations, and practical recommendations for consumer‑grade hardware.

AI model optimization · DeepSeek · Dynamic Quantization

14 min read
