Tag

edge deployment


Java Captain
Feb 7, 2025 · Artificial Intelligence

DeepSeek: Disruptive Innovations in Large Language Model Architecture, Efficiency, and Ecosystem

DeepSeek reshapes the AI landscape by replacing brute‑force compute scaling with algorithmic breakthroughs such as a novel MoE architecture, memory compression, active‑learning data pipelines, and open‑source tooling, delivering dramatically lower training and inference costs while enabling edge deployment and a vibrant developer ecosystem.

Algorithmic Efficiency · DeepSeek · Large Language Models
11 min read
Sohu Tech Products
May 21, 2024 · Artificial Intelligence

OPPO Multimodal Pretrained Model Deployment in Cloud-Edge Scenarios: Practices and Optimizations

OPPO details how it deploys multimodal pretrained models on resource‑constrained edge devices by compressing CLIP‑based image‑text retrieval, adapting Chinese text‑to‑image generation with LoRA and adapters, and lightweighting diffusion models through layer pruning and progressive distillation, achieving sub‑3‑second generation while preserving cloud‑level quality.

CLIP · LoRA · OPPO
18 min read
DaTaobao Tech
Jan 5, 2024 · Mobile Development

Edge Deployment and Performance Optimization of Large Language Models with MNN

The upgraded mnn‑llm framework adds a unified llm‑export pipeline, cross‑platform inference with tokenizers and disk‑embedding, and ARM‑focused linear‑layer optimizations—including SIMD, hand‑written assembly and 4‑bit quantization—that dramatically speed up prefilling and achieve real‑time LLM conversation on mobile devices within a 2 GB memory budget, outperforming llama.cpp, fastllm and mlc‑llm.

ARM CPU · LLM · MNN
17 min read
DataFunSummit
Sep 11, 2023 · Artificial Intelligence

Challenges and Insights for Deploying Large Models on Edge with MNN

The talk presents an overview of the MNN inference engine, outlines the end‑to‑end workflow for deploying large language models on mobile devices, discusses technical challenges and practical solutions, and concludes with future directions for edge AI deployment.

AI · Inference Engine · Large Models
2 min read
Baidu Geek Talk
Mar 9, 2022 · Artificial Intelligence

Communication Tower Recognition Using PaddlePaddle: An Industrial AI Practice

The article describes an industrial AI system that uses PaddlePaddle’s PP‑PicoDet model, enhanced with COCO pre‑training and quantization, to accurately recognize communication towers in diverse outdoor conditions, achieving 94.5% mAP at 78 ms inference and supporting edge deployment via PaddleLite and ONNX.

PP-PicoDet · PaddlePaddle · communication tower
6 min read