Tagged articles
74 articles
DataFunSummit
Apr 22, 2026 · Artificial Intelligence

From Flawed RAG to Production‑Ready: Deep Dive into Scaling Retrieval‑Augmented Generation

This expert roundtable dissects why RAG often fails in production—low recall, hallucinations, cost overruns—and walks through concrete diagnostics, hybrid search designs, knowledge‑engineering tricks, GraphRAG and Agentic RAG advances, plus practical deployment, security, and cost‑optimization guidelines.

AI deployment · Agentic RAG · Hybrid Search
0 likes · 20 min read
Coder Trainee
Apr 20, 2026 · Artificial Intelligence

How to Install and Configure Ollama Locally for a CRM AI Engine

This guide walks through installing Ollama on Windows 10, downloading a Chinese‑friendly LLM such as Qwen2, configuring a CRM’s application‑dev.yml to point to the local Ollama service, restarting the backend, and handling optional CORS settings, highlighting zero‑cost, privacy, and stability benefits.

AI deployment · CRM integration · Local-LLM
0 likes · 4 min read
DataFunTalk
Apr 16, 2026 · Operations

Deploy Your AI Hermes Agent in Minutes with PPHermes Cloud Sandbox

This guide walks you through installing Python, obtaining a PPIO API key, installing the PPHermes CLI, launching a Hermes Agent sandbox in the cloud, and managing its lifecycle, with optional integration to Feishu/Lark and AI‑agent skill usage.

AI deployment · CLI · DevOps
0 likes · 10 min read
AI Large-Model Wave and Transformation Guide
Apr 12, 2026 · Industry Insights

How to Choose the Right Large Language Model in 2025: A Six‑Dimension Guide

This article analyzes the rapid growth of large language models, presents a six‑dimensional classification framework, compares open‑source and closed‑source options, and offers a step‑by‑step selection checklist for enterprises seeking the most suitable model for their specific needs.

AI deployment · AI trends · Enterprise AI
0 likes · 10 min read
AI Large-Model Wave and Transformation Guide
Apr 11, 2026 · Artificial Intelligence

How to Build a Full‑Cycle Model Engineering System for Scalable AI

This article outlines a comprehensive, six‑part model engineering framework that transforms AI capabilities into reusable business functions, defines a stable technical stack, establishes model selection and architecture guidelines, implements rigorous control, data, and training processes, and explains how these layers synergize for reliable, scalable deployment.

AI deployment · Model Training · Operations
0 likes · 27 min read
Woodpecker Software Testing
Apr 3, 2026 · Artificial Intelligence

Why 80% of AI Projects Fail: Bridging Model Evaluation from Theory to Real‑World Impact

The article explains that most AI project failures stem from unrealistic evaluation rather than model intelligence, and outlines concrete practices—business‑aligned metrics, scenario sandboxes, human‑in‑the‑loop reviews, and auditable documentation—to make model evaluation truly actionable.

AI deployment · AI reliability · MLOps
0 likes · 7 min read
Yunqi AI+
Mar 14, 2026 · Industry Insights

Why Building an AI Knowledge Base Becomes an All‑Hands Initiative Once AI Goes Deep

The article explains how scaling AI agents reveals fragmented, inconsistent internal documentation, and argues that high‑quality production knowledge bases require a company‑wide, role‑based process, concrete writing rules, continuous inspection, and cross‑department ownership to ensure AI answers remain accurate and user‑focused.

AI deployment · AI knowledge base · Data Quality
0 likes · 14 min read
Woodpecker Software Testing
Mar 1, 2026 · Artificial Intelligence

Four Hidden Model Evaluation Pitfalls That Undermine AI Deployments

The article examines four common yet hidden model evaluation mistakes—confusing attractive metrics with business impact, using static test sets, ignoring statistical significance, and lacking fine‑grained attribution—illustrating each with real‑world cases and offering concrete practices to build a more robust, business‑aligned evaluation pipeline.

A/B testing · AI deployment · Metrics
0 likes · 8 min read
Fighter's World
Dec 12, 2025 · Artificial Intelligence

Why OpenAI’s Forward Deployed Engineering Takes Six Months to Deliver Usable AI

The article explains how OpenAI’s Forward Deployed Engineering (FDE) team bridges the gap between powerful models and real‑world value by embedding engineers on‑site, iterating over a 6‑week technical rollout followed by a 4‑month trust‑building phase, and using eval‑driven development to turn custom solutions into reusable products.

AI deployment · Eval-driven development · Forward Deployed Engineering
0 likes · 35 min read
AI2ML AI to Machine Learning
Dec 3, 2025 · Artificial Intelligence

2026 Forecast: How Large‑Model AI Will Evolve After 2025 Breakthroughs

The article reviews the major 2025 breakthroughs in multimodal, open‑source, and deployment technologies for large models and outlines four 2026 trends—including ToC vs. ToB service split, dual‑hand data generation, MoE routing advances, and AI4Science breakthroughs—that will shape the next wave of AI development.

AI deployment · AI4Science · Mixture of Experts
0 likes · 6 min read
Sohu Tech Products
Nov 5, 2025 · Artificial Intelligence

How nndeploy Simplifies the Last Mile of On-Device AI Deployment

nndeploy is an open‑source, high‑performance on‑device AI deployment framework that abstracts the repetitive “last‑mile” workflow into a visual drag‑and‑drop DAG, offering multi‑platform inference, optimization, and ready‑to‑use model configs, enabling developers to go from prototype to production in minutes.

AI deployment · edge AI · nndeploy
0 likes · 15 min read
21CTO
Nov 5, 2025 · Artificial Intelligence

How Block Scaled AI Agents to 12,000 Employees in Just 8 Weeks

Block, a fintech giant, deployed AI agents across all 12,000 staff in eight weeks by adopting the Model Context Protocol, simplifying installation, offering model choice, automating tool management, and building a supportive community, revealing key lessons for enterprise AI adoption.

AI agents · AI deployment · Enterprise AI
0 likes · 10 min read
Alibaba Cloud Native
Sep 17, 2025 · Cloud Native

Deploy Dify AI Platform on Alibaba Cloud SAE in Two Simple Steps

This guide explains how to quickly deploy the open‑source Dify AI platform on Alibaba Cloud's Serverless Application Engine, covering the challenges of private deployment, the benefits of SAE's one‑click solution, and detailed step‑by‑step configuration to achieve a production‑grade AI service.

AI deployment · Alibaba Cloud · Dify
0 likes · 5 min read
Tencent Cloud Developer
Jul 24, 2025 · Artificial Intelligence

How Architects Turn Tech Roadblocks into Career Wins

This article showcases a prize‑driven Q&A session where senior architects answer real‑world questions on AI deployment, backend language choices, large‑model integration, and career advancement, offering practical guidance and a chance to win exclusive Tencent Cloud merchandise.

AI deployment · Technology Selection · career advice
0 likes · 7 min read
MaGe Linux Operations
Jul 21, 2025 · Artificial Intelligence

Master Multi‑GPU Load Balancing for Ollama: From Zero to Production

This guide walks you through configuring Ollama for multi‑GPU load balancing, covering hardware checks, CUDA setup, native and Docker deployment methods, detailed parameter tuning, advanced sharding strategies, troubleshooting, performance optimization, and production‑grade monitoring to maximize throughput and stability of large language models.

AI deployment · CUDA · Ollama
0 likes · 16 min read
Data Thinking Notes
Jul 13, 2025 · Artificial Intelligence

How to Build an Enterprise Knowledge Base with Dify: Full Setup Guide

This article walks developers through the entire process of deploying Dify locally, configuring model providers, creating and segmenting a knowledge base with RAG, choosing indexing methods, and integrating the knowledge base into a chatbot application, complete with code snippets and visual guides.

AI deployment · Dify · Knowledge Base
0 likes · 11 min read
Data Thinking Notes
Jul 6, 2025 · Artificial Intelligence

How Quantization Shrinks Giant AI Models for Edge Devices

This article explains why quantizing massive AI models is essential for deploying them on resource‑constrained devices, outlines core quantization concepts, techniques, and methods, compares their pros and cons, and presents practical application scenarios such as smartphones, autonomous driving, IoT, and edge computing.

AI deployment · Model Quantization · Performance Optimization
0 likes · 9 min read
Instant Consumer Technology Team
Jul 3, 2025 · Artificial Intelligence

Why Buying an AI Appliance Is a Strategic Pitfall for Enterprises

Enterprises rushing to purchase DeepSeek AI appliances and smart‑agent platforms often face hidden technical, data, and organizational challenges that turn promised "plug‑and‑play" solutions into costly missteps, highlighting the need for realistic strategy, robust data governance, and continuous capability building.

AI capability building · AI deployment · Data Governance
0 likes · 28 min read
DataFunTalk
Jul 2, 2025 · Artificial Intelligence

How Multimodal Large Models Are Revolutionizing Complex Document OCR

In a detailed interview, Zhao Chenyang explains how multimodal large models (VLMs) overcome the limitations of traditional OCR in mixed layouts, table reconstruction, and handwritten text by leveraging self‑supervised pre‑training, lightweight fine‑tuning, and hybrid pipelines that dramatically cut annotation costs and improve recall rates.

AI deployment · Multimodal AI · document OCR
0 likes · 13 min read
Data Thinking Notes
Jun 4, 2025 · Artificial Intelligence

How DeepSeek AI Model is Revolutionizing China’s State Enterprises – Over 100 Deployment Cases

The DeepSeek large language model has been extensively deployed across more than 100 central and local Chinese state‑owned enterprises, spanning sectors such as energy, manufacturing, transportation, finance, telecommunications, construction, and public services, driving intelligent transformation through applications like smart scheduling, risk assessment, intelligent customer service, and AI‑enhanced office automation.

AI deployment · DeepSeek · Industrial AI
0 likes · 38 min read
Alibaba Cloud Native
May 16, 2025 · Artificial Intelligence

Boost AI Reliability with MetaGPT’s Multi‑Agent Collaboration on Serverless Function AI

This guide explains how MetaGPT’s multi‑agent architecture eliminates the logical gaps of single‑agent systems, improves task stability, and can be rapidly deployed on Alibaba Cloud’s Serverless Function AI platform with step‑by‑step instructions, configuration details, and example applications.

AI deployment · Function AI · MetaGPT
0 likes · 8 min read
Alibaba Cloud Developer
May 14, 2025 · Artificial Intelligence

Deploy Alibaba’s Qwen3 LLM in 10 Minutes with Bailian Platform

Learn how to quickly set up Alibaba Cloud’s Bailian platform to call the open-source Qwen3 large language model, explore its cost‑effective performance, dual‑mode reasoning, multilingual support, and enhanced agent capabilities, and follow step‑by‑step instructions for API key configuration, Cherry Studio integration, and tool‑calling setup.

AI deployment · Alibaba Cloud · MLOps
0 likes · 6 min read
ByteDance Cloud Native
Apr 9, 2025 · Artificial Intelligence

How to Deploy ComfyUI Cluster Edition on Volcengine for Multi‑User AI Workflows

This guide explains how to launch the ComfyUI Cluster Edition on Volcengine, covering its enterprise features such as multi‑user collaboration, resource isolation, built‑in plugins, flexible mounting, and step‑by‑step deployment using VKE, CP, and API Gateway to enable efficient, scalable AI image generation.

AI deployment · ComfyUI · Multi-user collaboration
0 likes · 10 min read
Architect
Apr 1, 2025 · Artificial Intelligence

When to Fine‑Tune Large Language Models vs. Relying on Prompting and RAG

The article explains why most projects should start with prompt engineering or simple agent workflows, outlines the scenarios where model fine‑tuning adds real value, compares fine‑tuning with Retrieval‑Augmented Generation, and offers practical criteria for deciding which approach to adopt.

AI deployment · LoRA · RAG
0 likes · 9 min read
Open Source Linux
Mar 5, 2025 · Artificial Intelligence

How DeepSeek‑R1 Redefines Prompt Engineering and Real‑World AI Deployment

The article analyzes DeepSeek‑R1’s low‑cost inference architecture, Chinese language optimizations, novel prompt‑engineering techniques, and the practical challenges of deploying large domestic models, offering insights into vertical AI applications and the evolving open‑source ecosystem in China.

AI deployment · DeepSeek · Model Optimization
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Mar 4, 2025 · Artificial Intelligence

Deploy a High‑Performance RAG Service with Hologres, DeepSeek, and PAI‑EAS

This guide walks you through building a Retrieval‑Augmented Generation (RAG) system by integrating Alibaba Cloud's Hologres vector store, the Proxima high‑performance vector engine, and DeepSeek large language models via PAI‑EAS, covering prerequisites, deployment steps, configuration, and inference verification.

AI deployment · DeepSeek · Hologres
0 likes · 12 min read
Liangxu Linux
Feb 16, 2025 · Artificial Intelligence

Build a Free Private AI with DeepSeek, Ollama, and Local Knowledge Base

This guide explains how to locally deploy the open‑source DeepSeek model using Ollama, enhance interaction with Chatbox and Page Assist, and connect a local knowledge base via AnythingLLM's RAG architecture, providing step‑by‑step instructions, hardware requirements, and API examples for a self‑hosted AI system.

AI deployment · AnythingLLM · DeepSeek
0 likes · 22 min read
Java Tech Enthusiast
Feb 15, 2025 · Artificial Intelligence

DeepSeek-R1: High-Performance AI Inference Model

DeepSeek‑R1 is a high‑performance AI inference model that uses reinforcement learning to improve reasoning on complex tasks. It became a sensation over the Chinese New Year, and local deployment requires substantial hardware resources, especially for the full‑scale 671‑billion‑parameter version.

AI deployment · AI inference · AI model
0 likes · 4 min read
Ops Development & AI Practice
Feb 14, 2025 · Artificial Intelligence

Large Model Format Showdown: Hugging Face, TensorFlow, ONNX, TorchScript, GGUF

This comprehensive guide examines the leading large‑model storage formats—including Hugging Face Transformers, TensorFlow SavedModel, ONNX, TorchScript, and GGUF—detailing their file structures, serialization methods, strengths, weaknesses, and typical use‑cases, helping developers and researchers select the optimal format for their specific AI workloads.

AI deployment · GGUF · Model Formats
0 likes · 21 min read
JD Tech Talk
Feb 12, 2025 · Artificial Intelligence

Deploying a Private DeepSeek Large Language Model on JD Cloud with Ollama and Knowledge‑Base Tools

This guide explains how to privately deploy the DeepSeek large language model using a JD Cloud virtual computer, set up Ollama as the LLM service, run various model versions, and integrate local knowledge bases through CherryStudio, Page Assist, and AnythingLLM for offline and network‑enabled AI applications.

AI deployment · DeepSeek · JD Cloud
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Jan 10, 2025 · Artificial Intelligence

Deploy DeepSeek‑V3 LLM on Alibaba Cloud with One‑Click Model Gallery

This article introduces the 671‑billion‑parameter DeepSeek‑V3 Mixture‑of‑Experts LLM, explains the PAI‑Model Gallery platform that aggregates top AI models, and provides a step‑by‑step guide to deploy DeepSeek‑V3 on Alibaba Cloud’s PAI‑EAS service with zero‑code configuration.

AI deployment · Alibaba Cloud · DeepSeek-V3
0 likes · 5 min read
Architecture and Beyond
Nov 23, 2024 · Artificial Intelligence

A Comprehensive Overview of AIGC Engineering Architecture and Its Core Roles

This article examines the AIGC engineering architecture, detailing its data, model, fine‑tuning, inference, application, and monitoring layers, and explains the distinct responsibilities and challenges of application engineers, algorithm engineers, and “alchemy” specialists, highlighting how this structured approach accelerates generative AI productization.

AI deployment · AIGC · Engineering Architecture
0 likes · 24 min read
Fighter's World
Oct 26, 2024 · Artificial Intelligence

Key Considerations for Deploying Large Language Models in Cloud Services

The article reflects on Alibaba Cloud's large‑model deployments, outlines four service scenarios, examines three fundamental questions about foundation models, and offers a prioritized roadmap—including prompt engineering, RAG, and organizational changes—to effectively bring LLMs to production.

AI deployment · Alibaba Cloud · Cloud Services
0 likes · 8 min read
AntTech
Sep 6, 2024 · Artificial Intelligence

Large Model Industry Trustworthy Application Framework Research Report

Ant Group and the China Academy of Information and Communications Technology released a research report outlining a trustworthy application framework for large models in rigorous sectors such as finance and healthcare, detailing technical safeguards, industry case studies, and guidance for scalable, secure AI deployment.

AI Governance · AI deployment · Healthcare AI
0 likes · 3 min read
Baobao Algorithm Notes
Aug 27, 2024 · Artificial Intelligence

Unlock Free GLM-4-Flash API: Step-by-Step Guide, Code Samples, and Logic Puzzle Test

This article explores the free GLM-4-Flash API from Zhipu AI, detailing its lightweight architecture, performance specs, a logic‑puzzle demonstration, and provides a comprehensive step‑by‑step tutorial—including data upload, model fine‑tuning, deployment commands and example code for building a LangChain‑based knowledge‑base retrieval system.

AI deployment · Fine-tuning · Free API
0 likes · 11 min read
ByteDance Cloud Native
Aug 12, 2024 · Cloud Native

How to Deploy NVIDIA NIM AI Models on Volcengine VKE in Minutes

This guide walks you through deploying large language models with NVIDIA NIM on Volcengine's Kubernetes Engine (VKE), covering environment setup, model optimization, Helm chart deployment, monitoring integration, and the key advantages of using NIM as a cloud‑native AI micro‑service.

AI deployment · GPU · Kubernetes
0 likes · 12 min read
ByteDance Cloud Native
Aug 7, 2024 · Artificial Intelligence

Deploy Stable Diffusion in 5 Minutes with Volcengine’s Continuous Delivery CP

Learn how to quickly launch a Stable Diffusion WebUI service in just five minutes using Volcengine’s cloud‑native continuous delivery platform, which abstracts Kubernetes complexities, provides pre‑configured AI templates, serverless VCI deployment, automatic scaling, API gateway access, and includes a Python client for image generation.

AI deployment · Cloud Native · Python
0 likes · 14 min read
Architects' Tech Alliance
Jul 15, 2024 · Artificial Intelligence

Why Model-as-a-Service (MaaS) Is Shaping the Future of AI Deployment

This article examines the Model-as-a-Service (MaaS) paradigm, tracing its origins, defining its expanded capabilities for large‑model ecosystems, outlining the full‑stack services it offers, and analyzing current industry adoption, deployment models, and the technical and regulatory challenges that must be addressed for scalable AI rollout.

AI Infrastructure · AI deployment · Cloud AI
0 likes · 11 min read
DevOps
Apr 18, 2024 · Artificial Intelligence

Expert Round‑Table on AIGC: Technology vs. Market Beliefs, Domestic Model Challenges, and Enterprise Deployment in China

The article presents a 2024 AIGC round‑table where Chinese experts discuss whether to follow a technology‑first or market‑first approach, the challenges of compute, algorithms and data, domestic versus foreign large‑model strategies, multi‑model deployment in enterprises, and criteria for evaluating successful AIGC applications.

AI deployment · AIGC · China AI
0 likes · 14 min read
DataFunTalk
Dec 19, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment and Data Governance: Insights from Deepexi’s President

The article examines how enterprises can adopt domain‑specific large models by balancing demand‑side cost‑reduction needs with supply‑side mature training techniques, discusses team composition, fine‑tuning methods, data governance for unstructured data, and outlines Deepexi’s product ecosystem designed to improve efficiency, performance, and user experience.

AI deployment · Enterprise AI · cost economics
0 likes · 13 min read
DataFunSummit
Dec 16, 2023 · Artificial Intelligence

Enterprise Large Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article examines how enterprises can adopt domain‑specific large models by addressing talent and cost challenges, outlining self‑supervised pre‑training, instruction fine‑tuning, data governance for unstructured data, dataset balance, model‑type selection, and integrated product solutions to achieve efficient, high‑performance AI deployments.

AI deployment · Data Governance · Enterprise AI
0 likes · 13 min read
DataFunSummit
Dec 13, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article explores how enterprises can adopt domain‑specific large language models by addressing talent and cost challenges, outlining training pipelines, data governance for unstructured data, dataset balancing, fine‑tuning techniques, and a product ecosystem that lowers deployment barriers while optimizing performance and economics.

AI deployment · Data Governance · cost economics
0 likes · 13 min read
DataFunTalk
Oct 20, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Practices, and Optimizations

This article describes how Du Xiaoman tackled the high cost, instability, and long cycles of AI algorithm deployment by building the ATLAS automated machine learning platform, detailing its four‑stage workflow, component platforms, scaling and efficiency techniques, and practical Q&A for practitioners.

AI deployment · AutoML · Data Parallelism
0 likes · 22 min read
UCloud Tech
Aug 30, 2023 · Artificial Intelligence

Unlocking Llama 2: Architecture, Training Insights, and Cloud Deployment Guide

This article explores Meta's Llama 2 large language model—its performance, expanded training data, architectural details, evaluation results, RLHF fine‑tuning process, and step‑by‑step deployment on UCloud UK8S using Docker and Kubernetes—providing a comprehensive guide for AI practitioners.

AI deployment · Llama-2 · RLHF
0 likes · 11 min read
Volcano Engine Developer Services
Jun 30, 2023 · Cloud Native

Deploy Langchain‑ChatGLM on Volcengine VKE: A Step‑by‑Step Cloud‑Native Guide

This tutorial walks you through preparing a VKE cluster, pulling the Langchain‑ChatGLM container image, creating the necessary Deployment and Service resources, and adding a local knowledge base, enabling you to run a Langchain‑based ChatGLM service with GPU support on Volcengine’s cloud‑native platform.

AI deployment · ChatGLM · GPU
0 likes · 6 min read
Tencent Cloud Developer
May 24, 2023 · Artificial Intelligence

Deploying Stable Diffusion on Tencent Cloud: A Step‑by‑Step Guide

Deploy Stable Diffusion on Tencent Cloud by building a Docker image, pushing it to TCR, creating a GPU‑enabled TKE cluster with CFS storage, configuring qGPU sharing, exposing the service via Cloud Native API Gateway, optimizing inference with TACO Kit, storing results in COS, and applying content moderation.

AI deployment · GPU · Kubernetes
0 likes · 19 min read
Alibaba Cloud Native
May 5, 2023 · Artificial Intelligence

Deploy Stable Diffusion on Alibaba Cloud Serverless in Minutes

This guide shows how to map a Stable Diffusion container's dynamic paths to Alibaba Cloud NAS, configure Serverless Devs, deploy the application, and upload model files via a visual admin UI or NAS commands, enabling fast AI image generation on a cloud‑native platform.

AI deployment · Alibaba Cloud · Function Compute
0 likes · 8 min read
Alibaba Cloud Native
Apr 24, 2023 · Artificial Intelligence

Deploy Stable Diffusion WebUI on Alibaba Cloud Function Compute in One Command

This guide walks you through deploying the open‑source Stable Diffusion WebUI on Alibaba Cloud Function Compute using Serverless Devs, covering prerequisites, a single‑line deployment command, configuration details, access URL, and practical tips for handling GPU rendering and cold‑start latency.

AI deployment · Cloud Native · Function Compute
0 likes · 5 min read
DataFunTalk
Feb 18, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Optimization, and Practical Insights

This article details Du Xiaoman's development of the ATLAS automated machine learning platform, covering business scenarios, AI algorithm deployment challenges, the end‑to‑end production workflow, platform components such as annotation, data, training and deployment, as well as optimization techniques like AutoML, meta‑learning, NAS, and large‑scale parallelism, concluding with lessons learned and future directions.

AI deployment · AutoML · Machine Learning Platform
0 likes · 20 min read
Architects' Tech Alliance
Nov 7, 2022 · Artificial Intelligence

FastDeploy: One-Click AI Model Deployment Across GPUs, CPUs, and Edge Devices

FastDeploy is an open‑source toolkit that standardizes AI model APIs and enables developers to deploy vision, NLP, and speech models on diverse hardware—including GPUs, CPUs, Jetson, ARM, and various NPUs—using just three lines of code or a single command, while delivering end‑to‑end performance optimizations.

AI deployment · CPU · Edge Computing
0 likes · 11 min read
Baidu Geek Talk
Apr 13, 2022 · Artificial Intelligence

Smart Retail Product Recognition Solution Using PaddlePaddle PP-ShiTu

The article presents PaddlePaddle’s PP‑ShiTu‑based smart retail product recognition solution, detailing a complete pipeline—from data preparation and model optimization to low‑latency deployment—that overcomes high‑similarity packaging, rapid SKU changes, and costly retraining, achieving over 98% Top‑1 recall with 0.2‑second CPU inference.

AI deployment · Image Classification · PP-ShiTu
0 likes · 7 min read
Baidu Geek Talk
Apr 1, 2022 · Artificial Intelligence

How Paddle Lite & PaddleSlim Supercharge Edge AI Inference Performance

With the rapid rise of edge computing, deploying AI models for tasks like object detection, OCR, and speech recognition on resource‑constrained devices faces speed challenges; the upgraded Paddle Lite inference engine and PaddleSlim compression tools claim up to 23% faster inference and significant model size reductions, offering a practical solution.

AI deployment · Inference Optimization · Paddle-Lite
0 likes · 6 min read
DaTaobao Tech
Mar 11, 2022 · Artificial Intelligence

How Alibaba’s MNN Engine Achieves 350% CPU Speedup and Sparse Acceleration

Alibaba’s MNN, a lightweight high‑performance deep‑learning inference engine, earned top honors in China’s 2022 “Science & Innovation China” awards, and delivers gains such as a 350% speedup on x86 CPUs and 2.1‑2.3× acceleration on ARM with sparse models, plus integrated OpenCV/NumPy functionality for edge AI deployment.

AI deployment · Alibaba · Deep Learning
0 likes · 4 min read
DataFunTalk
Sep 14, 2021 · Artificial Intelligence

AI Model Deployment on Edge Devices: Adaptation, Optimization, and Continuous Iteration – Interview Insights

The article shares a programmer's interview experience at Baidu, discussing how to adapt AI algorithms for edge deployment, balance model performance and efficiency, apply model compression techniques, and continuously iterate models, while also promoting an upcoming AI deployment online course.

AI deployment · Edge Computing · framework support
0 likes · 6 min read
iQIYI Technical Product Team
Jul 3, 2020 · Artificial Intelligence

Optimizing Video Inference Services for High GPU Utilization in AI Applications

By moving decoding, color conversion, preprocessing, inference, and re‑encoding entirely onto the GPU and enabling batch processing with flexible Python scripts, iQIYI’s video‑image enhancement service achieved ten‑fold throughput, over 90% GPU utilization, and dramatically lower resource use, accelerating AI video inference deployment.

AI deployment · DeepStream · GPU Optimization
0 likes · 14 min read
Huawei Cloud Developer Alliance
Jun 2, 2020 · Artificial Intelligence

How to Transition into AI: Real Stories, Tools, and a Practical Roadmap

This article shares personal journeys of five Huawei AI experts, recommends essential AI books, walks through setting up PyCharm with ModelArts for hands‑on model training, and outlines a three‑stage AI career roadmap—from practical coding to mastering principles and deploying inference—offering actionable guidance for anyone looking to break into artificial intelligence.

AI career transition · AI deployment · ModelArts
0 likes · 34 min read