Tag

AI deployment

0 views on this tag.

Architect's Guide
May 2, 2025 · Artificial Intelligence

Deploying a Local High‑Performance AI Service with Spring AI, Ollama, Redis, and Docker

This tutorial walks developers through setting up a low‑cost, containerized AI service on Windows by installing Docker, deploying Redis and Ollama containers, pulling the DeepSeek‑R1 model, and integrating everything with Spring AI to enable continuous conversation support.

AI deployment · Docker · Java
0 likes · 12 min read
ByteDance Cloud Native
Apr 9, 2025 · Artificial Intelligence

How to Deploy ComfyUI Cluster Edition on Volcengine for Multi‑User AI Workflows

This guide explains how to launch the ComfyUI Cluster Edition on Volcengine, covering its enterprise features such as multi‑user collaboration, resource isolation, built‑in plugins, flexible mounting, and step‑by‑step deployment using VKE, CP, and API Gateway to enable efficient, scalable AI image generation.

AI deployment · ComfyUI · Multi-user collaboration
0 likes · 10 min read
Architect
Apr 1, 2025 · Artificial Intelligence

When to Fine‑Tune Large Language Models vs. Relying on Prompting and RAG

The article explains why most projects should start with prompt engineering or simple agent workflows, outlines the scenarios where model fine‑tuning adds real value, compares fine‑tuning with Retrieval‑Augmented Generation, and offers practical criteria for deciding which approach to adopt.

AI deployment · LoRA · Prompt Engineering
0 likes · 9 min read
Java Tech Enthusiast
Feb 15, 2025 · Artificial Intelligence

DeepSeek-R1: High-Performance AI Inference Model

DeepSeek‑R1 is a high‑performance AI inference model that leverages reinforcement‑learning techniques to boost reasoning on complex tasks, has become a Chinese‑New‑Year sensation, and requires substantial hardware resources for local deployment, especially the full‑scale 671‑billion‑parameter version.

AI deployment · AI inference · AI model
0 likes · 4 min read
JD Tech Talk
Feb 12, 2025 · Artificial Intelligence

Deploying a Private DeepSeek Large Language Model on JD Cloud with Ollama and Knowledge‑Base Tools

This guide explains how to privately deploy the DeepSeek large language model using a JD Cloud virtual computer, set up Ollama as the LLM service, run various model versions, and integrate local knowledge bases through CherryStudio, Page Assist, and AnythingLLM for offline and network‑enabled AI applications.

AI deployment · DeepSeek · JD Cloud
0 likes · 16 min read
Architecture and Beyond
Nov 23, 2024 · Artificial Intelligence

A Comprehensive Overview of AIGC Engineering Architecture and Its Core Roles

This article examines the AIGC engineering architecture, detailing its data, model, fine‑tuning, inference, application, and monitoring layers, and explains the distinct responsibilities and challenges of application engineers, algorithm engineers, and “alchemy” specialists, highlighting how this structured approach accelerates generative AI productization.

AI deployment · AIGC · Engineering Architecture
0 likes · 24 min read
AntTech
Sep 6, 2024 · Artificial Intelligence

Large Model Industry Trustworthy Application Framework Research Report

Ant Group and the China Academy of Information and Communications Technology released a research report outlining a trustworthy application framework for large models in rigorous sectors such as finance and healthcare, detailing technical safeguards, industry case studies, and guidance for scalable, secure AI deployment.

AI deployment · Healthcare AI · Large Models
0 likes · 3 min read
ByteDance Cloud Native
Aug 12, 2024 · Cloud Native

How to Deploy NVIDIA NIM AI Models on Volcengine VKE in Minutes

This guide walks you through deploying large language models with NVIDIA NIM on Volcengine's Kubernetes Engine (VKE), covering environment setup, model optimization, Helm chart deployment, monitoring integration, and the key advantages of using NIM as a cloud‑native AI micro‑service.

AI deployment · Cloud Native · GPU
0 likes · 12 min read
ByteDance Cloud Native
Aug 7, 2024 · Artificial Intelligence

Deploy Stable Diffusion in 5 Minutes with Volcengine’s Continuous Delivery CP

Learn how to quickly launch a Stable Diffusion WebUI service in just five minutes using Volcengine’s cloud‑native continuous delivery platform, which abstracts Kubernetes complexities, provides pre‑configured AI templates, serverless VCI deployment, automatic scaling, API gateway access, and includes a Python client for image generation.

AI deployment · Cloud Native · Continuous Delivery
0 likes · 14 min read
DevOps
Apr 18, 2024 · Artificial Intelligence

Expert Round‑Table on AIGC: Technology vs. Market Beliefs, Domestic Model Challenges, and Enterprise Deployment in China

The article presents a 2024 AIGC round‑table where Chinese experts discuss whether to follow a technology‑first or market‑first approach, the challenges of compute, algorithms and data, domestic versus foreign large‑model strategies, multi‑model deployment in enterprises, and criteria for evaluating successful AIGC applications.

AI deployment · AIGC · China AI
0 likes · 14 min read
DataFunTalk
Dec 19, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment and Data Governance: Insights from Deepexi’s President

The article examines how enterprises can adopt domain‑specific large models by balancing demand‑side cost‑reduction needs with supply‑side mature training techniques, discusses team composition, fine‑tuning methods, data governance for unstructured data, and outlines Deepexi’s product ecosystem designed to improve efficiency, performance, and user experience.

AI deployment · cost economics · data governance
0 likes · 13 min read
DataFunSummit
Dec 16, 2023 · Artificial Intelligence

Enterprise Large Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article examines how enterprises can adopt domain‑specific large models by addressing talent and cost challenges, outlining self‑supervised pre‑training, instruction fine‑tuning, data governance for unstructured data, dataset balance, model‑type selection, and integrated product solutions to achieve efficient, high‑performance AI deployments.

AI deployment · Large Models · cost economics
0 likes · 13 min read
DataFunSummit
Dec 13, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article explores how enterprises can adopt domain‑specific large language models by addressing talent and cost challenges, outlining training pipelines, data governance for unstructured data, dataset balancing, fine‑tuning techniques, and a product ecosystem that lowers deployment barriers while optimizing performance and economics.

AI deployment · cost economics · data governance
0 likes · 13 min read
DataFunTalk
Oct 20, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Practices, and Optimizations

This article describes how Du Xiaoman tackled the high cost, instability, and long cycles of AI algorithm deployment by building the ATLAS automated machine learning platform, detailing its four‑stage workflow, component platforms, scaling and efficiency techniques, and practical Q&A for practitioners.

AI deployment · AutoML · Data Parallelism
0 likes · 22 min read
Tencent Cloud Developer
May 24, 2023 · Artificial Intelligence

Deploying Stable Diffusion on Tencent Cloud: A Step‑by‑Step Guide

Deploy Stable Diffusion on Tencent Cloud by building a Docker image, pushing it to TCR, creating a GPU‑enabled TKE cluster with CFS storage, configuring qGPU sharing, exposing the service via Cloud Native API Gateway, optimizing inference with TACO Kit, storing results in COS, and applying content moderation.

AI deployment · GPU · Kubernetes
0 likes · 19 min read
DataFunTalk
Feb 18, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Optimization, and Practical Insights

This article details Du Xiaoman's development of the ATLAS automated machine learning platform, covering business scenarios, AI algorithm deployment challenges, the end‑to‑end production workflow, platform components such as annotation, data, training and deployment, as well as optimization techniques like AutoML, meta‑learning, NAS, and large‑scale parallelism, concluding with lessons learned and future directions.

AI deployment · AutoML · Data Engineering
0 likes · 20 min read
Baidu Geek Talk
Apr 13, 2022 · Artificial Intelligence

Smart Retail Product Recognition Solution Using PaddlePaddle PP-ShiTu

The article presents PaddlePaddle’s PP‑ShiTu‑based smart retail product recognition solution, detailing a complete pipeline, from data preparation and model optimization to low‑latency deployment, that overcomes high‑similarity packaging, rapid SKU changes, and costly retraining, achieving over 98% Top‑1 recall with 0.2‑second CPU inference.

AI deployment · PP-ShiTu · PaddlePaddle
0 likes · 7 min read
DataFunTalk
Sep 14, 2021 · Artificial Intelligence

AI Model Deployment on Edge Devices: Adaptation, Optimization, and Continuous Iteration – Interview Insights

The article shares a programmer's interview experience at Baidu, discussing how to adapt AI algorithms for edge deployment, balance model performance and efficiency, apply model compression techniques, and continuously iterate models, while also promoting an upcoming AI deployment online course.

AI deployment · Edge Computing · framework support
0 likes · 6 min read
iQIYI Technical Product Team
Jul 3, 2020 · Artificial Intelligence

Optimizing Video Inference Services for High GPU Utilization in AI Applications

By moving decoding, color conversion, preprocessing, inference, and re‑encoding entirely onto the GPU and enabling batch processing with flexible Python scripts, iQIYI’s video‑image enhancement service achieved ten‑fold throughput, over 90% GPU utilization, and dramatically lower resource use, accelerating AI video inference deployment.

AI deployment · DeepStream · FFmpeg
0 likes · 14 min read
360 Zhihui Cloud Developer
Sep 14, 2017 · Artificial Intelligence

Running TensorFlow on Kubernetes: A Practical Guide to Scalable AI Workloads

This article explains how to deploy TensorFlow on Kubernetes, addressing resource isolation, GPU scheduling, and distributed training challenges by introducing a custom TensorFlow‑on‑K8s system with client, task, and autospec modules, plus container design for reliable job execution.

AI deployment · GPU scheduling · Kubernetes
0 likes · 9 min read