Tag

AI deployment

0 views on this tag.

Architect's Guide
May 2, 2025 · Artificial Intelligence

Deploying a Local High‑Performance AI Service with Spring AI, Ollama, Redis, and Docker

This tutorial walks developers through setting up a low‑cost, containerized AI service on Windows by installing Docker, deploying Redis and Ollama containers, pulling the DeepSeek‑R1 model, and integrating everything with Spring AI to enable continuous conversation support.

AI deployment · Docker · Java
0 likes · 12 min read
ByteDance Cloud Native
Apr 9, 2025 · Artificial Intelligence

How to Deploy ComfyUI Cluster Edition on Volcengine for Multi‑User AI Workflows

This guide explains how to launch the ComfyUI Cluster Edition on Volcengine, covering its enterprise features such as multi‑user collaboration, resource isolation, built‑in plugins, flexible mounting, and step‑by‑step deployment using VKE, CP, and API Gateway to enable efficient, scalable AI image generation.

AI deployment · ComfyUI · Multi-user collaboration
0 likes · 10 min read
Architect
Apr 1, 2025 · Artificial Intelligence

When to Fine‑Tune Large Language Models vs. Relying on Prompting and RAG

The article explains why most projects should start with prompt engineering or simple agent workflows, outlines the scenarios where model fine‑tuning adds real value, compares fine‑tuning with Retrieval‑Augmented Generation, and offers practical criteria for deciding which approach to adopt.

AI deployment · LoRA · Prompt Engineering
0 likes · 9 min read
Java Tech Enthusiast
Feb 15, 2025 · Artificial Intelligence

DeepSeek-R1: High-Performance AI Inference Model

DeepSeek‑R1 is a high‑performance AI inference model that leverages reinforcement‑learning techniques to boost reasoning on complex tasks, has become a Chinese‑New‑Year sensation, and requires substantial hardware resources for local deployment, especially the full‑scale 671‑billion‑parameter version.

AI deployment · AI inference · AI model
0 likes · 4 min read
JD Tech Talk
Feb 12, 2025 · Artificial Intelligence

Deploying a Private DeepSeek Large Language Model on JD Cloud with Ollama and Knowledge‑Base Tools

This guide explains how to privately deploy the DeepSeek large language model using a JD Cloud virtual computer, set up Ollama as the LLM service, run various model versions, and integrate local knowledge bases through CherryStudio, Page Assist, and AnythingLLM for offline and network‑enabled AI applications.

AI deployment · DeepSeek · JD Cloud
0 likes · 16 min read
Architecture and Beyond
Nov 23, 2024 · Artificial Intelligence

A Comprehensive Overview of AIGC Engineering Architecture and Its Core Roles

This article examines the AIGC engineering architecture, detailing its data, model, fine‑tuning, inference, application, and monitoring layers, and explains the distinct responsibilities and challenges of application engineers, algorithm engineers, and “alchemy” specialists, highlighting how this structured approach accelerates generative AI productization.

AI deployment · AIGC · Engineering Architecture
0 likes · 24 min read
AntTech
Sep 6, 2024 · Artificial Intelligence

Large Model Industry Trustworthy Application Framework Research Report

Ant Group and the China Academy of Information and Communications Technology released a research report outlining a trustworthy application framework for large models in rigorous sectors such as finance and healthcare, detailing technical safeguards, industry case studies, and guidance for scalable, secure AI deployment.

AI deployment · Healthcare AI · Large Models
0 likes · 3 min read
ByteDance Cloud Native
Aug 12, 2024 · Cloud Native

How to Deploy NVIDIA NIM AI Models on Volcengine VKE in Minutes

This guide walks you through deploying large language models with NVIDIA NIM on Volcengine's Kubernetes Engine (VKE), covering environment setup, model optimization, Helm chart deployment, monitoring integration, and the key advantages of using NIM as a cloud‑native AI micro‑service.

AI deployment · Cloud Native · GPU
0 likes · 12 min read
ByteDance Cloud Native
Aug 7, 2024 · Artificial Intelligence

Deploy Stable Diffusion in 5 Minutes with Volcengine’s Continuous Delivery CP

Learn how to quickly launch a Stable Diffusion WebUI service in just five minutes using Volcengine’s cloud‑native continuous delivery platform, which abstracts Kubernetes complexities, provides pre‑configured AI templates, serverless VCI deployment, automatic scaling, API gateway access, and includes a Python client for image generation.

AI deployment · Cloud Native · Continuous Delivery
0 likes · 14 min read
DevOps
Apr 18, 2024 · Artificial Intelligence

Expert Round‑Table on AIGC: Technology vs. Market Beliefs, Domestic Model Challenges, and Enterprise Deployment in China

The article presents a 2024 AIGC round‑table where Chinese experts discuss whether to follow a technology‑first or market‑first approach, the challenges of compute, algorithms and data, domestic versus foreign large‑model strategies, multi‑model deployment in enterprises, and criteria for evaluating successful AIGC applications.

AI deployment · AIGC · China AI
0 likes · 14 min read
DataFunTalk
Dec 19, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment and Data Governance: Insights from Deepexi’s President

The article examines how enterprises can adopt domain‑specific large models by balancing demand‑side cost‑reduction needs with supply‑side mature training techniques, discusses team composition, fine‑tuning methods, data governance for unstructured data, and outlines Deepexi’s product ecosystem designed to improve efficiency, performance, and user experience.

AI deployment · cost economics · data governance
0 likes · 13 min read
DataFunSummit
Dec 16, 2023 · Artificial Intelligence

Enterprise Large Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article examines how enterprises can adopt domain‑specific large models by addressing talent and cost challenges, outlining self‑supervised pre‑training, instruction fine‑tuning, data governance for unstructured data, dataset balance, model‑type selection, and integrated product solutions to achieve efficient, high‑performance AI deployments.

AI deployment · Large Models · cost economics
0 likes · 13 min read
DataFunSummit
Dec 13, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article explores how enterprises can adopt domain‑specific large language models by addressing talent and cost challenges, outlining training pipelines, data governance for unstructured data, dataset balancing, fine‑tuning techniques, and a product ecosystem that lowers deployment barriers while optimizing performance and economics.

AI deployment · cost economics · data governance
0 likes · 13 min read
DataFunTalk
Oct 20, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Practices, and Optimizations

This article describes how Du Xiaoman tackled the high cost, instability, and long cycles of AI algorithm deployment by building the ATLAS automated machine learning platform, detailing its four‑stage workflow, component platforms, scaling and efficiency techniques, and practical Q&A for practitioners.

AI deployment · AutoML · Data Parallelism
0 likes · 22 min read
Tencent Cloud Developer
May 24, 2023 · Artificial Intelligence

Deploying Stable Diffusion on Tencent Cloud: A Step‑by‑Step Guide

Deploy Stable Diffusion on Tencent Cloud by building a Docker image, pushing it to TCR, creating a GPU‑enabled TKE cluster with CFS storage, configuring qGPU sharing, exposing the service via Cloud Native API Gateway, optimizing inference with TACO Kit, storing results in COS, and applying content moderation.

AI deployment · GPU · Kubernetes
0 likes · 19 min read
DataFunTalk
Feb 18, 2023 · Artificial Intelligence

Building the ATLAS Automated Machine Learning Platform at Du Xiaoman: Architecture, Optimization, and Practical Insights

This article details Du Xiaoman's development of the ATLAS automated machine learning platform, covering business scenarios, AI algorithm deployment challenges, the end‑to‑end production workflow, platform components such as annotation, data, training and deployment, as well as optimization techniques like AutoML, meta‑learning, NAS, and large‑scale parallelism, concluding with lessons learned and future directions.

AI deployment · AutoML · Data Engineering
0 likes · 20 min read
Baidu Geek Talk
Apr 13, 2022 · Artificial Intelligence

Smart Retail Product Recognition Solution Using PaddlePaddle PP-ShiTu

The article presents PaddlePaddle’s PP‑ShiTu‑based smart retail product recognition solution, detailing a complete pipeline, from data preparation and model optimization to low‑latency deployment, that overcomes high‑similarity packaging, rapid SKU changes, and costly retraining, achieving over 98% Top‑1 recall with 0.2‑second CPU inference.

AI deployment · PP-ShiTu · PaddlePaddle
0 likes · 7 min read
DataFunTalk
Sep 14, 2021 · Artificial Intelligence

AI Model Deployment on Edge Devices: Adaptation, Optimization, and Continuous Iteration – Interview Insights

The article shares a programmer's interview experience at Baidu, discussing how to adapt AI algorithms for edge deployment, balance model performance and efficiency, apply model compression techniques, and continuously iterate models, while also promoting an upcoming AI deployment online course.

AI deployment · Edge Computing · framework support
0 likes · 6 min read
iQIYI Technical Product Team
Jul 3, 2020 · Artificial Intelligence

Optimizing Video Inference Services for High GPU Utilization in AI Applications

By moving decoding, color conversion, preprocessing, inference, and re‑encoding entirely onto the GPU and enabling batch processing with flexible Python scripts, iQIYI’s video‑image enhancement service achieved ten‑fold throughput, over 90% GPU utilization, and dramatically lower resource use, accelerating AI video inference deployment.

AI deployment · DeepStream · FFmpeg
0 likes · 14 min read
360 Zhihui Cloud Developer
Sep 14, 2017 · Artificial Intelligence

Running TensorFlow on Kubernetes: A Practical Guide to Scalable AI Workloads

This article explains how to deploy TensorFlow on Kubernetes, addressing resource isolation, GPU scheduling, and distributed training challenges by introducing a custom TensorFlow‑on‑K8s system with client, task, and autospec modules, plus container design for reliable job execution.

AI deployment · GPU scheduling · Kubernetes
0 likes · 9 min read