Tagged articles

Model Deployment

155 articles · Page 2 of 2

May 18, 2023 · Artificial Intelligence

Local Deployment, Inference, and Fine‑tuning of the Vicuna‑7B Large Language Model

This article details the step‑by‑step process of preparing the environment, merging weights, installing dependencies, running inference, evaluating Vicuna‑7B against other models, and attempting fine‑tuning, while highlighting performance results, encountered issues, and future work for large language model deployment.

GPU · Model Deployment · Vicuna

0 likes · 11 min read


HelloTech

Apr 19, 2023 · Cloud Native

How FaaS Transforms AI Platforms: Lessons from Haro’s Cloud‑Native Journey

The article analyzes the operational, stability, and cost challenges of Haro’s AI platform, explains why a serverless FaaS architecture—specifically Knative—was selected, and details the implementation steps, performance gains, and future scenarios for AI workloads.

AI platform · Cloud Native · FaaS

0 likes · 8 min read


HelloTech

Apr 12, 2023 · Artificial Intelligence

Integrating Machine Learning Ranking into Elasticsearch: Architecture, Components, and Performance

The team embedded a full machine‑learning ranking pipeline as an Elasticsearch plug‑in, combining real‑time and offline feature stores, hot‑loadable model jars via Dragonfly, an MLeap execution engine, and a DSL for feature definition. Replacing the coarse‑ranking logistic‑regression model with a tree model adds ~10 ms of latency but yields a 1.2% lift in A/B tests, while maintaining high throughput and low CPU usage and supporting future batch deep‑learning rescoring.

Model Deployment · feature engineering · online prediction

0 likes · 16 min read


Tencent Advertising Technology

Mar 30, 2023 · Artificial Intelligence

Tencent's Taiji Machine Learning Platform: End-to-End MLOps for Advertising

Tencent’s Taiji machine learning platform, a cloud‑native, distributed parameter‑server system, provides end‑to‑end MLOps for advertising by integrating data ingestion, feature engineering, model training, evaluation, deployment, and monitoring, supporting massive models up to billions of parameters while improving efficiency, scalability, and resource management.

MLOps · Machine Learning Platform · Model Deployment

0 likes · 18 min read


Xianyu Technology

Dec 21, 2022 · Artificial Intelligence

Xianyu Recommendation System: Architecture, Challenges, and Deployment

The Xianyu recommendation system, built by backend expert Wan Xiaoyong, evolved from offline scoring to a full‑graph, serverless recall‑ranking pipeline that tackles C2C uncertainties through centralized feature engineering, model compression, staged deployment, flexible experimentation, robust governance, and plans for automated attribution and interpretability.

.ai · Big Data · Model Deployment

0 likes · 10 min read


Alipay Experience Technology

Dec 9, 2022 · Artificial Intelligence

How Ant Group Scales Mobile AI with Thousand‑by‑Thousand Models

This article reviews Ant Group's "thousand‑by‑thousand" model strategy for mobile AI, detailing the challenges of device compute and user heterogeneity, the engineering upgrades for offline development and online release, and the measurable business improvements achieved.

.ai · Ant Group · Model Deployment

0 likes · 12 min read


SQB Blog

Nov 18, 2022 · Artificial Intelligence

Boosting AI Model Development with Alibaba's EasyModeling Framework

This article introduces the EasyModeling framework built on Alibaba Cloud's PAI platform, detailing its modular design, high reusability, integration with deep‑learning libraries, automated hyper‑parameter tuning, deployment scenarios, and a real‑world case study using RoBERTa for dish‑name standardization, demonstrating significant performance gains.

AI modeling · Alibaba Cloud · AutoML

0 likes · 13 min read


Zuoyebang Tech Team

Nov 17, 2022 · Artificial Intelligence

Scaling Deep Learning Model Serving: High‑Concurrency, Low‑Latency Solutions

This article examines the challenges of deploying dozens of deep‑learning models at Zuoyebang and compares three serving architectures—Gunicorn + Flask + Transformers, Tornado + PyTorch, and Tornado + Triton—highlighting performance trade‑offs and presenting a final high‑concurrency, low‑latency solution in production.

Deep Learning · High concurrency · Inference Serving

0 likes · 11 min read


Efficient Ops

Nov 7, 2022 · Artificial Intelligence

Unlocking AI Project Success with the New MLOps Maturity Assessment

This article outlines the background, standards, evaluation items, process, and registration details of a newly launched MLOps development management maturity assessment designed to accelerate AI model delivery and improve operational efficiency across teams.

AI Operations · MLOps · Model Deployment

0 likes · 6 min read


vivo Internet Technology

Oct 9, 2022 · Artificial Intelligence

vivo Machine Learning Platform: Architecture Design and Practice

vivo’s machine‑learning platform, built for its massive app‑store and e‑commerce ecosystem, streamlines data processing, model training, and deployment through quota‑based resource management, a custom ultra‑large‑scale TensorFlow‑vlps framework, OpenAPI‑driven training, and Jupyter‑integrated interactive development, boosting efficiency for billions of samples and features.

MLOps · Machine Learning Platform · Model Deployment

0 likes · 12 min read


Meituan Technology Team

Sep 22, 2022 · Artificial Intelligence

Quantization Deployment Scheme for YOLOv6: Methods, Optimizations, and Performance Evaluation

The paper proposes a full quantization pipeline for YOLOv6 that combines a re‑parameterization optimizer, partial PTQ, channel‑wise distillation, graph‑scale merging, and GPU‑offloaded preprocessing, enabling an INT8 model to retain ~42 % mAP while delivering over 200 % throughput increase and 40 % QPS gain versus FP16.

Channel Distillation · Model Deployment · PTQ

0 likes · 16 min read


Zuoyebang Tech Team

Sep 15, 2022 · Artificial Intelligence

How We Replaced BERT with a Lightweight TextCNN to Slash GPU Costs

This article describes the production challenges of using BERT for large‑scale text classification at Zuoyebang, explores lightweight alternatives such as knowledge distillation, pruning and quantization, and details a teacher‑student‑active‑learning pipeline that trains a TextCNN model to match BERT performance while dramatically reducing GPU consumption and improving throughput.

Active Learning · BERT · Knowledge Distillation

0 likes · 13 min read


Meituan Technology Team

Jun 16, 2022 · Artificial Intelligence

Edge AI Re‑ranking in Meituan/Dianping Search: Architecture, Algorithms, and Deployment

Meituan/Dianping’s edge‑AI re‑ranking system moves large‑scale deep‑learning models onto users’ devices, using dense networks and cloud‑served embeddings, advanced feedback‑sequence and multi‑view attention models, and aggressive compression to deliver real‑time, privacy‑preserving search personalization that boosts click‑through rates by up to 0.43 %.

Model Deployment · Re‑ranking · edge AI

0 likes · 25 min read


DataFunTalk

Mar 31, 2022 · Artificial Intelligence

Comprehensive Guide to TensorFlow: Modeling, Deployment, and Operations

This article provides an in‑depth overview of the TensorFlow ecosystem, covering Keras modeling productivity tools, classic model examples, AutoKeras and KerasTuner for automated search, data preprocessing pipelines, performance profiling, model optimization, and multiple deployment strategies for servers, browsers, and edge devices.

AutoML · Keras · Model Deployment

0 likes · 20 min read


ELab Team

Mar 16, 2022 · Artificial Intelligence

Reverse Dictionary Made Easy: Harness WantWords and Hugging Face for Quick NLP Model Integration

This article introduces the open‑source WantWords reverse‑dictionary project, explains its token‑based processing pipeline, walks through Python implementation and model invocation with Hugging Face’s Transformers, reviews NLP’s historical evolution, and shows how front‑end developers can quickly integrate NLP models into products.

Artificial Intelligence · BERT · Hugging Face

0 likes · 13 min read


NetEase Cloud Music Tech Team

Mar 9, 2022 · Industry Insights

How NetEase Cloud Music Built a Real‑Time Live‑Stream Recommendation System

This article details the architecture, incremental model training, feature engineering, and deployment strategies that enabled NetEase Cloud Music to achieve real‑time live‑stream recommendation, covering business background, multi‑objective modeling, real‑time feature pipelines, sample attribution, feature admission, and online performance results.

Incremental Learning · Live Streaming · Model Deployment

0 likes · 26 min read


MaGe Linux Operations

Jan 30, 2022 · Artificial Intelligence

PyTorch vs TensorFlow in 2022: Which Framework Wins for Your Needs?

This article compares PyTorch and TensorFlow in 2022 across model availability, deployment ease, and ecosystem support, using data from HuggingFace, research papers, and industry tools, and offers tailored recommendations for industry engineers, researchers, educators, career changers, hobbyists, and beginners.

.ai · Deep Learning · Model Deployment

0 likes · 20 min read


Youzan Coder

Jan 17, 2022 · Artificial Intelligence

Model Deployment Challenges and a Seldon‑Based Cloud‑Native Solution

The team replaced the cumbersome ABox deployment stack with Seldon‑based cloud‑native serving on Kubernetes, unifying TensorFlow and other framework models, adding GPU sharing, automated CRUD, per‑model ingress, monitoring, and log collection, achieving scalable, fault‑tolerant, zero‑downtime model deployment.

AI serving · Cloud Native · GPU

0 likes · 11 min read


DataFunTalk

Nov 21, 2021 · Artificial Intelligence

Design Considerations for Next‑Generation AI Platforms: Programming Languages, Runtime Environment, Scheduler, and Model Deployment

The article examines three key design dimensions of modern AI platforms—programming language choice, runtime environment isolation, and scheduling/resource management—while also discussing challenges in model deployment such as algorithm diversity, resource usage patterns, and architectural generality, proposing Kubernetes‑based solutions and Arrow‑based data sharing to achieve efficient, scalable AI services.

Kubernetes · Model Deployment · Python

0 likes · 14 min read


DataFunSummit

Nov 20, 2021 · Artificial Intelligence

Design Dimensions of Next‑Generation AI Platforms: Programming Languages, Runtime Environments, and Model Deployment

The article examines three key design dimensions of modern AI platforms—choice of programming language, runtime environment isolation, and model deployment—highlighting how Python’s dominance, container‑based resource management, and efficient data sharing shape platform architecture and performance.

AI Platforms · Apache Arrow · Kubernetes

0 likes · 13 min read


Taobao Frontend Technology

Sep 23, 2021 · Artificial Intelligence

Build and Deploy ML Models with Pipcook 2.0 in Under 20 Seconds

Discover how Pipcook 2.0 dramatically speeds up machine‑learning workflows for web developers—cutting installation to under 20 seconds, enabling rapid model training, prediction, and deployment via concise JSON pipelines, with step‑by‑step guidance, code snippets, and practical examples for image and text classification.

AI Pipeline · Model Deployment · Pipcook

0 likes · 12 min read


Baidu Geek Talk

Sep 13, 2021 · Artificial Intelligence

Upgrading WanFang Academic Paper Retrieval System with PaddleNLP

WanFang upgraded its academic paper retrieval system by adopting PaddleNLP’s Chinese pre‑trained Sentence‑BERT models, using weakly supervised SimCSE data and Milvus vector indexing, and compressing the transformer for TensorRT‑accelerated inference, achieving 70% better matching quality and a latency‑optimized throughput of 2600 QPS.

Model Deployment · PaddleNLP · Sentence-BERT

0 likes · 8 min read


Amap Tech

Jun 4, 2021 · Artificial Intelligence

Deploying Multiple CNN Models on Low‑End Devices with MNN: Memory Tricks and Error Debugging

This article explains how a high‑traffic map service captures road features using client‑side computer‑vision models, details the deployment of many CNNs with the lightweight MNN engine on memory‑constrained devices, and shares practical memory‑saving techniques, inference scheduling, and error‑analysis methods.

Android · MNN · Memory optimization

0 likes · 12 min read


Meituan Technology Team

May 13, 2021 · Artificial Intelligence

Design and Practice of Turing OS: An Online Service Framework for Machine Learning and Deep Learning at Meituan

Meituan’s Turing OS unifies the end‑to‑end machine‑learning lifecycle—data preprocessing, feature generation, model training, deployment, online prediction and A/B testing—through a lightweight SDK, plugin‑based algorithms, DAG orchestration, sandbox validation and replay tools, cutting algorithm iteration from days to hours while handling billions of daily predictions.

Algorithm Platform · Model Deployment · Scalable Architecture

0 likes · 31 min read


DataFunSummit

Mar 28, 2021 · Artificial Intelligence

Deploying Scikit‑learn and HMMlearn Models as High‑Performance Online Prediction Services Using ONNX

This article demonstrates how to convert traditional scikit‑learn and hmmlearn machine‑learning models into ONNX format and integrate them into a C++ gRPC service for fast online inference, covering environment setup, model conversion, custom operators, performance testing, and end‑to‑end pipeline construction.

C# · Model Deployment · ONNX

0 likes · 22 min read


360 Quality & Efficiency

Mar 26, 2021 · Operations

Deploying a Code Clone Detection Model with TorchServe

This article explains how to build a code clone detection service using a CodeBERT classification model, create a custom TorchServe handler, package the model with torch-model-archiver, launch the service, and test it with example code pairs to demonstrate clone and non‑clone predictions.

Model Deployment · PyTorch · TorchServe

0 likes · 8 min read


Amap Tech

Feb 1, 2021 · Artificial Intelligence

AMAP-TECH Algorithm Competition: Dynamic Road Condition Analysis Using In-Vehicle Video

The AMAP‑TECH competition challenged participants to infer real‑time road conditions from in‑vehicle video, prompting the authors to combine lane‑wise vehicle detection with LightGBM and later an end‑to‑end DenseNet‑GRU model, augment data, ensemble five networks, and achieve a 0.7237 F1 score while outlining future deployment and research directions.

Deep Learning · Model Deployment · computer vision

0 likes · 15 min read


Suning Technology

Nov 14, 2020 · Artificial Intelligence

Designing Real-Time AI Algorithms for Unmanned Retail Stores

This lecture details the end‑to‑end AI architecture for unmanned stores, covering algorithm module selection, calibration, face recognition, multi‑task detection, tracking, recommendation, data collection, augmentation, model training, and GPU‑accelerated deployment to achieve real‑time performance and high accuracy.

Deep Learning · Model Deployment · Real-time AI

0 likes · 15 min read


360 Tech Engineering

Aug 17, 2020 · Artificial Intelligence

Deploying TensorFlow 2.x Models with TensorFlow Serving: Concepts, Setup, and Usage

This guide explains the core concepts of TensorFlow Serving, shows how to prepare Docker images, save TensorFlow 2.x models in various formats, configure version policies, warm up models, start the service, and invoke it via gRPC or HTTP with complete code examples.

Docker · HTTP · Model Deployment

0 likes · 11 min read


360 Quality & Efficiency

Aug 14, 2020 · Artificial Intelligence

Deploying TensorFlow 2.x Models with TensorFlow Serving: Architecture, Setup, and Usage

This article explains the core concepts of TensorFlow Serving, shows how to prepare the environment with Docker, convert TensorFlow 2.x models to the SavedModel format, configure version policies, warm up the service, and invoke predictions via gRPC or HTTP interfaces.

Docker · HTTP · Model Deployment

0 likes · 11 min read


Laravel Tech Community

May 31, 2020 · Mobile Development

Deploying and Training Deep Learning Models on iOS and Android: Core ML, NNAPI, and TensorFlow Lite

This article explains how to train and deploy convolutional neural networks directly on iOS and Android devices using Core ML, NNAPI, and TensorFlow Lite, compares performance with desktop TensorFlow, and provides practical code snippets and build‑time tips for mobile AI development.

Android · Core ML · Model Deployment

0 likes · 7 min read


Baidu App Technology

May 29, 2020 · Mobile Development

How MML Simplifies Mobile AI Deployment: Architecture, Tools, and Code Walkthrough

This article explains the background of on‑device AI, introduces the Mobile Machine Learning (MML) framework and its layered architecture, details the core utilities such as model decryption and task scheduling, and provides a step‑by‑step code guide for initializing, preprocessing, inference, post‑processing, and releasing resources on mobile platforms.

Android · MML · Model Deployment

0 likes · 9 min read


JD Tech Talk

Feb 13, 2020 · Artificial Intelligence

Full-Process Traceability Management for Machine Learning Models: Challenges, Methods, and Solutions

This article analyzes the challenges of managing the entire machine‑learning lifecycle, reviews existing traceability approaches, and proposes comprehensive methods for versioned management of model training, prediction, and online service to improve efficiency, reproducibility, and maintenance of AI systems.

AI workflow · Model Deployment · Traceability

0 likes · 18 min read


Alibaba Terminal Technology

Jan 3, 2020 · Frontend Development

How Frontend Code Was Auto-Generated for Alibaba’s Double‑11 Event

This article explains how Alibaba's Frontend Intelligent Project automatically generated 79.34% of the Double‑11 page code by recognizing business modules from visual drafts using data‑augmented samples, traditional multi‑class machine‑learning models, and a pipeline of preprocessing, model training, deployment, and OOD handling.

Automation · Model Deployment · code generation

0 likes · 15 min read


UCloud Tech

Dec 10, 2019 · Artificial Intelligence

Train and Deploy a CIFAR‑10 Image Classification Model with UAI Platform

This tutorial walks university students through the complete workflow of using the CIFAR‑10 dataset to train a convolutional neural network for image classification and then deploying the model as an online inference service on the UAI‑Train and UAI‑Inference platforms.

CIFAR-10 · Deep Learning · Docker

0 likes · 6 min read


360 Quality & Efficiency

Dec 6, 2019 · Artificial Intelligence

Deploying YOLO V3 with TensorFlow Serving: Environment Setup, Model Conversion, Service Deployment, and Performance Comparison

This article explains how to prepare the Docker environment, install TensorFlow Serving (CPU and GPU versions), convert a YOLO V3 checkpoint to SavedModel, deploy the model as a service, warm it up, manage versions, invoke it via gRPC and HTTP, and compare CPU versus GPU inference performance.

.ai · Docker · GPU

0 likes · 9 min read


Ctrip Technology

Nov 21, 2019 · Artificial Intelligence

Designing and Deploying an NLP Model for Airline Ticket Customer Service

This article describes the end‑to‑end development of a multi‑class NLP system for Ctrip airline ticket customer service, covering problem analysis, data preprocessing, sample balancing, model architecture (TextCNN and Bi‑GRU), training strategies, performance evaluation, and online customization to achieve high accuracy in intent recognition.

Bi-GRU · Deep Learning · Model Deployment

0 likes · 16 min read


Alibaba Cloud Developer

Sep 3, 2019 · Artificial Intelligence

Unlocking Scalable Private‑Domain Recommendations with a “4+N” Architecture

This article describes a systematic, standardized, and automated “4+N” recommendation framework that unifies features, samples, models, and pipelines to accelerate private‑domain marketing recommendations across multiple scenarios while improving accuracy, efficiency, and business impact.

AI Architecture · Deep Learning · Model Deployment

0 likes · 12 min read


DataFunTalk

Aug 22, 2019 · Artificial Intelligence

End‑to‑End Group Risk Perception Modeling: From Requirement Mining to Deployment

This article presents a comprehensive workflow for group risk perception, covering business requirement mining, data acquisition and understanding, feature engineering, model training and evaluation, deployment, and practical user applications, with detailed objectives, methods, and deliverables for each stage.

Model Deployment · data mining · group behavior analysis

0 likes · 11 min read


360 Tech Engineering

Apr 24, 2019 · Artificial Intelligence

Introduction to Kubeflow and Its Installation Process

This article introduces Kubeflow, explains the typical machine‑learning model lifecycle, outlines Kubeflow’s core components and Kubernetes advantages, provides detailed server and storage configuration, walks through ksonnet and Kubeflow installation steps, and shows how to verify deployments and access the Kubeflow UI.

AI platform · Kubeflow · Kubernetes

0 likes · 6 min read


JD Tech

Mar 8, 2019 · Artificial Intelligence

Integrated Engineering & Algorithm Platform for AI Visual Applications

This article describes a comprehensive, end‑to‑end AI visual algorithm platform that unifies data collection, annotation, model training, deployment, testing, quality evaluation, and service gateways, illustrating how such integration improves transparency, efficiency, and quality across use cases like background removal, face swapping, and clothing recommendation.

.ai · Algorithm Platform · Clothing Recommendation

0 likes · 13 min read


JD Tech

Mar 4, 2019 · Artificial Intelligence

A Practical Guide to H2O AutoML: Installation, Python Workflow, Model Training, and Deployment

This article introduces the open‑source H2O platform, walks through installing the Python package, demonstrates data import, model building with GBM and AutoML, evaluates results, explains model deployment via POJO/MOJO, and discusses the visual Flow UI and broader implications of automated modeling.

AutoML · H2O · Model Deployment

0 likes · 12 min read


JD Tech Talk

Mar 1, 2019 · Artificial Intelligence

Introduction to H2O AutoML: Overview, Practical Workflow, and Model Deployment

This article introduces the open‑source H2O platform, explains how to install and use its Python API for data loading, preprocessing, model training with GBM and AutoML, evaluates results with AUC, and describes model deployment via POJO/MOJO as well as the visual Flow UI, concluding with reflections on the role of automated modeling in data science.

AutoML · H2O · Model Deployment

0 likes · 12 min read


21CTO

Dec 25, 2018 · Artificial Intelligence

Demystifying Learning to Rank: From Core Concepts to Scalable Online Architecture

This article offers a comprehensive, system‑engineer‑focused guide to Learning to Rank, covering fundamental machine‑learning concepts, evaluation metrics, training approaches, and a detailed online ranking architecture with feature, recall, and model governance, illustrated by real‑world examples from Meituan‑Dianping.

A/B testing · Learning-to-Rank · Model Deployment

0 likes · 32 min read


Meituan Technology Team

Oct 11, 2018 · Artificial Intelligence

Deploying and Optimizing TensorFlow Serving for High‑Performance CTR Prediction

Meituan’s user‑growth team built a Wide‑Deep CTR prediction model, trained offline with Spark‑generated TFRecords, and deployed it via TensorFlow Serving on YARN, then applied request‑side multithreading, offline one‑hot preprocessing, XLA JIT compilation, and dedicated loading threads to cut end‑to‑end latency from ~18 ms to ~6 ms and eliminate model‑switch spikes.

Model Deployment · TensorFlow Serving · distributed training

0 likes · 15 min read


Qizhuo Club

Aug 17, 2018 · Artificial Intelligence

43 Essential Rules for Building Robust Machine Learning Systems

These 43 practical rules, adapted from Martin Zinkevich’s “Rules of ML,” guide engineers through terminology, pipeline design, feature engineering, monitoring, and model deployment, offering actionable advice to avoid common pitfalls and build reliable, scalable machine‑learning‑driven products.

Model Deployment · best practices · engineering

0 likes · 41 min read


Architecture Digest

Jul 29, 2018 · Artificial Intelligence

Design and Implementation of a Machine Learning Data Platform at Getui

This article describes Getui's end‑to‑end machine‑learning data platform, covering business use cases, the full ML workflow from data ingestion and feature engineering to model training, deployment, monitoring, and the practical tools and solutions adopted to address common challenges in large‑scale AI projects.

.ai · Data Platform · Jupyter

0 likes · 11 min read


Architecture Digest

Jul 27, 2018 · Artificial Intelligence

Comprehensive Guide to Deploying Deep Learning Models in Production

This article provides a step‑by‑step tutorial on deploying trained deep‑learning models to production, covering client‑server architecture, load balancing with Nginx, using Gunicorn and Flask, cloud platform choices, autoscaling, CI/CD pipelines, and additional tools such as TensorFlow Serving and Docker.

API · Cloud Computing · Docker

0 likes · 11 min read


Tencent Cloud Developer

May 29, 2018 · Artificial Intelligence

Intelligent Titanium TI-ONE: Tencent Cloud's One-Stop Machine Learning IDE

Intelligent Titanium TI-ONE is a one‑stop ML IDE on Tencent Cloud offering integrated data preparation, drag‑and‑drop algorithm development, automatic hyperparameter tuning, multi‑level collaboration, one‑click model deployment, and support for major frameworks such as TensorFlow, PyTorch, Angel and XGBoost, plus commercial features via GaiaStack.

AI platform · Machine Learning Platform · Model Deployment

0 likes · 10 min read


UCloud Tech

Jul 20, 2017 · Artificial Intelligence

Build a Real-Time Facial Expression Recognition Service with UCloud AI-as-a-Service

This guide walks you through training an Inception‑V3 model on the FER2013 dataset with TensorFlow 1.1, packaging the model, and deploying a scalable facial expression recognition API using UCloud's AI‑as‑a‑Service platform, including performance testing against GPU benchmarks.

.ai · Facial Expression Recognition · Model Deployment

0 likes · 11 min read


Qunar Tech Salon

May 15, 2017 · Artificial Intelligence

Building an Algorithm Platform for Machine Learning Deployment at Qunar

The article describes how a three‑stage algorithm platform was designed and implemented to automate model deployment, unify feature processing, and provide service‑oriented model evaluation, debugging, and monitoring for machine‑learning applications in a large e‑commerce environment.

AI services · Algorithm Platform · Model Deployment

0 likes · 10 min read


Ctrip Technology

May 6, 2017 · Artificial Intelligence

Building an Algorithm Platform: Deployment Challenges, Feature Processing, and Serviceization

The article describes how Ctrip's algorithm platform was built in three stages to address deployment friction, reusable feature engineering, and model training, detailing the technical problems, Java/Python integration, code interfaces, transform configurations, and the eventual service‑oriented architecture.

Algorithm Platform · Model Deployment · java

0 likes · 10 min read


Qunar Tech Salon

Jan 24, 2017 · Artificial Intelligence

Practical Approaches to Deploying Machine Learning Models: Real‑time SOA, PMML, Rserve, and Spark

This article shares practical engineering experiences for deploying machine learning models in various scenarios—real‑time low‑volume predictions via Rserve or Python‑httpserve, high‑throughput real‑time serving with PMML‑wrapped Java classes, and offline batch predictions using simple shell scripts—detailing tools, performance considerations, and implementation steps.

Artificial Intelligence · Model Deployment · PMML

0 likes · 11 min read


Ctrip Technology

Jan 5, 2017 · Artificial Intelligence

Practical Approaches to Deploying Machine Learning Models: PMML, Rserve, and Spark in Production

This article shares practical engineering experiences for deploying machine learning models in production, covering three typical scenarios—real‑time small data, real‑time large data, and offline predictions—and detailing how to use PMML, Rserve, Spark, shell scripts, and related tools to meet performance and operational requirements.

Model Deployment · PMML · Rserve

0 likes · 12 min read


dbaplus Community

Dec 25, 2015 · Artificial Intelligence

Detecting Fraudulent ModemPOOL Terminals with K‑Means Clustering

This article details how telecom operators can identify fraudulent ModemPOOL (cat‑pool) terminals and predict churn using data‑driven clustering and day‑interval warning models, covering metric selection, data exploration, k‑means clustering, model deployment, and performance evaluation.

Clustering · K-Means · Model Deployment

0 likes · 18 min read
