Serverless AI Inference with TensorFlow Serving on Tencent Cloud SCF
This tutorial shows how to package a TensorFlow SavedModel for MNIST, upload it to Tencent Cloud Object Storage, create a Python 2.7 SCF function that loads the SavedModel with TensorFlow, and expose it via API Gateway as a scalable, serverless AI inference endpoint.
This article provides a step‑by‑step tutorial on how to deploy an AI inference service using TensorFlow Serving inside Tencent Cloud's Serverless Cloud Function (SCF) platform. It explains the concept of AI serving, where a trained model is loaded into a production environment to perform inference on incoming data.
Overview: In typical AI projects, most of the effort goes into model training and tuning, but the final deployment step, AI serving, requires the model to be accessible via an API so that downstream applications can request predictions. Because SCF scales automatically with request volume, the service can absorb variable traffic while minimizing idle resource waste.
Model Preparation: The tutorial uses the MNIST dataset as an example. After training a TensorFlow model, it is exported as a SavedModel (a saved_model.pb file plus a variables directory) and placed under an export/4 directory.
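Before zipping the package, it is worth confirming the exported directory has the minimal SavedModel layout described above. A small sketch of such a check; the validate_savedmodel helper and the temporary directory are illustrative additions, not part of the original code:

```python
import os
import tempfile

def validate_savedmodel(model_dir):
    """Check the minimal SavedModel layout: saved_model.pb plus a variables/ dir."""
    pb_path = os.path.join(model_dir, "saved_model.pb")
    var_dir = os.path.join(model_dir, "variables")
    return os.path.isfile(pb_path) and os.path.isdir(var_dir)

# Mimic the export/4 directory described in the article with placeholder files.
root = tempfile.mkdtemp()
model_dir = os.path.join(root, "export", "4")
os.makedirs(os.path.join(model_dir, "variables"))
open(os.path.join(model_dir, "saved_model.pb"), "w").close()

print(validate_savedmodel(model_dir))  # True
```

A failed check here usually means the zip was built from the wrong directory level, which is a common cause of "model not found" errors at function startup.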
Function Code:

#!/usr/bin/env python2.7
import os, sys, base64, urllib, json
import tensorflow as tf
import numpy as np
import utils

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

cur_dir = os.getcwd()
model_dir = cur_dir + "/export/4"
sess = tf.Session(graph=tf.Graph())
meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_dir)
x = sess.graph.get_tensor_by_name('x:0')
y = sess.graph.get_tensor_by_name('y:0')

def run(event):
    # ... (handle base64 or URL image, preprocess, run inference)
    return True, result

def demo(event, context):
    res, num = run(event)
    return num

def apigw_interface(event, context):
    if 'requestContext' not in event:
        return {"errMsg": "request not from api gw"}
    body_info = json.loads(event['body'])
    _, num = run(body_info)
    return {"result": num}

The utils.py module provides helper functions for downloading images and converting them into the required input array:
#!/usr/bin/env python2.7
import urllib2
import numpy as np
from PIL import Image

def download_image(img_url, local_file):
    # download image and save to local_file
    return True

def get_image_array(img_path):
    # resize, threshold, normalize and reshape to (1, 784)
    return arr

Packaging and Deployment: All source files (mnist.py, utils.py, the exported model, and required libraries such as PIL) are placed in a directory mnist_demo and zipped. The zip package is uploaded to Tencent Cloud Object Storage (COS) and referenced in the SCF console as the function code source.
Function Creation: In the SCF console, a new function (e.g., testai) is created with runtime Python 2.7, code source set to the COS zip package, and the handler specified as mnist.apigw_interface.
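The handler string follows a "file.function" convention, which a serverless runtime typically resolves by importing the module and looking up the attribute. A sketch of that idea; the resolve_handler helper and the stand-in mnist module are illustrative, not SCF's actual implementation:

```python
import sys
import types

def resolve_handler(handler_str):
    """Split 'module.function' (e.g. 'mnist.apigw_interface') and look the
    function up on the imported module, as a serverless runtime would."""
    module_name, func_name = handler_str.rsplit(".", 1)
    module = __import__(module_name)
    return getattr(module, func_name)

# Register a stand-in 'mnist' module so the example is self-contained.
mnist = types.ModuleType("mnist")
mnist.apigw_interface = lambda event, context: {"result": 0}
sys.modules["mnist"] = mnist

handler = resolve_handler("mnist.apigw_interface")
print(handler({}, None))  # {'result': 0}
```

This is why the zip must contain mnist.py at its top level: if the file sits one directory deeper, the runtime cannot import the module named in the handler string.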
Testing the Function: A test event JSON is provided, containing either an image_base64 field or an image_url field. The function decodes or downloads the image, runs the TensorFlow model, and returns a JSON result such as {"result": 0}.
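A minimal sketch of building both test-event variants and round-tripping the base64 one; the placeholder bytes and the example URL are stand-ins, not real image data:

```python
import base64
import json

# Placeholder bytes standing in for a real PNG of a handwritten digit.
fake_png = b"\x89PNG...not-a-real-image"

# Variant 1: inline image, as the function's image_base64 field expects.
event_b64 = {"image_base64": base64.b64encode(fake_png).decode("ascii")}

# Variant 2: remote image (hypothetical URL).
event_url = {"image_url": "https://example.com/digit.png"}

# The function's first step for variant 1 is decoding back to image bytes:
decoded = base64.b64decode(event_b64["image_base64"])
print(decoded == fake_png)  # True
print(json.dumps(event_url))
```

Pasting either JSON object into the SCF console's test-event editor exercises the demo handler directly, before any API Gateway wiring.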
API Gateway Integration: An API Gateway service is created with the testai function as its backend. The API endpoint (POST /ai) forwards requests to the function and returns its response as JSON. The API can be published to a public environment and invoked with tools like Postman.
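When the request arrives through API Gateway, the client's JSON body is wrapped inside an event that carries a requestContext, which is what the apigw_interface handler checks for. A sketch of that wrapping, assuming a simplified event shape (no network call is made; the endpoint path and payload are illustrative):

```python
import json

# Body a client such as Postman would POST to the /ai endpoint.
client_body = json.dumps({"image_url": "https://example.com/digit.png"})

# API Gateway wraps the request before invoking the function; the exact
# event fields here are a simplified assumption of the gateway's format.
event = {
    "requestContext": {"path": "/ai", "httpMethod": "POST"},
    "body": client_body,
}

# The handler's json.loads(event['body']) step recovers the original payload:
payload = json.loads(event["body"])
print(payload["image_url"])
```

This also explains the guard clause in the handler: a console test event has no requestContext, so the function can tell gateway traffic apart from direct invocations.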
Summary: By combining TensorFlow model serving with serverless cloud functions, AI inference can be offered as a scalable, cost-effective API without managing servers. The approach supports both CPU and GPU execution, enabling rapid deployment and easy model updates.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.