Deploy Langchain‑ChatGLM on Volcengine VKE: A Step‑by‑Step Cloud‑Native Guide
This tutorial walks through preparing a VKE cluster, pulling the Langchain‑ChatGLM container image, creating the Deployment and Service resources, and adding a local knowledge base. The result is a Langchain‑based ChatGLM service with GPU support running on Volcengine's cloud‑native platform.
Langchain is a framework for building applications with large language models, providing components to connect LLMs with external data sources. This article explains how to deploy Langchain‑ChatGLM on Volcengine’s VKE platform.
What is Langchain‑ChatGLM?
Langchain‑ChatGLM combines the Langchain framework with the ChatGLM‑6B model, enabling conversational AI with knowledge base integration.
Step 1: Prepare VKE Cluster
Log in to the Volcengine console and create a VKE cluster (Kubernetes 1.24) with VPC‑CNI networking and public access enabled. Add GPU nodes (ecs.gni2.3xlarge, NVIDIA A10) and install the nvidia‑device‑plugin component so the GPUs are schedulable.
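Before deploying, it is worth confirming that the nvidia‑device‑plugin has registered the GPU on the new node. A minimal sketch, assuming kubectl is already configured against the VKE cluster (it falls back to printing the command when kubectl is unavailable):

```shell
# List each node with its allocatable GPU count; the A10 node should show "1".
GPU_COLS='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes -o custom-columns="$GPU_COLS" \
    || echo "no cluster reachable from this host"
else
  echo "kubectl not found; run: kubectl get nodes -o custom-columns=$GPU_COLS"
fi
```

An empty GPU column usually means the device plugin is not running on that node yet.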
Step 2: Download Code, Model and Build Image
The code and models are open‑source and can be obtained from GitHub and HuggingFace:
Code: https://github.com/imClumsyPanda/langchain-ChatGLM
ChatGLM‑6B model: https://huggingface.co/THUDM/chatglm-6b
Embedding model: https://huggingface.co/GanymedeNil/text2vec-large-chinese
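The three repositories above can be fetched with git; a sketch, noting that the model repos are very large (git‑lfs is required for the weights), so the download is gated behind `DO_DOWNLOAD`, a knob invented for this example:

```shell
# Clone the code repo and the two model repos listed above.
REPOS="https://github.com/imClumsyPanda/langchain-ChatGLM.git
https://huggingface.co/THUDM/chatglm-6b
https://huggingface.co/GanymedeNil/text2vec-large-chinese"
if [ "${DO_DOWNLOAD:-no}" = "yes" ] && command -v git >/dev/null 2>&1; then
  git lfs install                          # needed to pull the model weights
  for r in $REPOS; do git clone "$r"; done
else
  printf 'would clone:\n%s\n' "$REPOS"     # dry run when DO_DOWNLOAD is unset
fi
```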
A pre‑built container image (cr-demo-cn-beijing.cr.volces.com/vke-ai/langchain-chatglm:v0.0.1) containing the models (~24 GB) is provided for quick deployment.
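Given the image size, pre‑pulling it on a build host is one way to verify registry access before scheduling. A sketch, gated behind `PREPULL`, a hypothetical opt‑in knob for this example:

```shell
# Optionally pre-pull the ~24 GB pre-built image to check registry access.
IMAGE="cr-demo-cn-beijing.cr.volces.com/vke-ai/langchain-chatglm:v0.0.1"
if [ "${PREPULL:-no}" = "yes" ] && command -v docker >/dev/null 2>&1; then
  docker pull "$IMAGE"
else
  echo "skipping pull of $IMAGE; VKE nodes pull it on first scheduling"
fi
```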
Step 3: Create Langchain‑ChatGLM Service
In the VKE console, create a Deployment named langchain-new with one replica, using the provided image and requesting one GPU:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langchain-new
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langchain-new
  template:
    metadata:
      labels:
        app: langchain-new
    spec:
      containers:
        - image: cr-demo-cn-beijing.cr.volces.com/vke-ai/langchain-chatglm:v0.0.1
          name: langchain
          resources:
            limits:
              nvidia.com/gpu: "1"
```

Expose the deployment with a LoadBalancer Service on port 80 mapping to container port 7860:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: langchain-new
spec:
  ports:
    - name: langchain
      port: 80
      protocol: TCP
      targetPort: 7860
  selector:
    app: langchain-new
  type: LoadBalancer
```

Step 4: Add a Local Knowledge Base
In the running web UI, switch to "knowledge base Q&A" mode, create a new knowledge base (the name must not contain Chinese characters), and upload files or folders; Langchain then indexes the uploaded content automatically.
Final Demonstration
After the Service is created, open its external IP in a browser to interact with the Langchain‑ChatGLM web UI; the service can alternatively be exposed through an ALB Ingress or API Gateway.
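The Step 3 manifests can be applied and smoke‑tested from the command line. A minimal sketch, assuming the Deployment and Service YAML above are saved locally as deployment.yaml and service.yaml (file names chosen for this example) and kubectl targets the VKE cluster:

```shell
# Apply the manifests, wait for the GPU pod, then probe the CLB endpoint.
SVC=langchain-new
if command -v kubectl >/dev/null 2>&1 \
   && [ -f deployment.yaml ] && [ -f service.yaml ]; then
  kubectl apply -f deployment.yaml -f service.yaml
  kubectl rollout status deployment/"$SVC" --timeout=600s
  # The CLB address is populated once the load balancer is provisioned.
  EXTERNAL_IP=$(kubectl get service "$SVC" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  curl -fsS -o /dev/null -w 'HTTP %{http_code}\n' "http://$EXTERNAL_IP/"
else
  echo "kubectl or manifest files missing; run from a configured workstation"
fi
```

An `HTTP 200` response indicates the Gradio UI on container port 7860 is reachable through the Service's port 80.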
Related Links
Volcengine homepage: https://www.volcengine.com
Container Service (VKE): https://www.volcengine.com/product/vke
Image Registry: https://www.volcengine.com/product/cr
Model image: cr-demo-cn-beijing.cr.volces.com/vke-ai/langchain-chatglm:v0.0.1