Fine‑Tune and Deploy Llama 2 on Alibaba Cloud PAI in Minutes
This guide walks you through using Meta's open‑source Llama 2 models on Alibaba Cloud's PAI platform, covering low‑code LoRA fine‑tuning, full‑parameter fine‑tuning with PAI‑DSW, and rapid WebUI deployment via PAI‑EAS, complete with step‑by‑step instructions, code snippets, and resource requirements.
Best Practice 1: Low‑code LoRA fine‑tuning and deployment
Meta released Llama 2 in 7B, 13B, and 70B parameter sizes, free for both research and commercial use. Alibaba Cloud PAI supports these models with full‑parameter fine‑tuning, LoRA fine‑tuning, and inference services.
1. Enter PAI‑Quick‑Start
a. Log in to the PAI console: https://pai.console.aliyun.com/
b. Open the workspace and select “Quick‑Start”.
2. Choose the Llama 2 model
Select the “Generative AI – Large‑Language‑Model” category and pick llama-2-7b-chat-hf (or other sizes).
3. Deploy the model
Use the model detail page to create an online inference service. Ensure at least 64 GiB memory and 24 GiB GPU VRAM.
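A back-of-the-envelope calculation helps explain the 24 GiB VRAM figure. The sketch below assumes the weights are held in 16-bit precision (2 bytes per parameter) and uses nominal parameter counts; both are assumptions for illustration, not PAI's sizing rule. Activations and the KV cache need additional memory on top of the weights.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights, in GiB."""
    return num_params * bytes_per_param / 2**30


# Nominal parameter counts (assumption); the exact counts differ slightly.
for name, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    gib = weight_memory_gib(params)
    print(f"Llama 2 {name}: about {gib:.1f} GiB of fp16 weights")
```

Under these assumptions the 7B weights take roughly 13 GiB, which leaves headroom on a 24 GiB GPU, while the 13B weights alone already come to about 24 GiB.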
4. Call the inference service
After deployment, use the WebUI to send prediction requests or call the API via the “Use via API” link.
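As a rough illustration of an API call, the sketch below builds (but does not send) an HTTP request to the deployed service. The `/run/predict` route, the `{"data": [...]}` payload, and passing the service token in the `Authorization` header are assumptions based on how Gradio 3.x apps and EAS services are commonly called; the function name and example URL are hypothetical. Take the real route and parameters from the “Use via API” page of your own service.

```python
import json
import urllib.request


def build_predict_request(base_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the service; nothing is sent here."""
    # ASSUMPTION: route and payload shape; check the "Use via API" page.
    url = base_url.rstrip("/") + "/run/predict"
    body = json.dumps({"data": [prompt]}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # ASSUMPTION: the service token is passed as the Authorization header.
        "Authorization": token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint and token; replace with the values of your service.
req = build_predict_request("https://example-service.invalid", "your-service-token", "Hello")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req).read()
```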
Best Practice 2: Full‑parameter fine‑tuning with PAI‑DSW
PAI‑DSW provides an interactive modeling environment for full‑parameter fine‑tuning of Llama‑2‑7B‑Chat.
1. Log in and download the model
import os

# Read the region of the current DSW instance from the environment.
dsw_region = os.environ.get("dsw_region")
url_link = {
    "cn-shanghai": "https://atp-modelzoo-sh.oss-cn-shanghai-internal.aliyuncs.com/release/tutorials/llama2/llama2-7b.tar.gz",
    "cn-hangzhou": "https://atp-modelzoo.oss-cn-hangzhou-internal.aliyuncs.com/release/tutorials/llama2/llama2-7b.tar.gz",
    "cn-shenzhen": "https://atp-modelzoo-sz.oss-cn-shenzhen-internal.aliyuncs.com/release/tutorials/llama2/llama2-7b.tar.gz",
    "cn-beijing": "https://atp-modelzoo-bj.oss-cn-beijing-internal.aliyuncs.com/release/tutorials/llama2/llama2-7b.tar.gz"
}
# Fail with a clear message if the region is unset or not listed above.
if dsw_region not in url_link:
    raise ValueError(f"No download link for region: {dsw_region!r}")
path = url_link[dsw_region]
os.environ['LINK_CHAT'] = path
!wget $LINK_CHAT
!tar -zxvf llama2-7b.tar.gz
2. Install the environment
! wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/llama2/ColossalAI.tar.gz
! tar -zxvf ColossalAI.tar.gz
! pip install ColossalAI/.
! pip install ColossalAI/applications/Chat/.
! pip install transformers==4.30.0
! pip install gradio==3.11
3. Prepare data
Use the default instruction‑tuning dataset attached to the model card or upload your own JSON with instruction, output, and id fields.
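[
  {
    "instruction": "Does the following text belong to the World topic? Why do Americans rarely hold military parades?",
    "output": "Yes",
    "id": 0
  },
  {
    "instruction": "Does the following text belong to the World topic? Breaking! The timetable for vehicle reform at public institutions is out!",
    "output": "No",
    "id": 1
  }
]
4. Submit the training job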
Configure the dataset and use the default optimized hyper‑parameters. Monitor progress via the job detail page; successful jobs save the model to OSS.
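Before submitting a job with your own data, it can help to confirm that every record carries the three fields named above (instruction, output, id). The checker below is a minimal sketch; `validate_records` is a hypothetical helper, not part of PAI, and it only tests for the presence of the fields.

```python
import json

REQUIRED_FIELDS = ("instruction", "output", "id")


def validate_records(records):
    """Return a list of problems; an empty list means every record has all fields."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of records"]
    problems = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {index}: not a JSON object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
    return problems


# Made-up sample data: the second record lacks the "output" field.
sample = json.loads(
    '[{"instruction": "Classify this text.", "output": "Yes", "id": 0},'
    ' {"instruction": "No output here", "id": 1}]'
)
print(validate_records(sample))  # ['record 1: missing output']
```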
5. Deploy the fine‑tuned model
Upload the trained model to OSS and follow the same deployment steps as in Best Practice 1.
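# encoding=utf-8
import os
import oss2

AK = 'yourAccessKeyId'
SK = 'yourAccessKeySecret'
endpoint = 'yourEndpoint'
model_dir = 'your model output dir'        # local directory holding the trained model
oss_prefix = 'your OSS destination path'   # target folder inside the bucket

auth = oss2.Auth(AK, SK)
bucket = oss2.Bucket(auth, endpoint, 'examplebucket')
for filename in os.listdir(model_dir):
    current_file_path = os.path.join(model_dir, filename)
    if not os.path.isfile(current_file_path):
        continue  # skip subdirectories
    # Give each file its own object key so uploads do not overwrite one another.
    object_key = oss_prefix.rstrip('/') + '/' + filename
    bucket.put_object_from_file(object_key, current_file_path)
Best Practice 3: Quick WebUI deployment with PAI‑EAS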
PAI‑EAS enables one‑click deployment of Llama 2 as an online service or AI‑Web application.
1. Open the PAI‑EAS model service page
Log in, navigate to Workspace → Model Deployment → Model Online Service (EAS).
2. Configure service parameters
Service name (e.g., chatllm_llama2_13b)
Deployment mode: Image‑based AI‑Web app
Image: chat-llm-webui (latest version)
Run command for 13B:
python webui/webui_server.py --listen --port=8000 --model-path=meta-llama/Llama-2-13b-chat-hf --precision=fp16
Run command for 7B:
python webui/webui_server.py --listen --port=8000 --model-path=meta-llama/Llama-2-7b-chat-hf
GPU instance (e.g., ecs.gn6e-c12g1.3xlarge for 13B, A10/GU30 for 7B)
System disk: 50 GB
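The two run commands differ only in the model path and the optional precision flag, so they can be generated from one template. The helper below is a hypothetical convenience for keeping the commands consistent; it uses only the script path and flags that appear in the commands above and does not call any PAI API.

```python
def webui_run_command(model_path, precision=None, port=8000):
    """Assemble the WebUI run command from the flags shown in this guide."""
    parts = [
        "python",
        "webui/webui_server.py",
        "--listen",
        f"--port={port}",
        f"--model-path={model_path}",
    ]
    # The guide passes --precision only for the 13B model.
    if precision:
        parts.append(f"--precision={precision}")
    return " ".join(parts)


print(webui_run_command("meta-llama/Llama-2-13b-chat-hf", precision="fp16"))
print(webui_run_command("meta-llama/Llama-2-7b-chat-hf"))
```

Paste the printed string into the run command field when configuring the service.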
3. Deploy and launch WebUI
After deployment, click “View Web App” to open the WebUI, then test the model by entering a prompt such as “Please provide a study plan for learning personal finance”.
These three best‑practice workflows demonstrate how to quickly fine‑tune, deploy, and serve Llama 2 models on Alibaba Cloud PAI, covering both low‑code LoRA and full‑parameter training scenarios.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
