Create a Free Multimodal Calorie Counter with GLM‑4V‑Flash in Minutes
This guide shows how to install the ZhipuAI SDK, obtain a free GLM‑4V‑Flash API key, craft prompts for image‑based calorie estimation, and build a Python demo that calculates food calories, BMI, and personalized diet advice using a multimodal large model.
Background
The author discusses how recent free multimodal large models, especially GLM‑4V‑Flash from ZhipuAI, enable rapid development of image‑based health applications that previously required extensive data collection, model fine‑tuning, and backend infrastructure.
Setup
Install the required Python package:
!pip install zhipuai
The leading ! runs the command from a notebook cell; omit it in a terminal. After installation, check that the output lists the installed versions of the dependencies.
Obtain an API key from the BigModel platform (https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) and keep it for the following calls.
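One way to keep the key out of source code is to read it from an environment variable. The following is a minimal sketch, not part of the original article: the variable name `ZHIPUAI_API_KEY` and the helper names `load_api_key` and `make_client` are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var="ZHIPUAI_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name fits your own setup.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your BigModel API key"
        )
    return key


def make_client():
    """Create the SDK client with the key from the environment.

    The SDK is imported lazily so this module still loads on a machine
    where `zhipuai` is not installed yet.
    """
    from zhipuai import ZhipuAI  # installed with `pip install zhipuai`

    return ZhipuAI(api_key=load_api_key())
```

With this in place, `client = make_client()` can stand in for the hard-coded `ZhipuAI(api_key="YOUR_API_KEY")` line.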
Prompt Design
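Write an instruction prompt that positions the model as a health‑management assistant: it should identify the likely ingredients in an uploaded image, estimate their calories, use the user's height and weight to compute BMI, and give dietary recommendations. In the example that follows, this prompt is sent as the text part of the user message rather than as a separate system message.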
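The prompt and the image travel together in one user message whose content is a list of typed parts. The sketch below builds that structure in plain Python. The English prompt is a paraphrase written for this illustration (the article's own prompt is in Chinese), and the helper name `build_messages` and its parameters are hypothetical.

```python
# English paraphrase of the article's prompt; an assumption for illustration.
HEALTH_PROMPT = (
    "You are a health-management assistant. From the uploaded image, list the "
    "likely ingredients and estimate the calories of each. Then use the user's "
    "height and weight to compute BMI and give diet and health advice."
)


def build_messages(image_url, height_cm=None, weight_kg=None, prompt=HEALTH_PROMPT):
    """Return a one-message list pairing the text prompt with an image URL.

    If both height and weight are given, they are appended to the prompt
    so the model can compute BMI in the same turn.
    """
    text = prompt
    if height_cm is not None and weight_kg is not None:
        text += f" The user is {height_cm} cm tall and weighs {weight_kg} kg."
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`.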
API Call Example
Python code to call the model:
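import base64  # not used in this URL-based example
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR_API_KEY")  # replace with your API key

response = client.chat.completions.create(
    model="glm-4v-flash",
    messages=[
        {
            "role": "user",
            "content": [
                # Prompt in English: "You are a health-management assistant. Based on
                # the image the user uploads, list the likely ingredients and estimate
                # their calories, then use the user's height and weight to compute BMI
                # and give matching diet and health advice."
                {"type": "text", "text": "你是一个健康管理助手,可以根据用户上传的图片,给出可能的配料和估算对应的热量值,并结合用户的身高和体重情况,计算出BMI值,给出对应的饮食和健康建议。"},
                {"type": "image_url", "image_url": {"url": "https://files.mdnice.com/user/47494/44d32f14-ab83-4a0b-ae4a-31ab9e1fdbf4.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)

The image URL above points to a picture of a cheeseburger.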
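The example above imports `base64` but only sends a hosted image URL. For a local photo, one option is to send the file's base64 encoding in place of the URL. The sketch below assumes the API accepts a base64 string in the `url` field; check the BigModel documentation before relying on this. The helper names are hypothetical.

```python
import base64


def encode_image_bytes(data):
    """Return the base64 text encoding of raw image bytes."""
    return base64.b64encode(data).decode("ascii")


def image_part_from_file(path):
    """Build an image content part from a local file.

    Assumption: the service accepts a base64 string where the hosted
    example passes an image URL.
    """
    with open(path, "rb") as f:
        encoded = encode_image_bytes(f.read())
    return {"type": "image_url", "image_url": {"url": encoded}}
```

The returned dictionary would replace the `image_url` part in the message content, with the text part left unchanged.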
Sample Response for Calorie Estimation
The model returns an ingredient list with a calorie estimate for each item and a total of roughly 700‑900 kcal, and it notes that all values are approximate.
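Because the reply is free text, an application that wants the total as numbers has to extract it. The sketch below pulls the first "low-high kcal" range out of a string with a regular expression. The reply format is not guaranteed, so the pattern and the helper name `extract_kcal_range` are assumptions, and the function returns `None` when nothing matches.

```python
import re

# Matches ranges such as "700-900 kcal"; also accepts a non-breaking hyphen,
# an en dash, or a tilde as the separator, and two common Chinese unit words.
_KCAL_RANGE = re.compile(
    r"(\d+(?:\.\d+)?)\s*[-\u2011\u2013~]\s*(\d+(?:\.\d+)?)\s*(?:kcal|千卡|大卡)",
    re.IGNORECASE,
)


def extract_kcal_range(text):
    """Return (low, high) from the first calorie range in text, or None."""
    match = _KCAL_RANGE.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

A caller should treat `None` as "no usable total" and fall back to showing the raw reply.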
Extended Interaction – BMI Calculation
After receiving the calorie breakdown, the user can ask for BMI and diet advice:
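response = client.chat.completions.create(
    model="glm-4v-flash",
    messages=[
        # ...previous messages (elided in the original)...
        # Follow-up in English: "I am 175 cm tall and weigh 85 kg. Please calculate
        # my BMI and tell me whether the food in the picture is suitable while I am
        # trying to lose fat."
        {"role": "user", "content": [{"type": "text", "text": "我现在身高175cm,体重85kg,请计算我的BMI指数,并告诉我减脂期间适合吃图中的食物吗?"}]},
    ],
)

The model replies that the BMI is 27.78 (overweight). Direct calculation, 85 / 1.75², gives about 27.76, so the model's figure is slightly off but the category is unchanged. It suggests limiting the burger and offers lower‑calorie alternatives such as chicken‑breast salad, a whole‑grain sandwich, lean beef or turkey, a tofu burger, and vegetable soup.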
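Model arithmetic can drift, so it is worth computing BMI locally and using the model only for the advice. The sketch below applies the standard formula, weight in kilograms divided by height in metres squared, together with the WHO adult cut-offs. The function names are chosen for illustration.

```python
def bmi(weight_kg, height_cm):
    """Return body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Classify a BMI value using the WHO adult cut-offs."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


value = bmi(85, 175)
print(round(value, 2), bmi_category(value))  # prints: 27.76 overweight
```

The locally computed value can also be written into the follow-up prompt so the model does not need to do the calculation itself.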
Other Use Cases
Similar prompts can translate foreign restaurant menus, recommend dishes, or identify plants and animals. The original article includes example code for a travel‑assistant prompt, for which the model returns a translated menu with recommended items.
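Switching use cases mostly means switching the prompt while the request shape stays the same. The sketch below keeps one prompt per task in a dictionary and builds the keyword arguments for the chat call. The prompt wording, the task names, and the helper `build_request` are hypothetical and do not come from the original article.

```python
# Hypothetical prompts for illustration; tune the wording for your own use case.
TASK_PROMPTS = {
    "menu": (
        "You are a travel assistant. Translate the menu in the image into "
        "English and recommend a few dishes."
    ),
    "identify": "Identify the plant or animal in the image and describe it briefly.",
}


def build_request(task, image_url, model="glm-4v-flash"):
    """Return keyword arguments for a chat completion call for the given task."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"Unknown task: {task!r}")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": TASK_PROMPTS[task]},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

Given a client created as in the API call example, the request would be sent with `client.chat.completions.create(**build_request("menu", image_url))`.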
Conclusion
GLM‑4V‑Flash’s free, high‑quality multimodal capabilities allow developers and product managers to prototype complex vision‑language applications with only a few lines of prompt‑driven code, dramatically lowering cost and time‑to‑market.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Baobao Algorithm Notes
Author of the BaiMian large model, offering technology and industry insights.
