How to Scrape Meituan Food Data with Python: Step-by-Step Guide
This tutorial explains how to analyze Meituan food page URLs, use browser developer tools to locate AJAX JSON responses, construct Python requests with proper headers, extract restaurant information via regular expressions, and save the results to a local file.
1. Analyze Meituan food page URL parameters
1) Search conditions
Site: Meituan food; city: Beijing; keyword: hot pot (火锅).
2) Crawled URL
https://bj.meituan.com/s/%E7%81%AB%E9%94%85/
3) Explanation
The browser percent-encodes Chinese characters in the URL, so the two characters of 火锅 (“hot pot”) become %E7%81%AB%E9%94%85. Reading the URL, the subdomain “bj” stands for Beijing and the path segment after /s/ is the search keyword.
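The encoding can be reproduced with Python's standard library. The sketch below is only an illustration: search_page_url is a hypothetical helper, and the idea that other cities follow the same subdomain pattern as “bj” is an assumption that the source confirms only for Beijing.

```python
from urllib.parse import quote, unquote


def search_page_url(city_code, keyword):
    """Build a search page URL (hypothetical helper; pattern confirmed only for 'bj')."""
    return f"https://{city_code}.meituan.com/s/{quote(keyword)}/"


# Percent-encode the UTF-8 bytes of the keyword, as the browser does.
print(quote("火锅"))                      # %E7%81%AB%E9%94%85
# Decode the path segment back into the original keyword.
print(unquote("%E7%81%AB%E9%94%85"))      # 火锅
print(search_page_url("bj", "火锅"))      # https://bj.meituan.com/s/%E7%81%AB%E9%94%85/
```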
2. Analyze page data source (F12 developer tools)
Open the developer tools (F12) and refresh the page. The address-bar URL does not change when you move to the second page of results, which indicates that the data is loaded via AJAX.
In the Network tab, filter by XHR and find the request whose response contains the restaurant list.
The data is exchanged in JSON format. The request URLs for page 2 and page 3 are:
Page 2: https://apimobile.meituan.com/group/v4/poi/pcsearch/1?uuid=xxx&userid=-1&limit=32&offset=32&cateId=-1&q=%E7%81%AB%E9%94%85
Page 3: https://apimobile.meituan.com/group/v4/poi/pcsearch/1?uuid=xxx&userid=-1&limit=32&offset=64&cateId=-1&q=%E7%81%AB%E9%94%85
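Comparing the two URLs shows that offset increases by 32 per page, so page n uses offset = (n - 1) × 32. limit is the number of items returned per request, and q is the percent-encoded search keyword. The 1 at the end of the path, just before the query string, is the city ID for Beijing.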
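The parameter pattern above can be sketched as a small URL builder. build_api_url is a hypothetical helper name; the uuid value xxx is the same placeholder used in the URLs above and must be replaced with a real value before any request is sent.

```python
from urllib.parse import urlencode

API_BASE = "https://apimobile.meituan.com/group/v4/poi/pcsearch/"


def build_api_url(page, keyword, city_id=1, uuid="xxx", limit=32):
    """Return the API URL for a 1-based results page (hypothetical helper)."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    params = {
        "uuid": uuid,                  # placeholder; copy your own from the browser
        "userid": -1,
        "limit": limit,                # items per request
        "offset": (page - 1) * limit,  # page 1 -> 0, page 2 -> 32, page 3 -> 64
        "cateId": -1,
        "q": keyword,                  # urlencode percent-encodes the keyword
    }
    return f"{API_BASE}{city_id}?{urlencode(params)}"


print(build_api_url(2, "火锅"))
```

Calling build_api_url(2, "火锅") reproduces the page 2 URL listed above, and build_api_url(3, "火锅") reproduces the page 3 URL.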
3. Construct request to fetch Meituan food data
Loop through each page and collect data. The full Python script is:
import json
import re

import requests


def start():
    for w in range(0, 1600, 32):
        # Page index = w / 32; capped at 50 pages (at most 1600 items) to avoid excessive requests.
        try:
            # Replace the placeholder xxx with your own UUID.
            url = 'https://apimobile.meituan.com/group/v4/poi/pcsearch/1?uuid=xxx&userid=-1&limit=32&offset=' + str(w) + '&cateId=-1&q=%E7%81%AB%E9%94%85'
            # Headers can be copied from the browser's network panel.
            # Note: requests can only decode 'br' bodies if a brotli package is installed;
            # drop 'br' from Accept-Encoding if the response text looks garbled.
            headers = {
                'Accept': '*/*',
                'Accept-Encoding': 'gzip, deflate, br',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Connection': 'keep-alive',
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3741.400 QQBrowser/10.5.3863.400',
                'Host': 'apimobile.meituan.com',
                'Origin': 'https://bj.meituan.com',
                'Referer': 'https://bj.meituan.com/s/%E7%81%AB%E9%94%85/'
            }
            # The timeout keeps a stalled connection from hanging the whole run.
            response = requests.get(url, headers=headers, timeout=10)
            # Use regex because the JSON structure does not expose the title field directly.
            titles = re.findall('"","title":"(.*?)","address":"', response.text)
            addresses = re.findall('"address":"(.*?)",', response.text)
            avgprices = re.findall('"avgprice":(.*?),', response.text)
            avgscores = re.findall('"avgscore":(.*?),', response.text)
            comments = re.findall('"comments":(.*?),', response.text)
            # The five counts should be equal; a mismatch means a pattern matched extra text.
            print(len(titles), len(addresses), len(avgprices), len(avgscores), len(comments))
            # zip() pairs the fields by position and stops at the shortest list.
            for title, address, avgprice, avgscore, comment in zip(titles, addresses, avgprices, avgscores, comments):
                file_data(title, address, avgprice, avgscore, comment)
        except Exception as exc:
            # Report the failed page and continue with the next one.
            print('offset', w, 'failed:', exc)
            continue


def file_data(title, address, avgprice, avgscore, comment):
    data = {
        '店铺名称': title,         # shop name
        '店铺地址': address,       # shop address
        '平均消费价格': avgprice,  # average spend per person
        '店铺评分': avgscore,      # shop rating
        '评价人数': comment        # number of reviews
    }
    # Append one JSON object per line to 美团美食.txt ("Meituan food").
    with open('美团美食.txt', 'a', encoding='utf-8') as fb:
        fb.write(json.dumps(data, ensure_ascii=False) + '\n')


if __name__ == '__main__':
    start()

Running the script prints, for each page, the number of matches found for each of the five fields.
Local file content: each line of 美团美食.txt is one JSON object with the keys 店铺名称, 店铺地址, 平均消费价格, 店铺评分, and 评价人数.
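The script extracts fields with regular expressions, which can drift out of alignment if one pattern matches more often than another. Since the response is JSON, parsing it keeps each shop's fields together. The following is a minimal sketch under stated assumptions: the field names come from the regular expressions in the script, but the data → searchResult nesting and the SAMPLE payload are assumptions for illustration, so check the real structure in the developer tools before relying on it.

```python
import json

# Hypothetical sample payload; the "data" -> "searchResult" nesting is an assumption.
SAMPLE = json.dumps({
    "data": {
        "searchResult": [
            {"title": "Example Hot Pot", "address": "1 Example Road",
             "avgprice": 88, "avgscore": 4.5, "comments": 1200}
        ]
    }
})


def parse_shops(payload_text):
    """Return one dict per shop, keeping each record's fields together."""
    payload = json.loads(payload_text)
    results = (payload.get("data") or {}).get("searchResult") or []
    fields = ("title", "address", "avgprice", "avgscore", "comments")
    # Missing fields become None instead of shifting later values out of place.
    return [{name: item.get(name) for name in fields} for item in results]


for shop in parse_shops(SAMPLE):
    print(shop["title"], shop["address"], shop["avgprice"])
```

In the script, response.text could be passed to such a function in place of the five re.findall calls, provided the assumed nesting matches the actual response.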
4. Summary
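To scrape a different keyword or city, change the corresponding URL parameters: q for the keyword and the city ID at the end of the API path. Adjust the request headers to match, in particular Origin and Referer, which contain the city subdomain and the keyword. Working through these steps is good practice for extracting data from AJAX-based pages.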