Python Web Scraping Tutorial: Downloading YY Live Videos via API
This tutorial demonstrates how to use Python's requests library to fetch video data from the YY Live API, parse the JSON response, extract video URLs, and programmatically download and save the videos, including code snippets for single and batch downloads.
This article walks through a practical Python web-scraping example that targets the YY Live platform. It explains why the YY Live API is used, shows the API endpoint https://api-tinyvideo-web.yy.com/home/tinyvideosv2, and describes how to send HTTP requests to it with the requests library.
Simulating the request
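<code>import requests

url = 'https://api-tinyvideo-web.yy.com/home/tinyvideosv2'
headers = {
    # Browser-like User-Agent so the request resembles a normal page visit
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
}
response = requests.get(url=url, headers=headers)
response.raise_for_status()  # stop early on an HTTP error
data = response.json()
data_list = data['data']['data']  # list of video records
</code>
The JSON response contains a list of video records under data['data']['data']. Each record holds a resurl field with the actual video link and a yyNum identifier that can be used to name the file.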
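To make the record structure concrete, here is a minimal offline sketch of pulling file names and links out of a response. The sample payload, its example.com URLs, and the helper name extract_videos are hypothetical; only the field names resurl and yyNum and the data['data']['data'] nesting (read by the looped script in this article) come from the tutorial.

```python
# Hypothetical sample that mirrors the data['data']['data'] nesting.
sample_response = {
    "data": {
        "data": [
            {"yyNum": 1001, "resurl": "https://example.com/videos/1001.mp4"},
            {"yyNum": 1002, "resurl": "https://example.com/videos/1002.mp4"},
            {"yyNum": 1003},  # record without a video link
        ]
    }
}


def extract_videos(payload):
    """Return (file_name, video_url) pairs for records that have both fields."""
    records = payload.get("data", {}).get("data", [])
    videos = []
    for record in records:
        yy_num = record.get("yyNum")
        video_url = record.get("resurl")
        if yy_num is None or not video_url:
            continue  # skip incomplete records instead of raising KeyError
        videos.append((f"{yy_num}.mp4", video_url))
    return videos


videos = extract_videos(sample_response)
for file_name, video_url in videos:
    print(file_name, video_url)
```

Using .get() rather than direct indexing is a design choice here: a record that lacks a field is skipped quietly, whereas d['resurl'] would stop the whole run with a KeyError.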
Parsing the data
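<code>import os

os.makedirs('video', exist_ok=True)  # create the output directory if needed
# data_list is the list of records taken from data['data']['data']
for d in data_list:
    video_title = str(d['yyNum']) + '.mp4'
    video_url = d['resurl']
    # Download the raw bytes of the video
    video_content = requests.get(url=video_url, headers=headers).content
    with open(os.path.join('video', video_title), mode='wb') as f:
        f.write(video_content)
    print('Saved:', video_title)
</code>
The script saves each video to a local video directory and prints a confirmation message.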
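Reading .content loads the whole video into memory before it is written. For larger files, one alternative is to stream the response and write it chunk by chunk. The sketch below is not part of the original script: the helper names save_stream and download_video, the 256 KiB chunk size, and the 30-second timeout are assumptions chosen for illustration.

```python
import os


def save_stream(chunks, yy_num, out_dir="video"):
    """Write an iterable of byte chunks to <out_dir>/<yy_num>.mp4 and return the path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{yy_num}.mp4")
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # ignore empty chunks
                f.write(chunk)
    return path


def download_video(video_url, yy_num, headers=None, out_dir="video"):
    """Stream one video with requests and save it chunk by chunk."""
    import requests  # third-party; imported here so save_stream has no dependencies

    with requests.get(video_url, headers=headers, stream=True, timeout=30) as response:
        response.raise_for_status()  # do not save an error page as a video
        chunks = response.iter_content(chunk_size=256 * 1024)
        return save_stream(chunks, yy_num, out_dir)
```

Inside the loop, a call such as download_video(d['resurl'], d['yyNum'], headers=headers) would take the place of the requests.get(...).content and open(...) lines.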
Looped downloading
Because the API returns different data on each call, the script can be wrapped in a loop to fetch multiple pages and download more videos.
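Repeated calls are not guaranteed to return disjoint batches (this is an assumption; the tutorial only says the data differs between calls), so a loop may want to skip records it has already seen. The sketch below de-duplicates by resurl and takes the fetching step as a callable so it can be tried offline. The name collect_unique and the fake batches are hypothetical. The article's own plain loop follows after it.

```python
def collect_unique(fetch_batch, calls):
    """Call fetch_batch() `calls` times and keep the first record seen per resurl."""
    seen = set()
    unique = []
    for _ in range(calls):
        for record in fetch_batch():
            key = record.get("resurl")
            if not key or key in seen:
                continue  # skip records without a link and repeats
            seen.add(key)
            unique.append(record)
    return unique


# Stand-in for the API: hypothetical batches that share one record.
fake_batches = iter([
    [{"yyNum": 1, "resurl": "a"}, {"yyNum": 2, "resurl": "b"}],
    [{"yyNum": 2, "resurl": "b"}, {"yyNum": 3, "resurl": "c"}],
])
records = collect_unique(lambda: next(fake_batches), calls=2)
print([r["yyNum"] for r in records])  # [1, 2, 3]
```

With the real API, fetch_batch could be a small function that performs the requests.get call and returns data['data']['data'].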
<code>import requests

url = 'https://api-tinyvideo-web.yy.com/home/tinyvideosv2'
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
}
page = 10  # example value; the loop below makes page + 1 requests
for _ in range(page + 1):
    response = requests.get(url=url, headers=headers)
    data = response.json()
    data_list = data['data']['data']
    print(data_list)
</code>
Complete script
<code>import os

import requests


def fire(page):
    """Request the API page + 1 times and save every returned video."""
    url = 'https://api-tinyvideo-web.yy.com/home/tinyvideosv2'
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
    }
    os.makedirs('video', exist_ok=True)  # make sure the output directory exists
    for _ in range(page + 1):
        response = requests.get(url=url, headers=headers)
        data = response.json()
        data_list = data['data']['data']
        print(data_list)
        for d in data_list:
            video_title = str(d['yyNum']) + '.mp4'
            video_url = d['resurl']
            video_content = requests.get(url=video_url, headers=headers).content
            with open(os.path.join('video', video_title), mode='wb') as f:
                f.write(video_content)
            print('Saved:', video_title)


if __name__ == '__main__':
    fire(10)
</code>
The article concludes by noting that this approach relies on the public API for quick results, and that future posts will explore page-scraping techniques to collect videos from different hosts and categories.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.