Backend Development 14 min read

How to Build a Bilibili Video Downloader with Python: From Scraping to GUI

This article walks through building a Python tool that extracts Bilibili video and audio URLs via developer tools, downloads the media files, merges them using moviepy, and wraps the process in a simple PySimpleGUI interface, complete with full source code.

Python Crawling & Data Mining

Aug 2, 2022

How to Build a Bilibili Video Downloader with Python: From Scraping to GUI

1. Principle Overview

The principle is simple: obtain the video resource's source URL, crawl the binary content, and write it to the local file system.

2. Webpage Analysis

Open the Bilibili video page, press F12 to open Developer Tools, go to the Network tab, select "All", sort by size, and locate the request that likely contains the video source URL.

Example video URL: https://www.bilibili.com/video/BV1BU4y1H7E3

Copy the url part, return to the Elements panel, use Ctrl+F to search for it, and find the JSON node that holds the real video file address.

Parsing the JSON reveals the video and audio URLs.

When opening the extracted URL directly, a 403 error appears because the request lacks proper headers.

3. Video Crawling

In the page source we use a regular expression to extract the window.__playinfo__ JSON.

import requests
import re
import json
url = 'https://www.bilibili.com/video/BV1BU4y1H7E3'
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36",
    "referer": "https://www.bilibili.com"
}
resp = requests.get(url, headers=headers)
playinfo = re.findall(r'<script>window.__playinfo__=(.*?)</script>', resp.text)[0]
playinfo_data = json.loads(playinfo)

The JSON contains video and audio sections; we extract their base_url fields.

# video and audio URLs
video_url = json_data['data']['dash']['video'][0]['base_url']
audio_url = json_data['data']['dash']['audio'][0]['base_url']

Multiple base_url entries correspond to different qualities. The example selects the first (4K) entry.

4. Save to Local

With the URLs and a title extracted from the page, we download the files using requests and write them in chunks.

def down_file(file_url, file_type):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36",
        "referer": "https://www.bilibili.com"
    }
    resp = requests.get(url=file_url, headers=headers)
    print(resp.status_code)
    chunk_size = 1024
    file_size = int(resp.headers['content-length'])
    file_size_mb = file_size / 1024 / 1024
    print(f'File size: {file_size_mb:.2f} MB')
    start = time.time()
    with open(title + '.' + file_type, 'wb') as f:
        done = 0
        for chunk in resp.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            done += len(chunk)
            print(f'\rProgress: {done/file_size*100:.2f}%', end='')
    end = time.time()
    print(f'
Time: {end-start:.2f}s, Speed: {file_size_mb/(end-start):.2f} MB/s')

Running the function yields output such as:

# Video download
>>> down_file(video_url, 'mp4')
200
File name: 【咒术回战】第20集五条悟帅的有些过分了
File size: 42.10 MB
Progress: 100.00%
Time: 5.72 s, Speed: 7.36 MB/s
# Audio download
>>> down_file(audio_url, 'mp3')
200
File name: 【咒术回战】第20集五条悟帅的有些过分了
File size: 5.13 MB
Progress: 100.00%
Time: 0.80 s, Speed: 6.42 MB/s

5. GUI Tool Creation

The download logic is wrapped in a small GUI built with PySimpleGUI.

import PySimpleGUI as sg
sg.theme('SystemDefaultForReal')
layout = [
    [sg.Text('Enter Bilibili video URL:'), sg.InputText(key='url')],
    [sg.Button('Start Download'), sg.Button('Exit')]
]
window = sg.Window('Bilibili Video Downloader', layout)
while True:
    event, values = window.read()
    if event in (sg.WIN_CLOSED, 'Exit'):
        break
    if event == 'Start Download':
        url = values['url']
        title, video_url, audio_url = get_file_info(url)
        down_file(title, video_url, 'mp4')
        down_file(title, audio_url, 'mp3')
        merge(title)
window.close()

6. Full Code

The complete script combines the functions above, including the merge function that uses moviepy to combine video and audio into a single file.

import requests, re, json, time
from moviepy.editor import *
import PySimpleGUI as sg

def get_file_info(url):
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36", "referer": "https://www.bilibili.com"}
    resp = requests.get(url, headers=headers)
    playinfo = re.findall(r'<script>window.__playinfo__=(.*?)</script>', resp.text)[0]
    data = json.loads(playinfo)
    title = re.findall(r'<h1 title="(.*?)" class="video-title">', resp.text)[0]
    video_url = data['data']['dash']['video'][0]['base_url']
    audio_url = data['data']['dash']['audio'][0]['base_url']
    return title, video_url, audio_url

def down_file(title, file_url, file_type):
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36", "referer": "https://www.bilibili.com"}
    resp = requests.get(url=file_url, headers=headers)
    chunk_size = 1024
    file_size = int(resp.headers['content-length'])
    file_size_mb = file_size / 1024 / 1024
    start = time.time()
    with open(title + '.' + file_type, 'wb') as f:
        done = 0
        for chunk in resp.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            done += len(chunk)
            print(f'\rProgress: {done/file_size*100:.2f}%', end='')
    end = time.time()
    print(f'
Time: {end-start:.2f}s, Speed: {file_size_mb/(end-start):.2f} MB/s')

def merge(title):
    video = VideoFileClip(title + '.mp4')
    audio = AudioFileClip(title + '.mp3')
    video = video.set_audio(audio)
    video.write_videofile(f"{title}(with audio).mp4")

sg.theme('SystemDefaultForReal')
layout = [[sg.Text('Enter Bilibili video URL:'), sg.InputText(key='url')],
          [sg.Button('Start Download'), sg.Button('Exit')]]
window = sg.Window('Bilibili Video Downloader', layout)
while True:
    event, values = window.read()
    if event in (sg.WIN_CLOSED, 'Exit'):
        break
    if event == 'Start Download':
        url = values['url']
        title, video_url, audio_url = get_file_info(url)
        down_file(title, video_url, 'mp4')
        down_file(title, audio_url, 'mp3')
        merge(title)
window.close()

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

GUI Bilibili video download pysimplegui moviepy

Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.