How to Scrape Bilibili Video Playlists with Python Selenium – Full Code Guide

This article walks you through using Python's Selenium library to programmatically retrieve Bilibili video series playlists, explains the required XPath selectors, provides a complete, runnable code example, and addresses common issues such as ChromeDriver compatibility.

Python Crawling & Data Mining

Sep 23, 2021

Introduction

Recently a member of a Python community shared a script for extracting Bilibili video collections, and this article consolidates that code for learning purposes.

1. Background

Many users think of Bilibili as a video platform and want to use web-scraping techniques to obtain videos, but Bilibili does not provide straightforward download links. The You-Get library has previously been used for this purpose.

In practice, many Bilibili creators publish dozens or hundreds of videos as a series (e.g., programming tutorials). These series can be identified by their titles, but automating the extraction is not trivial. This article demonstrates how to retrieve such playlists using Selenium and XPath.

2. Implementation Details

The script relies on Selenium to simulate a browser, locate playlist items via XPath, and calculate total duration. Below is the complete code you can run and adapt.

# coding: utf-8
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

class Item:
    """One playlist entry: page number, part title, and duration text."""

    def __init__(self, page_num, part, duration):
        self.page_num = page_num
        self.part = part
        self.duration = duration

    def get_second(self):
        """Convert a duration such as "12:05" or "1:02:03" into seconds."""
        total = 0
        # Each field is worth 60 times the field that follows it
        for field in self.duration.split(":"):
            total = total * 60 + int(field)
        return total

def get_bilibili_page_items(url):
    """Return the playlist entries of a Bilibili video page as Item objects."""
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')  # run without UI
    options.add_experimental_option('excludeSwitches', ['enable-automation'])
    browser = webdriver.Chrome(options=options)
    try:
        print("Opening page...")
        browser.get(url)
        print("Waiting for page to load...")
        wait = WebDriverWait(browser, 10)
        wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@class="list-box"]/li/a')))
        print("Fetching data...")
        # The find_element(s)_by_* helpers were removed in Selenium 4.3; use By locators
        elements = browser.find_elements(By.XPATH, '//*[@class="list-box"]/li')
        item_list = []
        total_seconds = 0
        for elem in elements:
            link = elem.find_element(By.TAG_NAME, 'a')
            # The link text holds three lines: page number, part title, duration
            parts = link.text.split('\n')
            print(' '.join(parts))
            item = Item(parts[0], parts[1], parts[2])
            total_seconds += item.get_second()
            item_list.append(item)
        print("Total items:", len(item_list))
        print("Total duration (minutes):", round(total_seconds / 60, 2))
        print("Total duration (hours):", round(total_seconds / 3600.0, 2))
    finally:
        # quit() ends the driver process even if an error occurred above
        browser.quit()
    return item_list

# Example usage: replace the URL with the desired playlist page
get_bilibili_page_items("https://www.bilibili.com/video/BV1Eb411u7Fw")
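
The duration arithmetic can be checked without opening a browser. The following is a minimal standalone sketch of the same conversion; to_seconds and format_hms are hypothetical helper names that do not appear in the script above, and the sample durations are made up for illustration.

```python
def to_seconds(duration: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" text into a number of seconds."""
    total = 0
    for field in duration.split(":"):
        # Shift the running total one base-60 place, then add the next field
        total = total * 60 + int(field)
    return total


def format_hms(seconds: int) -> str:
    """Render a number of seconds as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:d}:{minutes:02d}:{secs:02d}"


# Made-up sample durations in the two formats the script handles
durations = ["05:30", "12:05", "1:02:03"]
total = sum(to_seconds(d) for d in durations)
print(total, format_hms(total))
```

Accumulating with total * 60 gives the same result as the pow(60, ...) weighting and handles both two-field and three-field durations.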

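The XPath expression //*[@class="list-box"]/li selects each list item directly under the element whose class attribute is exactly list-box, which is where the page rendered the playlist entries when this script was written. To scrape a different series, change the URL in the final function call.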
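
To see what the selector matches without launching Chrome, the sketch below applies an equivalent path to a small, hypothetical HTML fragment using the standard library's xml.etree.ElementTree. The markup is an assumption made up for illustration and is much simpler than the real Bilibili page; ElementTree also supports only a subset of XPath, so the path is written relative to the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (not copied from Bilibili)
SAMPLE = """
<div>
  <ul class="list-box">
    <li><a href="?p=1"><span>P1</span><span>Introduction</span><span>05:30</span></a></li>
    <li><a href="?p=2"><span>P2</span><span>Setup</span><span>12:05</span></a></li>
  </ul>
  <ul class="other"><li><a href="#">ignored</a></li></ul>
</div>
"""

root = ET.fromstring(SAMPLE)

# Browser-side //*[@class="list-box"]/li becomes a relative path here
rows = root.findall(".//*[@class='list-box']/li")

for li in rows:
    # The three child spans of the link stand in for the three text lines
    page_num, part, duration = [span.text for span in li.find("a")]
    print(page_num, part, duration)
```

Because @class="list-box" compares the whole attribute value, an element with class="list-box active" would not match; if the page markup changes, the selector needs to be updated accordingly.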

3. Common Issues

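During execution you may encounter a ChromeDriver version mismatch error, which means the installed driver does not support the installed version of Chrome. Resolve it by downloading the driver that matches your Chrome version from the official site: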

https://chromedriver.storage.googleapis.com/index.html

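After updating the driver, the script should run. Note that this index only lists drivers up to ChromeDriver 114; newer releases are published through Chrome for Testing, and Selenium 4.6 and later can locate or download a matching driver automatically through Selenium Manager.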
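
A quick way to confirm a mismatch is to compare the major version numbers that Chrome and ChromeDriver report. The sketch below is one possible check, not part of the original script: major_version and is_compatible are hypothetical helper names, and the sample strings only imitate the style of output printed by the browser's and the driver's --version option.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version number from a version string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


def is_compatible(chrome_output: str, driver_output: str) -> bool:
    """Assume compatibility when both major version numbers are equal."""
    return major_version(chrome_output) == major_version(driver_output)


# Illustrative sample strings, not captured from a real machine
print(is_compatible("Google Chrome 93.0.4577.82", "ChromeDriver 93.0.4577.63 (hash)"))
print(is_compatible("Google Chrome 93.0.4577.82", "ChromeDriver 92.0.4515.107 (hash)"))
```

If the two major versions differ, download the driver release whose major version equals that of the installed browser.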

4. Conclusion

This guide shows how to obtain Bilibili video series data using Python Selenium and XPath, calculates total video duration, and provides troubleshooting tips for driver compatibility. Feel free to modify the script for other playlists and explore further automation possibilities.

