Fetching Movie Information from IMDb Using Python Requests and BeautifulSoup
This tutorial demonstrates how to use Python's requests and BeautifulSoup libraries to query IMDb for movie details such as title, director, cast, rating, and summary, illustrated with a sample lookup for the film Inception.
This tutorial introduces a Python project that retrieves movie information from IMDb using the requests and BeautifulSoup libraries.
First, install the required packages:
pip install requests
pip install beautifulsoup4
Then import the modules and define the get_movie_info function:
import requests
from urllib.parse import quote_plus
from bs4 import BeautifulSoup

def get_movie_info(movie_name):
    """
    Look up a movie by name and return its details.
    """
    # URL-encode the name so spaces and special characters survive in the query string
    url = f"https://www.imdb.com/find?q={quote_plus(movie_name)}&s=tt&ttype=ft&ref_=fn_ft"
    page = requests.get(url).text
    soup = BeautifulSoup(page, 'html.parser')
    # Take the link from the first search result
    movie_link = soup.find("td", {"class": "result_text"}).find("a")["href"]
    # Fetch the movie's detail page
    movie_page = requests.get(f"https://www.imdb.com{movie_link}").text
    # Parse the detail page source
    movie_soup = BeautifulSoup(movie_page, 'html.parser')
    # Title
    movie_title = movie_soup.find("div", {"class": "title_wrapper"}).h1.text.strip()
    # Director (first link in the first credit block)
    movie_director = movie_soup.find("div", {"class": "credit_summary_item"}).find_all("a")[0].text
    # Cast
    movie_cast = [cast.text for cast in movie_soup.find("div", {"class": "cast_list"}).find_all("span", {"class": "itemprop"})]
    # Rating
    movie_rating = movie_soup.find("span", {"itemprop": "ratingValue"}).text
    # Summary
    movie_summary = movie_soup.find("div", {"class": "summary_text"}).text.strip()
    return {'title': movie_title, 'director': movie_director, 'cast': movie_cast, 'rating': movie_rating, 'summary': movie_summary}

print(get_movie_info("Inception"))
The script searches IMDb for the given movie, extracts the first result's link, fetches the movie page, and parses the title, director, cast, rating, and summary, returning them in a dictionary. Each lookup depends on a specific tag and class name: if a selector does not match the page, find returns None and the next attribute access raises an AttributeError.
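The parsing step can also be written so that a missing element yields None instead of an exception. The following is a minimal sketch, not the tutorial's own code: the helper names build_search_url, text_or_none, and parse_movie_page are hypothetical, the selectors are copied from the script above and are assumed (not verified) to match the live page, and SAMPLE_HTML is a hand-written fragment used only so the example runs offline.

```python
from urllib.parse import urlencode

from bs4 import BeautifulSoup


def build_search_url(movie_name):
    """Build the search URL with the movie name URL-encoded (hypothetical helper)."""
    query = urlencode({"q": movie_name, "s": "tt", "ttype": "ft", "ref_": "fn_ft"})
    return f"https://www.imdb.com/find?{query}"


def text_or_none(node):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return node.get_text(strip=True) if node is not None else None


def parse_movie_page(html):
    """Extract the same five fields as the tutorial, tolerating missing elements."""
    soup = BeautifulSoup(html, "html.parser")
    # Selectors below are assumptions carried over from the tutorial script.
    title_wrapper = soup.find("div", {"class": "title_wrapper"})
    credit = soup.find("div", {"class": "credit_summary_item"})
    cast_list = soup.find("div", {"class": "cast_list"})

    cast = []
    if cast_list is not None:
        spans = cast_list.find_all("span", {"class": "itemprop"})
        cast = [span.get_text(strip=True) for span in spans]

    return {
        "title": text_or_none(title_wrapper.h1 if title_wrapper is not None else None),
        "director": text_or_none(credit.find("a") if credit is not None else None),
        "cast": cast,
        "rating": text_or_none(soup.find("span", {"itemprop": "ratingValue"})),
        "summary": text_or_none(soup.find("div", {"class": "summary_text"})),
    }


# Hypothetical fragment for illustration; it deliberately omits the cast and summary.
SAMPLE_HTML = """
<div class="title_wrapper"><h1>Inception (2010)</h1></div>
<div class="credit_summary_item"><a href="/name/x">Christopher Nolan</a></div>
<span itemprop="ratingValue">8.8</span>
"""

info = parse_movie_page(SAMPLE_HTML)
print(build_search_url("The Dark Knight"))
print(info)
```

Keeping the parser separate from the network call means it can be exercised against a saved copy of a page, which makes it easier to see which selector stopped matching when a field comes back as None.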
Test Development Learning Exchange