Create a Python iQIYI Movie Scraper with GUI – Full Step‑by‑Step Guide
Learn how to build a Python web scraper that extracts iQIYI movie titles, actors, and scores, parses the data with regex or BeautifulSoup, displays the results in a Tkinter GUI with a combobox, and saves the information to a file. Each step is explained with code snippets.
Project Goal
When the user picks a ranking from the dropdown, the program fetches the matching iQIYI movie list and prints each movie's name, actors, and rating into a text box, so the user can browse the results and choose something to watch.
Dependencies
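Development tools: Sublime Text 3 and 360 Browser.
The snippets rely on the standard-library modules re, json, and tkinter (imported as tk, with ttk for the combobox).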
Implementation Steps
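First, open the iQIYI movie list page and inspect it with the browser's developer tools (F12). The movie list is wrapped in a <ul> element, and each movie is an <li> item inside it.
Two parsing approaches are possible: a regular expression or a more structured method such as BeautifulSoup. The parse_page method in the next section uses the regular expression.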
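The more structured approach is not reproduced in this article. As a rough sketch of what it could look like, the example below uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra installs. The class name MovieListParser, the sample markup, and the decision to read the title attributes are all assumptions made for illustration, not the author's code.

```python
from html.parser import HTMLParser

# Hypothetical fragment; the real iQIYI markup is more complex and may differ.
SAMPLE_HTML = """
<ul>
  <li class="qy-mod-li">
    <span class="text-score">8.6</span>
    <a class="link-txt" title="Movie A">Movie A</a>
    <p class="sub" title="主演:Actor One">主演:Actor One</p>
  </li>
</ul>
"""


class MovieListParser(HTMLParser):
    """Collects one dict per <li class="qy-mod-li"> item."""

    def __init__(self):
        super().__init__()
        self.movies = []       # finished items
        self._current = None   # item being filled, or None outside an <li>
        self._field = None     # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        if tag == 'li' and 'qy-mod-li' in classes:
            self._current = {}
        elif self._current is not None:
            if 'text-score' in classes:
                self._field = 'Movie_score'
            elif tag == 'a' and attrs.get('title'):
                self._current['Movie_Name'] = attrs['title']
            elif tag == 'p' and attrs.get('title'):
                # Skip the first 3 characters, as parse_page does with [3:].
                self._current['Movie_actor'] = attrs['title'].strip()[3:]

    def handle_data(self, data):
        # Only the text right after the score tag is stored.
        if self._current is not None and self._field:
            self._current[self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        if tag == 'li' and self._current is not None:
            self.movies.append(self._current)
            self._current = None


parser = MovieListParser()
parser.feed(SAMPLE_HTML)
parser.close()
print(parser.movies)
```

Compared with a single long pattern, this style tends to survive small markup changes better, at the cost of more code.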
Parsing Code
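Before reading parse_page, it can help to see what its pattern captures. The sketch below runs the same regular expression against a small hand-written fragment. The markup in SAMPLE_HTML is an assumption for illustration only, and the 主演: prefix is a guessed three-character label of the kind the [3:] slice would remove.

```python
import re

# Pattern copied from parse_page: three lazy groups per <li> item.
PATTERN = re.compile(
    '<li.*?qy-mod-li.*?text-score">(.*?)<.*?title.*?>(.*?)<.*?title.*?>(.*?)<',
    re.S,  # let "." match newlines, since each item spans several lines
)

# Hypothetical fragment; the real page is more complex and may differ.
SAMPLE_HTML = """
<ul>
  <li class="qy-mod-li">
    <span class="text-score">8.6</span>
    <a class="link-txt" title="Movie A">Movie A</a>
    <p class="sub" title="主演:Actor One">主演:Actor One</p>
  </li>
  <li class="qy-mod-li">
    <span class="text-score">7.9</span>
    <a class="link-txt" title="Movie B">Movie B</a>
    <p class="sub" title="主演:Actor Two">主演:Actor Two</p>
  </li>
</ul>
"""


def parse_page(html):
    # Each match is a (score, name, actor text) tuple.
    for score, name, actors in PATTERN.findall(html):
        yield {
            'Movie_Name': name,
            'Movie_actor': actors.strip()[3:],  # skip the 3-character label
            'Movie_score': score,
        }


movies = list(parse_page(SAMPLE_HTML))
for movie in movies:
    print(movie)
```

The groups arrive in page order (score, name, actors), which is why the method below reads item[0] as the rating and item[1] as the name.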
    def parse_page(self, html):
        # compile regex for movie content
        pattern = re.compile('<li.*?qy-mod-li.*?text-score">(.*?)<.*?title.*?>(.*?)<.*?title.*?>(.*?)<', re.S)
        items = re.findall(pattern, html)  # find all matches
        for item in items:
            yield {
                'Movie_Name': item[1],  # movie name
                'Movie_actor': item[2].strip()[3:],  # actor
                'Movie_score': item[0]  # rating
            }
GUI Code
import tkinter as tk
from tkinter import ttk

class gui:
    def __init__(self):
        self.root = tk.Tk()
        self.root.title("iQIYI Hot Drama Search v1.0")
        self.root.geometry("700x600")
        self.lb = tk.Label(self.root, text='请选择搜索类型')  # "Please select a search type"
        self.tt = tk.Text(self.root, width=40, height=30)
        self.cb = ttk.Combobox(self.root, width=12)
        # entries: placeholder, overall ranking, trending, top rated, newly released
        self.cb['values'] = ('请选择-----','综合排序','热播榜','好评榜','新上线')
        self.cb.current(0)
        self.cb.bind("<<ComboboxSelected>>", self.go)
        self.lb.place(x=30, y=30)
        self.cb.place(x=154, y=30)
        self.tt.place(x=30, y=60, width=400, height=600)
        self.root.mainloop()
File Writing
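The write_to_file method below appends each movie as one JSON object per line. A standalone sketch of that pattern follows, written as a plain function with an explicit path so it can run outside the GUI class. The read_from_file helper, the temporary directory, and the sample records are assumptions added for illustration.

```python
import json
import os
import tempfile


def write_to_file(content, path):
    # Append one JSON object per line; ensure_ascii=False keeps
    # Chinese text readable instead of writing \uXXXX escapes.
    with open(path, 'a', encoding='utf8') as f:
        f.write(json.dumps(content, ensure_ascii=False) + '\n')


def read_from_file(path):
    # Hypothetical helper: parse the file back, one record per non-empty line.
    with open(path, encoding='utf8') as f:
        return [json.loads(line) for line in f if line.strip()]


# Write into a throwaway directory instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), 'movie.txt')
records = [
    {'Movie_Name': '示例电影', 'Movie_actor': 'Actor One', 'Movie_score': '8.6'},
    {'Movie_Name': 'Movie B', 'Movie_actor': 'Actor Two', 'Movie_score': '7.9'},
]
for record in records:
    write_to_file(record, path)

loaded = read_from_file(path)
print(loaded == records)
```

Because the file is opened in append mode, running the scraper twice will store every movie twice; opening with 'w' once per search would be one way to avoid that.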
    def write_to_file(self, content):
        with open('movie.txt', 'a', encoding='utf8') as f:
            f.write(json.dumps(content, ensure_ascii=False) + '\n')
Combobox Event
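The handler below picks a list URL with an if/elif chain, and the four URLs differ only in one number. As an alternative sketch, the same mapping can be written as a dictionary lookup. The names SORT_CODES, URL_TEMPLATE, and url_for are hypothetical; the codes and the URL shape are copied from the handler.

```python
# Sort codes taken from the four URLs in the go handler.
SORT_CODES = {
    '综合排序': 24,  # overall ranking
    '热播榜': 11,    # trending
    '好评榜': 8,     # top rated
    '新上线': 4,     # newly released
}

URL_TEMPLATE = 'https://list.iqiyi.com/www/1/-------------{}-1-1-iqiyi--.html'


def url_for(choice):
    """Return the list URL for a dropdown entry, or None for the placeholder."""
    code = SORT_CODES.get(choice)
    return None if code is None else URL_TEMPLATE.format(code)


print(url_for('热播榜'))
print(url_for('请选择-----'))
```

With such a helper, the handler could clear the text box once, call url_for(self.cb.get()), and run self.main only when the result is not None.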
    # dropdown event
    def go(self, *arg):
        if self.cb.get() == '请选择-----':  # placeholder: just clear the text box
            self.tt.delete('1.0', 'end')
        elif self.cb.get() == '综合排序':  # overall ranking
            self.tt.delete('1.0', 'end')
            self.main('https://list.iqiyi.com/www/1/-------------24-1-1-iqiyi--.html')
        elif self.cb.get() == '热播榜':  # trending
            self.tt.delete('1.0', 'end')
            self.main('https://list.iqiyi.com/www/1/-------------11-1-1-iqiyi--.html')
        elif self.cb.get() == '好评榜':  # top rated
            self.tt.delete('1.0', 'end')
            self.main('https://list.iqiyi.com/www/1/-------------8-1-1-iqiyi--.html')
        elif self.cb.get() == '新上线':  # newly released
            self.tt.delete('1.0', 'end')
            self.main('https://list.iqiyi.com/www/1/-------------4-1-1-iqiyi--.html')
Main Function
    # main function
    def main(self, url):
        html = self.get_page(url)
        for item in self.parse_page(html):
            self.tt.insert('insert', item)
            self.tt.insert('insert', '\n')
            self.tt.update()
            self.write_to_file(item)
The resulting interface lists the extracted movies in the text box.
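The main method calls self.get_page, which this article does not show. A minimal sketch of what such a function might look like is given below, using only the standard library's urllib. The build_request helper, the User-Agent value, the timeout, and the choice to return an empty string on failure are all assumptions, not the author's implementation.

```python
import urllib.request

# Assumed browser-like header; some sites reject the default Python user agent.
HEADERS = {'User-Agent': 'Mozilla/5.0'}


def build_request(url):
    # Hypothetical helper: attach the headers to a request object.
    return urllib.request.Request(url, headers=HEADERS)


def get_page(url, timeout=10):
    """Fetch url and return its HTML as text, or '' if the request fails."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.read().decode('utf-8', errors='replace')
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return ''


request = build_request('https://list.iqiyi.com/www/1/-------------24-1-1-iqiyi--.html')
print(request.full_url)
```

Returning an empty string rather than None means the regular expression in parse_page simply finds no matches when a download fails, so the loop in main does nothing instead of raising an error.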
Conclusion
This Python project creates a functional iQIYI hot‑drama search tool that is beginner‑friendly and demonstrates practical web‑scraping, GUI design, and data persistence techniques.
Python Crawling & Data Mining
Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!
