Build a Python Web Scraper that Emails Daily Vocabulary Reminders
This tutorial walks you through a Python script that scrapes an online dictionary for word‑meaning pairs, formats them into a numbered list, and emails the list each time it runs. It uses requests and lxml for scraping, and smtplib with email.mime for delivery. Run once a day, it becomes a daily vocabulary reminder.
Preface
The author, a Python enthusiast, shares a small project, inspired by a code snippet and by the recent CET‑4/6 exams, that periodically reminds users to study English vocabulary.
Implementation Idea
The solution combines two parts: a Python web crawler that extracts words and their Chinese meanings from a web page, and an email‑sending routine that formats the data and delivers it to a specified recipient.
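The hand-off between the two parts can be sketched without any network access. This is a minimal illustration, not the author's code: `format_word_list` and `build_message` are hypothetical helper names, the sample words and addresses are made up, and the From and To headers are an addition that many SMTP servers expect.

```python
from email.header import Header
from email.mime.text import MIMEText


def format_word_list(words, meanings):
    """Pair each word with its meaning and number the lines from 1."""
    lines = []
    # zip() stops at the shorter list, so a missing meaning cannot raise IndexError.
    for number, (word, meaning) in enumerate(zip(words, meanings), start=1):
        lines.append(f"{number}. {word.strip()}  {meaning.strip()}")
    return "\n".join(lines)


def build_message(body, sender, recipient, subject="Today's vocabulary"):
    """Wrap the formatted list in a UTF-8 plain-text email message."""
    message = MIMEText(body, "plain", "utf-8")
    message["Subject"] = Header(subject, "utf-8")
    message["From"] = sender
    message["To"] = recipient
    return message


if __name__ == "__main__":
    # Hypothetical sample data standing in for scraped results.
    body = format_word_list([" apple ", "book"], ["n. 苹果", " n. 书 "])
    print(build_message(body, "me@example.com", "you@example.com").as_string())
```

Keeping the formatting step separate from the scraping and sending steps makes it easy to check the email body before any real credentials are involved.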
Implementation Process
The complete source code is shown below; configure your email address, authorization code, and recipient before running.
import random
import smtplib
from email.header import Header
from email.mime.text import MIMEText

import requests
from lxml import etree

# Fill in these three values before running, or switch to the input() prompts.
# account = input('Enter your email address: ')
# password = input('Enter your authorization code: ')
# receiver = input("Enter the recipient's email address: ")
account = 'your email address'
password = 'your authorization code'
receiver = "the recipient's email address"


def recipe_spider():
    """Fetch one randomly chosen course page and return its words as a numbered list."""
    list_all = ''
    # Each tuple is (class id, highest course number to pick from).
    choice = random.choice([(11, 226), (12, 105), (122, 35), (123, 25)])
    url = ("http://word.iciba.com/?action=words&class=" + str(choice[0])
           + "&course=" + str(random.randint(1, choice[1])))
    r = requests.get(url)
    r.encoding = r.apparent_encoding
    if r.status_code == 200:
        doc = etree.HTML(r.text)
        words = doc.xpath('//*[@class="word_main_list"]/li/div[@class="word_main_list_w"]/span//text()')
        meaning = doc.xpath('//*[@class="word_main_list"]/li/div[@class="word_main_list_s"]/span//text()')
        li = []
        # zip() stops at the shorter list, so a missing meaning cannot raise IndexError.
        for num, (word, mean) in enumerate(zip(words, meaning), start=1):
            list_all += '\n%s. %s %s\n' % (num, word.strip(), mean.strip())
            li.append({'words': word, 'meaning': mean})
        print(li)
    return list_all


def send_email(list_all):
    """Send the formatted word list through QQ Mail's SMTP server over SSL."""
    mailhost = 'smtp.qq.com'
    qqmail = smtplib.SMTP_SSL(mailhost, 465)
    try:
        qqmail.login(account, password)
        content = "Hi, here are today's words:" + list_all
        message = MIMEText(content, 'plain', 'utf-8')
        message['Subject'] = Header("Today's vocabulary", 'utf-8')
        # From and To headers are not in the original script; many servers expect them.
        message['From'] = account
        message['To'] = receiver
        qqmail.sendmail(account, receiver, message.as_string())
        print('Email sent')
    except smtplib.SMTPException as e:
        print('Failed to send email:', e)
    finally:
        qqmail.quit()


def job():
    print('Starting a run')
    list_all = recipe_spider()
    send_email(list_all)
    print('Run finished')


if __name__ == '__main__':
    job()

After setting the correct SMTP credentials, the script fetches a random set of words, assembles them into a formatted string, and sends the result via email each time it runs.
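The script runs `job()` only once per launch; the original listing imports `schedule` and `time` but never calls them. One way to get a daily reminder, sketched here with the standard library only, is to sleep until the next chosen time and then run the task. `seconds_until` and `run_daily` are hypothetical helper names, and the 08:00 default is an arbitrary assumption.

```python
import time
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the seconds from `now` until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(task, hour=8, minute=0):
    """Call `task` once a day at the given local time, forever."""
    while True:
        time.sleep(seconds_until(hour, minute))
        task()
```

Replacing the single `job()` call with `run_daily(job)` would keep the process alive and send one email per day. A system scheduler such as cron is an alternative that avoids a long-running process.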
Conclusion
The article demonstrates a practical Python project that merges web scraping with automated email delivery, offering a convenient way to receive daily vocabulary reminders and illustrating basic backend automation techniques.
