How to Build a Python Web Scraper that Sends Daily Vocabulary Emails

This tutorial builds a Python web crawler that fetches English words and their meanings, formats them as a numbered list, and emails the list over SMTP. A scheduling library can run the job each day, which makes it a practical aid for daily vocabulary practice.

Python Crawling & Data Mining

Introduction

Hello, I am a Python enthusiast sharing a small project that combines web crawling and automated email sending to help you remember vocabulary.

Implementation Idea

The approach has two parts. First, a Python web crawler requests a word-list page from word.iciba.com and uses lxml XPath queries to extract the words and their meanings. Second, an email routine formats the results and sends them via SMTP.
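The extraction step can be tried offline before touching the network. The sketch below runs the same XPath expressions used in the full script against a small, made-up HTML fragment; the fragment's markup and the helper name extract_pairs are assumptions for illustration, and the real page layout may differ.

```python
from lxml import etree

# Made-up HTML that only mimics the class names the scraper targets.
SAMPLE_HTML = """
<ul class="word_main_list">
  <li>
    <div class="word_main_list_w"><span>abandon</span></div>
    <div class="word_main_list_s"><span>v. to give up</span></div>
  </li>
  <li>
    <div class="word_main_list_w"><span> ability </span></div>
    <div class="word_main_list_s"><span>n. skill or power</span></div>
  </li>
</ul>
"""


def extract_pairs(html):
    """Return (word, meaning) tuples found in the given HTML string."""
    doc = etree.HTML(html)
    words = doc.xpath(
        '//*[@class="word_main_list"]/li/div[@class="word_main_list_w"]/span//text()')
    meanings = doc.xpath(
        '//*[@class="word_main_list"]/li/div[@class="word_main_list_s"]/span//text()')
    # zip() pairs entries by position and stops at the shorter list.
    return [(w.strip(), m.strip()) for w, m in zip(words, meanings)]


pairs = extract_pairs(SAMPLE_HTML)
for number, (word, meaning) in enumerate(pairs, start=1):
    print(f"{number}. {word}     {meaning}")
```

Pairing with zip assumes the two queries return entries in matching order, which holds for this sample but should be checked against the live page.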

Implementation Process

Below is the complete code for the project.

import random
import smtplib
import time  # used only if you schedule job() to run daily
from email.header import Header
from email.mime.text import MIMEText

import requests
import schedule  # used only if you schedule job() to run daily
from lxml import etree

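# Prompt for the credentials at startup instead of hard-coding them.
account = input('Enter your email address: ')
password = input('Enter your password (authorization code): ')
receiver = input('Enter the recipient email address: ')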

def recipe_spider():
    """Fetch one random word-list page and return it as a numbered text list."""
    list_all = ''
    # Each tuple is (class id, highest course number requested for that class).
    choice = random.choice([(11, 226), (12, 105), (122, 35), (123, 25)])
    url = "http://word.iciba.com/?action=words&class=" + str(choice[0]) + "&course=" + str(
        random.randint(1, choice[1]))
    r = requests.get(url, timeout=10)
    r.encoding = r.apparent_encoding
    if r.status_code == 200:
        doc = etree.HTML(r.text)
        words = doc.xpath('//*[@class="word_main_list"]/li/div[@class="word_main_list_w"]/span//text()')
        meaning = doc.xpath('//*[@class="word_main_list"]/li/div[@class="word_main_list_s"]/span//text()')
        li = []
        # zip() stops at the shorter list, so a length mismatch cannot raise IndexError.
        for num, (word, mean) in enumerate(zip(words, meaning), start=1):
            list_all += "\n%s. %s     %s\n" % (num, word.strip(), mean.strip())
            li.append({'words': word, 'meaning': mean})
        print(li)
    return list_all

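def send_email(list_all):
    """Send the formatted word list from account to receiver through QQ Mail."""
    mailhost = 'smtp.qq.com'
    content = "Dear, here are today's words:" + list_all
    message = MIMEText(content, 'plain', 'utf-8')
    subject = "Today's words to memorize"
    message['Subject'] = Header(subject, 'utf-8')
    # Explicit From/To headers; some servers reject messages without them.
    message['From'] = account
    message['To'] = receiver
    qqmail = smtplib.SMTP_SSL(mailhost, 465)
    try:
        qqmail.login(account, password)
        qqmail.sendmail(account, receiver, message.as_string())
        print('Email sent successfully')
    except smtplib.SMTPException as exc:
        print('Failed to send email:', exc)
    finally:
        qqmail.quit()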

def job():
    print('Starting a task')
    list_all = recipe_spider()
    send_email(list_all)
    print('Task finished')

if __name__ == '__main__':
    job()

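After you enter your email address, authorization code, and recipient address, the script fetches a random page of words, formats them as a numbered list, and sends them through smtp.qq.com over SSL on port 465. As written, the script calls job once and exits; the schedule library, already imported, can trigger job daily for automatic reminders.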
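To check what will be sent without logging in to a mail server, the message can be built and inspected on its own. This is a minimal sketch using only the standard library; the helper name build_message, the example.com addresses, and the English subject and greeting are placeholders, not part of the original script.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_message(sender, recipient, word_list):
    """Build the UTF-8 plain-text email without sending it."""
    body = "Today's words:\n" + word_list
    message = MIMEText(body, "plain", "utf-8")
    message["Subject"] = Header("Words to learn today", "utf-8")
    message["From"] = sender
    message["To"] = recipient
    return message


# Placeholder addresses and a one-entry list in the script's output format.
message = build_message(
    "me@example.com",
    "you@example.com",
    "\n1. abandon     v. to give up\n",
)

# as_string() yields the raw text that smtplib's sendmail() would transmit.
print(message.as_string())
```

Because the charset is utf-8, the body appears encoded in the raw output; message.get_payload(decode=True) returns the original bytes if you want to read it back.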

Running the script sends an email containing the numbered word list to the recipient.

You can also schedule the script to run automatically each day, helping you accumulate vocabulary for exams such as the CET-4 and CET-6.

Conclusion

This article demonstrates a small project that combines Python web crawling and automated email sending to remind users to study vocabulary.
