Operations 10 min read

Automating Report Generation with Python: Data Extraction, Visualization, and Email Delivery

This tutorial explains how data analysts can automate the creation, processing, and distribution of statistical reports using Python by reading various data sources, performing DataFrame calculations, generating visualizations, and sending the results via email, thereby saving time and reducing errors.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Automating Report Generation with Python: Data Extraction, Visualization, and Email Delivery

Project Background

Data analysts often spend excessive time manually creating statistical reports, which detracts from actual data analysis. Automating report generation with Python allows analysts to focus on extracting insights rather than repetitive chart creation.

Purpose of Automation

Automation aims to save time, improve efficiency, and reduce human errors by programmatically handling repetitive reporting tasks.

Scope of Automation

Not all reports need automation; focus on frequently used reports such as customer lists, sales flow, churn, and period‑over‑period comparisons. Consider report frequency, development effort, and overall workflow when deciding which reports to automate.

Implementation Steps

Step 1: Reading Data Sources

Data is loaded into pandas DataFrames from Excel, JSON, SQL, or CSV files. Example code:

import pandas as pd
import json
import pymysql
from sqlalchemy import create_engine

# Open database connection
conn = pymysql.connect(host='localhost', port=3306, user='root', passwd='xxxx', charset='utf8')
engine = create_engine('mysql+pymysql://root:xxxx@localhost/mysql?charset=utf8')

def read_excel(file):
    df_excel = pd.read_excel(file)
    return df_excel

def read_json(file):
    with open(file, 'r') as json_f:
        df_json = pd.read_json(json_f)
        return df_json

def read_sql(table):
    sql_cmd = 'SELECT * FROM %s' % table
    df_sql = pd.read_sql(sql_cmd, engine)
    return df_sql

def read_csv(file):
    df_csv = pd.read_csv(file)
    return df_csv

Step 2: DataFrame Calculations

Using a user information example, clean the data by removing rows with missing or erroneous city values, then compute value counts for the "city_num" column.

df = df[df['city_num'].notna()]
df = df.drop(df[df['city_num'] == 'error'].index)
city_counts = df['city_num'].value_counts()

Generate a pie chart of the top 10 cities:

def pie_chart(df):
    df = df[df['city_num'].notna()]
    df = df.drop(df[df['city_num'] == 'error'].index)
    df = df['city_num'].value_counts()
    df.head(10).plot.pie(subplots=True, figsize=(5,6), autopct='%.2f%%', radius=1.2, startangle=250, legend=False)

pie_chart(read_csv('user_info.csv'))

Alternatively, use pyecharts for a more attractive chart:

from pyecharts.charts import Pie
from pyecharts import options as opts

def echart_pie(user_df):
    user_df = user_df[user_df['city_num'].notna()]
    user_df = user_df.drop(user_df[user_df['city_num'] == 'error'].index)
    user_df = user_df['city_num'].value_counts()
    names = user_df.head(10).index.tolist()
    values = user_df.head(10).values.tolist()
    data = list(zip(names, values))
    pie = Pie()
    pie.add('', data)
    pie.set_global_opts(title_opts=opts.TitleOpts(title='Top 10 Regions'))
    return pie.render_notebook()

user_df = read_csv('user_info.csv')
echart_pie(user_df)

Save the chart as an image:

plt.savefig('fig_cat.png')

Step 3: Automatic Email Sending

Use smtplib and email modules to send the generated report and attachments.

import smtplib
from email import encoders
from email.header import Header
from email.utils import parseaddr, formataddr
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

asender = "[email protected]"
areceiver = "[email protected]"
acc = "[email protected]"
asubject = "谢谢关注"
from_addr = "[email protected]"
password = "####"

msg = MIMEMultipart()
msg['Subject'] = asubject
msg['to'] = areceiver
msg['Cc'] = acc
msg['from'] = "fanstuck"

body = "你好,欢迎关注fanstuck,您的关注就是我继续创作的动力!"
msg.attach(MIMEText(body, 'plain', 'utf-8'))

htmlFile = 'C:/Users/10799/problem.html'
html = MIMEApplication(open(htmlFile, 'rb').read())
html.add_header('Content-Disposition', 'attachment', filename='html')
msg.attach(html)

smtp_server = "smtp.163.com"
server = smtplib.SMTP(smtp_server, 25)
server.set_debuglevel(1)
server.login(from_addr, password)
server.sendmail(from_addr, areceiver.split(',') + acc.split(','), msg.as_string())
server.quit()

Running the script sends the report, chart image, and HTML attachment to the specified recipients.

Pythonautomationdata analysisvisualizationEmailreport-generation
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.