Generate Automated Survey Reports with Python in Minutes
This guide introduces a Python tool that automates the creation of descriptive and cross‑analysis survey reports in PPT and Excel formats, covering installation, data preparation, quick‑start code snippets, and additional utility functions for comprehensive statistical reporting.
Solution Overview
Tool package: https://github.com/gasongjian/reportgen
Project address: https://github.com/gasongjian/ (star or fork)
Software dependencies: Python 3 (Python 2 also works, but Chinese text requires a manual patch to python-pptx; see the note under Preparation)
Data requirements: survey data from Wenjuanxing, Wenjuanwang, etc.
Main feature 1: Automatically generate overview reports (PPT) with frequency statistics and charts for each question.
Main feature 2: Automatic cross‑analysis reports (chi‑square test, TGI, CHI metrics, simple conclusions).
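The cross-analysis metrics named above can be illustrated on a small contingency table. The sketch below is not taken from reportgen. It assumes that FE means the expected frequency under independence, that TGI is an option's share within a group divided by its share among all respondents, times 100, and that the chi-square statistic is the usual Pearson sum. The function name cross_metrics and the sample counts are hypothetical.

```python
# Illustrative sketch only: the name and formulas are assumptions, not reportgen's code.

def cross_metrics(observed):
    """observed maps each answer option to a list of counts, one per group.

    Returns (expected, tgi, chi2, dof). Assumes every row and column total is > 0.
    """
    rows = list(observed)
    n_cols = len(observed[rows[0]])
    col_totals = [sum(observed[r][j] for r in rows) for j in range(n_cols)]
    row_totals = {r: sum(observed[r]) for r in rows}
    grand = sum(col_totals)

    expected, tgi = {}, {}
    chi2 = 0.0
    for r in rows:
        expected[r], tgi[r] = [], []
        overall_share = row_totals[r] / grand          # share among all respondents
        for j in range(n_cols):
            fo = observed[r][j]                        # observed frequency
            fe = row_totals[r] * col_totals[j] / grand # expected under independence
            group_share = fo / col_totals[j]           # share within this group
            expected[r].append(fe)
            tgi[r].append(100 * group_share / overall_share)
            chi2 += (fo - fe) ** 2 / fe                # Pearson chi-square term
    dof = (len(rows) - 1) * (n_cols - 1)
    return expected, tgi, chi2, dof


# Hypothetical counts: two answer options, two groups (for example, male and female).
observed = {"Option A": [30, 10], "Option B": [20, 40]}
expected, tgi, chi2, dof = cross_metrics(observed)
print(expected["Option A"], tgi["Option A"], round(chi2, 2), dof)
```

In this example the TGI of 150 for Option A in the first group means that group chooses Option A 1.5 times as often as respondents overall.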
Preparation
Required environment:
Install Anaconda (Python 3 recommended) to get the scientific packages.
Install the third-party package python-pptx: pip install python-pptx
Install the report package by placing report/report.py in your working directory or in site-packages.
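After these steps, a quick check can confirm that the modules are importable before running the examples. This is a minimal sketch using only the standard library. The helper name missing_modules is hypothetical, and it assumes that python-pptx is imported under the name pptx and that report.py sits on the import path.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "pptx" is the import name of python-pptx; "report" is the file report.py.
missing = missing_modules(["pptx", "report"])
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All required modules were found.")
```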
Note
Under Python 2.7, python-pptx has a Chinese-font bug. To work around it, edit .\pptx\chart\xmlwriter.py at lines 1338 and 1373 and replace escape(str(name)) with escape(unicode(name)).
Quick Start
For users who do not know or do not want to learn Python 3, see this.
For Windows users, download the reportgen folder from the project and follow the included instructions. A Baidu Cloud link is also provided (password: as84).
3.1 Three‑line code for descriptive statistics report
import report as rpt
# data encoding and import
# 300_300_0.xls is Wenjuanxing text data, 300_300_2.xls is Wenjuanxing numeric data.
data, code = rpt.wenjuanxing(['300_300_0.xls', '300_300_2.xls'])
# generate descriptive statistics report
rpt.summary_chart(data, code, filename=u'调研报告初稿')
The above code creates two files in the .\out\ folder (the file name 调研报告初稿 means "survey report, first draft"):
调研报告初稿.pptx: descriptive statistics for each question, supporting single-choice, multiple-choice, ranking, matrix single-choice, and other question types.
调研报告初稿.xlsx: statistical data (frequency and proportion) for each question.
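To make "frequency and proportion" concrete, the sketch below computes both for a single-choice and a multiple-choice question using only the standard library. It is not reportgen's implementation. The function names and sample answers are hypothetical, and it assumes that unanswered questions are stored as None and that multiple-choice proportions are taken relative to the number of respondents who answered.

```python
from collections import Counter


def frequency_table(answers):
    """Single-choice question: return {option: (count, proportion)}."""
    valid = [a for a in answers if a is not None]   # drop unanswered entries
    counts = Counter(valid)
    n = len(valid)
    return {opt: (cnt, cnt / n) for opt, cnt in counts.most_common()}


def multi_choice_table(answers):
    """Multiple-choice question: each answer is a list of selected options."""
    valid = [a for a in answers if a is not None]
    # set() guards against an option being listed twice by one respondent
    counts = Counter(opt for selected in valid for opt in set(selected))
    n = len(valid)                                  # respondents, not selections
    return {opt: (cnt, cnt / n) for opt, cnt in counts.most_common()}


# Hypothetical answers for illustration.
q1 = ["Male", "Female", "Female", None, "Male", "Female"]
q2 = [["A", "B"], ["A"], None, ["B", "C"], ["A", "C"]]
print(frequency_table(q1))
print(multi_choice_table(q2))
```

Because respondents can select several options, the multiple-choice proportions can add up to more than 1.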
3.2 Four‑line code for cross‑analysis report
import report as rpt
data, code = rpt.wenjuanxing()
save_dstyle = ['FE', 'TGI', 'CHI'] # choose metrics to save
rpt.cross_chart(data, code, cross_class='Q1', filename=u'性别差异分析', save_dstyle=save_dstyle)
This generates five files in .\out\ (the file name 性别差异分析 means "gender difference analysis", and 百分比 means "percentage"):
性别差异分析.pptx: gender-based differences for each question.
性别差异分析_百分比.xlsx
性别差异分析_FE.xlsx
性别差异分析_TGI.xlsx
性别差异分析_CHI.xlsx
3.3 Other useful functions
import report as rpt
# File I/O
data = rpt.read_data(filename)
code = rpt.read_code(filename)
rpt.save_data(data, filename, code)
rpt.save_code(code, filename)
# Single‑variable frequency table
t, t1 = rpt.qtable(data, code, 'Q1')
# Two‑variable cross table
t, t1 = rpt.qtable(data, code, 'Q1', 'Q2')
# Contingency table analysis
cdata = rpt.contingency(fo)
# Goodness‑of‑fit test
rpt.gof_test(fo, fe)
# Chi‑square test
rpt.chi2_test(fo, fe)
# Binomial confidence interval
rpt.binomial_interval(p, n)
# Automatic descriptive report
rpt.summary_chart(data, code, filename=u'描述统计报告', summary_qlist=None, max_column_chart=20)
# Automatic cross report
rpt.cross_chart(data, code, cross_class='Q1', filename=u'交叉分析', cross_qlist=None, plt_dstyle='TGI', save_dstyle=None)
Author: JSong, 2017-02-28
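As background for the binomial confidence interval helper listed in section 3.3, the sketch below shows one common way to compute such an interval: the normal-approximation (Wald) interval for a proportion p observed in a sample of size n. The source does not say which method rpt.binomial_interval uses, so treat this as an assumption. The name wald_interval and the confidence parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion p with sample size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical case: 40% of 300 respondents chose an option.
low, high = wald_interval(0.4, 300)
print(round(low, 4), round(high, 4))
```

The approximation is weakest for small samples or proportions near 0 or 1, where an interval such as Wilson's is usually preferred.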
