Transform Raw Taobao Data into Stunning Interactive Charts with Python
This article walks you through cleaning messy Taobao product data using pandas and jieba, then visualizing ingredient and shelf‑life information with interactive Pyecharts charts—including pie, bar, table, funnel, and polar graphs—while showing how to combine multiple charts into a single draggable HTML page.
Introduction
Hello, I am a Python enthusiast. In the previous article we pre‑processed Taobao data and performed word‑frequency analysis. This article continues from that dataset and demonstrates how to visualize the results using Python.
Visualization
1. Ingredient Pie Chart
We use a pie chart to display ingredient statistics, making the data look more polished.
# Generate ingredient chart
import jieba
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Pie

def get_ingredients_html(df):
    # Tokenize ingredient column, one token per row
    names = df.配料表.apply(jieba.lcut).explode()
    # Drop single-character tokens (mostly punctuation), then count the rest
    df1 = names[names.apply(len) > 1].value_counts()
    # Write tokenized results to Excel
    with pd.ExcelWriter("淘宝商品配料数据.xlsx") as writer:
        df1.to_excel(writer, sheet_name="配料")
    # Path of the file written above; adjust it to your own working directory
    fpath = r'C:\Users\pdcfi\Desktop\淘宝数据分析\淘宝商品配料数据.xlsx'
    # Read data and extract the ten most frequent tokens and their counts
    df1 = pd.read_excel(fpath, header=None, skiprows=1, sheet_name='配料', names=['sx', 'sl'])
    a = df1['sx'].to_list()[:10]
    b = df1['sl'].to_list()[:10]
    # Create pie chart
    pie = (Pie()
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="淘宝商品数据配料统计", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    pie.render('淘宝商品数据配料统计.html')
Running the script generates 淘宝商品数据配料统计.html, which displays an interactive pie chart.
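The tokenize, filter, and count step used in the function above can be tried on a small example without jieba or the Excel round trip. The sketch below is not the author's code: the token lists are made-up stand-ins for the output of jieba.lcut, and reading the top entries directly from the Series is an assumed shortcut. It also shows the percentage that the "{d}" placeholder in the label formatter stands for.

```python
import pandas as pd

# Hypothetical, already-tokenized ingredient lists standing in for the
# result of df.配料表.apply(jieba.lcut); the values are made up.
tokenized = pd.Series([
    ["白砂糖", "、", "食用盐", "、", "水"],
    ["白砂糖", "、", "小麦粉"],
    ["白砂糖", "、", "食用盐"],
])

# One token per row, as in the explode() step.
names = tokenized.explode()

# Drop single-character tokens (punctuation and one-character words),
# then count how often each remaining token occurs.
counts = names[names.str.len() > 1].value_counts()

# Take the most frequent tokens straight from the Series instead of
# writing them to Excel and reading the file back.
top = counts.head(10)
labels = top.index.tolist()
values = top.tolist()

# Same [name, value] pair structure that Pie.add() receives.
pairs = [list(z) for z in zip(labels, values)]

# Share of each slice in percent, which is what "{d}" displays.
total = sum(values)
shares = {name: round(100 * value / total, 2) for name, value in pairs}

print(pairs)
print(shares)
```

Because single-character tokens are dropped, genuine one-character ingredients such as 水 are excluded along with the punctuation, which is worth keeping in mind when reading the chart.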
2. Shelf‑Life Pie Chart
We also visualize shelf‑life data with a pie chart, which initially looks less appealing.
# Generate shelf‑life pie chart
import jieba
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Pie

def get_date_html(df):
    # Tokenize shelf‑life column
    names = df.保质期.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    # Write tokenized results to Excel
    with pd.ExcelWriter("淘宝商品保质期数据.xlsx") as writer:
        df1.to_excel(writer, sheet_name="保质期")
    fpath = r'C:\Users\pdcfi\Desktop\淘宝数据分析\淘宝商品保质期数据.xlsx'
    # Read data and extract columns
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['bzq', 'rq'])
    a = df1['bzq'].to_list()[:10]
    b = df1['rq'].to_list()[:10]
    pie = (Pie()
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="淘宝商品保质期可视化图表", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    pie.render('淘宝商品保质期统计.html')
3. Shelf‑Life Bar Chart
Because a pie chart looks odd for shelf‑life, we switch to a bar chart.
# Generate shelf‑life bar chart
import jieba
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Bar
from pyecharts.globals import ThemeType

def get_date_html(df):
    # Tokenize shelf‑life column
    names = df.保质期.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    with pd.ExcelWriter("淘宝数据.xlsx") as writer:
        df1.to_excel(writer, sheet_name="保质期")
    fpath = r'C:\Users\dell\Desktop\崔佬\数据分析综合实战\淘宝数据.xlsx'
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['bzq', 'rq'])
    a = df1['bzq'].to_list()[:50]
    b = df1['rq'].to_list()[:50]
    # Note: b holds token frequencies from value_counts(), although the series is named 保质期(天数)
    bar = (Bar(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
           .add_xaxis(a)
           .add_yaxis("保质期(天数)", b)
           .set_global_opts(title_opts=opts.TitleOpts(title="Bar-DataZoom(slider-保质期)"),
                            datazoom_opts=opts.DataZoomOpts()))
    return bar
4. Combine Pie and Bar into One HTML
We place both charts into a draggable layout.
# Combine charts into a draggable page
from pyecharts.charts import Page

def page_draggable_layout(df):
    page = Page(layout=Page.DraggablePageLayout)
    # Each function passed to page.add() must return a chart object
    page.add(
        get_ingredients_html(df),
        get_date_html(df)
    )
    page.render("page_draggable_layout.html")
5. Table Display
To show the raw DataFrame as a table, we use Pyecharts' Table component (code omitted for brevity). The resulting table is displayed alongside the charts.
6. Adjust Chart Background
We apply a dark theme to the pie chart to make it look more professional.
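# Apply CHALK theme to pie chart (replaces the chart-building part of get_ingredients_html)
# Requires: from pyecharts.globals import ThemeType
    pie = (Pie(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="配料统计", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    # Return the chart instead of rendering it, so that Page.add() can use it
    return pie
7. Funnel Chart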
Using the “食品添加剂” (food additives) column, we create a funnel chart.
# Generate funnel chart
import jieba
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Funnel
from pyecharts.globals import ThemeType

def get_sptj_data(df):
    names = df.食品添加剂.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    with pd.ExcelWriter("淘宝数据.xlsx") as writer:
        df1.to_excel(writer, sheet_name="食品添加剂")
    fpath = r'C:\Users\dell\Desktop\崔佬\数据分析综合实战\淘宝数据.xlsx'
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['sptj', 'sj'])
    a = df1['sptj'].to_list()[:10]
    b = df1['sj'].to_list()[:10]
    funnel = (Funnel(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
              .add("商品", [list(z) for z in zip(a, b)],
                   label_opts=opts.LabelOpts(position="inside"))
              .set_global_opts(title_opts=opts.TitleOpts(title="Funnel-Label(food_add)")))
    return funnel
8. Polar Chart
Finally, we add a polar scatter chart for fun.
# Generate polar chart
def zb_data():
    import random
    data = [(i, random.randint(1, 100)) for i in range(10)]
    from pyecharts.charts import Polar
    from pyecharts import options as opts
    polar = (Polar()
             .add('', data, type_="effectScatter",
                  effect_opts=opts.EffectOpts(scale=10, period=5),
                  label_opts=opts.LabelOpts(is_show=False))
             .set_global_opts(title_opts=opts.TitleOpts(title="Polar-没啥用,用来装逼,小小明yyds")))
    return polar
Conclusion
Using pandas for data cleaning, jieba for Chinese word segmentation, and Pyecharts for multi‑type visualizations (pie, bar, table, funnel, polar), we turned a chaotic Taobao dataset into a series of interactive charts. Every chart follows the same tokenize, count, and plot pattern, so the functions are straightforward to adapt to other columns.