
Transform Raw Taobao Data into Stunning Interactive Charts with Python

This article walks you through cleaning messy Taobao product data using pandas and jieba, then visualizing ingredient and shelf‑life information with interactive Pyecharts charts—including pie, bar, table, funnel, and polar graphs—while showing how to combine multiple charts into a single draggable HTML page.

Python Crawling & Data Mining

Introduction

Hello, I am a Python enthusiast. In the previous article we pre‑processed Taobao data and performed word‑frequency analysis. This article continues from that dataset and demonstrates how to visualize the results using Python.

Visualization

1. Ingredient Pie Chart

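We use a pie chart to show the ten most frequent ingredient tokens: the function segments the 配料表 column with jieba, drops single-character tokens, counts the rest, and plots the top ten as a rose-style pie.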

# Generate ingredient chart
import jieba
import pandas as pd

def get_ingredients_html(df):
    # Tokenize the ingredient column and count tokens longer than one character
    names = df.配料表.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    # Write the counts to Excel, then read them back from the same relative path
    fpath = "淘宝商品配料数据.xlsx"
    with pd.ExcelWriter(fpath) as writer:
        df1.to_excel(writer, sheet_name="配料")
    # Read data and extract the top 10 tokens (sx) and their counts (sl)
    df1 = pd.read_excel(fpath, header=None, skiprows=1, sheet_name='配料', names=['sx', 'sl'])
    a = df1['sx'].to_list()[:10]
    b = df1['sl'].to_list()[:10]
    from pyecharts.charts import Pie
    from pyecharts import options as opts
    # Create a rose-style pie chart from [name, count] pairs
    pie = (Pie()
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="淘宝商品数据配料统计", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    pie.render('淘宝商品数据配料统计.html')
    # Return the chart so it can also be placed on a combined page
    return pie

Running the script generates 淘宝商品数据配料统计.html, which displays an interactive pie chart.
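
The counting step inside get_ingredients_html can be hard to picture without data. The sketch below reproduces the same logic (tokenize, drop single-character tokens, count, keep the top N) using only the standard library. The sample strings and the `split_tokens` helper are invented for illustration; the article itself relies on `jieba.lcut` and `value_counts()`.

```python
from collections import Counter

# Hypothetical sample rows standing in for the 配料表 column.
SAMPLE_ROWS = ["白砂糖、小麦粉、水", "小麦粉、食用盐、水", "白砂糖、小麦粉"]


def split_tokens(text):
    """Stand-in tokenizer (assumption): split on the Chinese list comma.

    The article uses jieba.lcut at this point instead.
    """
    return [token for token in text.split("、") if token]


def top_tokens(rows, n=10, tokenizer=split_tokens):
    """Return the n most common multi-character tokens as [name, count] pairs.

    This is the same shape of data that the article passes to Pie.add().
    """
    counts = Counter(
        token
        for row in rows
        for token in tokenizer(row)
        if len(token) > 1  # mirrors names[names.apply(len) > 1]
    )
    return [[name, count] for name, count in counts.most_common(n)]


pairs = top_tokens(SAMPLE_ROWS)
print(pairs)
```

The single-character token 水 is filtered out here, just as single-character jieba tokens are dropped in the article's code.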

2. Shelf‑Life Pie Chart

We also visualize shelf‑life data with a pie chart, which initially looks less appealing.

# Generate shelf‑life pie chart
def get_date_html(df):
    # Tokenize the shelf‑life column and count tokens longer than one character
    names = df.保质期.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    # Write the counts to Excel, then read them back from the same relative path
    fpath = "淘宝商品保质期数据.xlsx"
    with pd.ExcelWriter(fpath) as writer:
        df1.to_excel(writer, sheet_name="保质期")
    # Read data and extract the top 10 tokens (bzq) and their counts (rq)
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['bzq', 'rq'])
    a = df1['bzq'].to_list()[:10]
    b = df1['rq'].to_list()[:10]
    from pyecharts.charts import Pie
    from pyecharts import options as opts
    pie = (Pie()
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="淘宝商品保质期可视化图表", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    pie.render('淘宝商品保质期统计.html')
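
In the label formatter used by both pie charts, ECharts substitutes `{b}` with the slice name and `{d}` with that slice's percentage of the series total. As a rough illustration of the arithmetic, using invented counts, the labels can be approximated as follows. ECharts applies its own rounding, so the last digit may differ from this sketch, and the `pie_labels` helper is hypothetical.

```python
def pie_labels(pairs, digits=2):
    """Approximate the text produced by formatter="{b}: {d}%"."""
    total = sum(value for _, value in pairs)
    labels = []
    for name, value in pairs:
        # Guard against an empty or all-zero series.
        percent = round(value * 100 / total, digits) if total else 0
        labels.append(f"{name}: {percent:g}%")
    return labels


# Invented counts, for illustration only.
labels = pie_labels([["12个月", 6], ["18个月", 3], ["6个月", 1]])
print(labels)
```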

3. Shelf‑Life Bar Chart

Because a pie chart looks odd for shelf‑life, we switch to a bar chart.

# Generate shelf‑life bar chart
def get_date_html(df):
    # Tokenize the shelf‑life column and count tokens longer than one character
    names = df.保质期.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    # Write the counts to Excel, then read them back from the same relative path
    fpath = "淘宝数据.xlsx"
    with pd.ExcelWriter(fpath) as writer:
        df1.to_excel(writer, sheet_name="保质期")
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['bzq', 'rq'])
    # Keep the top 50 tokens (bzq) and their counts (rq)
    a = df1['bzq'].to_list()[:50]
    b = df1['rq'].to_list()[:50]
    from pyecharts.charts import Bar
    from pyecharts import options as opts
    from pyecharts.globals import ThemeType
    # Dark-themed bar chart with a data-zoom control on the x axis
    bar = (Bar(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
           .add_xaxis(a)
           .add_yaxis("保质期(天数)", b)
           .set_global_opts(title_opts=opts.TitleOpts(title="Bar-DataZoom(slider-保质期)"),
                            datazoom_opts=opts.DataZoomOpts()))
    return bar

4. Combine Pie and Bar into One HTML

We place both charts into a draggable layout.

# Combine charts into a draggable page
def page_draggable_layout(df):
    from pyecharts.charts import Page
    page = Page(layout=Page.DraggablePageLayout)
    # Both functions must return chart objects for page.add() to work
    page.add(
        get_ingredients_html(df),
        get_date_html(df)
    )
    page.render("page_draggable_layout.html")
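
With `DraggablePageLayout`, the rendered page only lets you arrange the charts in the browser. The pyecharts documentation describes a second step: the page's "Save Config" button downloads a JSON layout file, and `Page.save_resize_html` writes a new HTML file with that layout applied. A minimal sketch of that second step follows. The file names `chart_config.json` and `page_final.html` are assumptions, and `finalize_layout` is a hypothetical helper. The import sits inside the function, as elsewhere in this article.

```python
def finalize_layout(source="page_draggable_layout.html",
                    cfg_file="chart_config.json",
                    dest="page_final.html"):
    """Apply a saved drag-and-drop layout to an already rendered page.

    source:   HTML produced by page.render() with DraggablePageLayout
    cfg_file: layout JSON saved from the browser (file name is an assumption)
    dest:     output HTML with the layout applied (file name is an assumption)
    """
    # Imported lazily, so pyecharts is only needed when the function is called.
    from pyecharts.charts import Page
    return Page.save_resize_html(source, cfg_file=cfg_file, dest=dest)
```

Call `finalize_layout()` only after the layout file has been saved from the browser; until then the source page itself remains usable.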

5. Table Display

To show the raw DataFrame as a table, we use Pyecharts' Table component (code omitted for brevity). The resulting table is displayed alongside the charts.
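
Since the table code is omitted, here is a minimal sketch of how such a table might be built with `pyecharts.components.Table`. It is not the author's code: the `table_parts` and `build_table` helpers, the default title, and the 20-row limit are all assumptions chosen for illustration.

```python
def table_parts(columns, records):
    """Convert column names and row records into plain lists of strings."""
    headers = [str(column) for column in columns]
    rows = [[str(value) for value in record] for record in records]
    return headers, rows


def build_table(df, title="淘宝商品数据", limit=20):
    """Build a pyecharts Table from the first `limit` rows of a DataFrame."""
    # Imported lazily, so pyecharts is only needed when the function is called.
    from pyecharts.components import Table
    from pyecharts.options import ComponentTitleOpts

    head = df.head(limit)
    headers, rows = table_parts(head.columns, head.itertuples(index=False))
    return (Table()
            .add(headers, rows)
            .set_global_opts(title_opts=ComponentTitleOpts(title=title)))


# Pure-Python check of the conversion step, with invented values.
headers, rows = table_parts(["配料表", "保质期"], [("小麦粉", 180), ("白砂糖", 365)])
print(headers, rows)
```

The object returned by `build_table(df)` can be rendered on its own with `.render()` or passed to `page.add()` next to the charts.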

6. Adjust Chart Background

We apply a dark theme to the pie chart to make it look more professional.

# Apply CHALK theme to pie chart (replaces the pie construction inside get_ingredients_html)
    from pyecharts.globals import ThemeType
    pie = (Pie(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
           .add('', [list(z) for z in zip(a, b)], radius=["20%", "60%"], rosetype="radius")
           .set_global_opts(title_opts=opts.TitleOpts(title="配料统计", subtitle="8.19"))
           .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%")))
    return pie

7. Funnel Chart

Using the “食品添加剂” column we create a funnel chart.

# Generate funnel chart
def get_sptj_data(df):
    # Tokenize the food-additive column and count tokens longer than one character
    names = df.食品添加剂.apply(jieba.lcut).explode()
    df1 = names[names.apply(len) > 1].value_counts()
    # Write the counts to Excel, then read them back from the same relative path
    fpath = "淘宝数据.xlsx"
    with pd.ExcelWriter(fpath) as writer:
        df1.to_excel(writer, sheet_name="食品添加剂")
    df1 = pd.read_excel(fpath, header=None, skiprows=1, names=['sptj', 'sj'])
    # Keep the top 10 tokens (sptj) and their counts (sj)
    a = df1['sptj'].to_list()[:10]
    b = df1['sj'].to_list()[:10]
    from pyecharts.charts import Funnel
    from pyecharts import options as opts
    from pyecharts.globals import ThemeType
    funnel = (Funnel(init_opts=opts.InitOpts(theme=ThemeType.CHALK))
              .add("商品", [list(z) for z in zip(a, b)],
                   label_opts=opts.LabelOpts(position="inside"))
              .set_global_opts(title_opts=opts.TitleOpts(title="Funnel-Label(food_add)")))
    return funnel

8. Polar Chart

Finally, we add a polar scatter chart for fun.

# Generate polar chart
def zb_data():
    import random
    data = [(i, random.randint(1, 100)) for i in range(10)]
    from pyecharts.charts import Polar
    from pyecharts import options as opts
    polar = (Polar()
             .add('', data, type_="effectScatter",
                  effect_opts=opts.EffectOpts(scale=10, period=5),
                  label_opts=opts.LabelOpts(is_show=False))
             .set_global_opts(title_opts=opts.TitleOpts(title="Polar-没啥用,用来装逼,小小明yyds")))
    return polar
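
Because `random.randint` draws fresh values on every call, the polar chart changes each time the script runs. If a reproducible chart is preferred, one option (not part of the original code) is to draw the points from a seeded `random.Random` instance. The `polar_points` helper and the seed value below are illustrative assumptions.

```python
import random


def polar_points(n=10, seed=42):
    """Return n (index, value) pairs with values in 1..100, fixed by the seed."""
    rng = random.Random(seed)  # private generator; leaves the global random state alone
    return [(i, rng.randint(1, 100)) for i in range(n)]


data = polar_points()
print(data)
```

The resulting list has the same shape as `data` in zb_data, so it could be passed to `Polar().add()` unchanged.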

Conclusion

Using pandas for data cleaning, jieba for Chinese word segmentation, and Pyecharts for visualization (pie, bar, table, funnel, and polar charts), we turned raw Taobao product data into a set of interactive charts that can be combined on a single draggable HTML page.

Written by

Python Crawling & Data Mining
