
Generating a Word Cloud and a Pie Chart from a News Article Using Python

This article demonstrates how to scrape a news webpage with Python, extract and segment its Chinese text using jieba, count word frequencies, and visualize the top ten terms as a word cloud and a pie chart with pyecharts.

Python Programming Learning Circle

The article shows a step‑by‑step method to process a news article and create visual representations of the most frequent words.

Solution steps: 1) Fetch the news page and extract the article text; 2) Split the text into words with jieba; 3) Count each word's occurrences, ignoring single-character tokens, and keep the top 10; 4) Generate a word cloud and a pie chart from those ten words.

Code example:

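import jieba
import requests
from bs4 import BeautifulSoup
from pyecharts.charts import Pie, WordCloud

if __name__ == "__main__":
    # Send a browser-like User-Agent with the request.
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36"
    }
    url = "https://new.qq.com/rain/a/20230315A08LAK00"
    # Step 1: download the page; fail early on a timeout or an HTTP error status.
    res = requests.get(url, headers=headers, timeout=10)
    res.raise_for_status()
    res_html = res.text
    # Parse the HTML (the "lxml" parser requires the lxml package) and
    # take the text of the first element with class "content-article".
    soup = BeautifulSoup(res_html, "lxml")
    txt = soup.select(".content-article")[0].text
    # Step 2: segment the Chinese text into a list of words.
    words = jieba.lcut(txt)
    # Step 3: count occurrences, skipping single-character tokens
    # such as punctuation and particles.
    counts = {}
    for word in words:
        if len(word) == 1:
            continue
        counts[word] = counts.get(word, 0) + 1
    # Keep the ten most frequent (word, count) pairs.
    sort_data = sorted(counts.items(), key=lambda a: a[1], reverse=True)[:10]
    # Step 4: render the word cloud to 1.html.
    wc = WordCloud()
    wc.add("", sort_data, word_size_range=[20, 100])
    wc.render("1.html")
    # Render the pie chart to 2.html; the series name "次数" means "count".
    pie = Pie()
    pie.add(series_name="次数", data_pair=sort_data)
    pie.render("2.html")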
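
The counting loop in the script can also be written with collections.Counter from the standard library. The following is a minimal sketch, not the original author's code: the helper name top_words and the sample token list are hypothetical, and the tokens are assumed to be already segmented (for example by jieba.lcut).

```python
from collections import Counter


def top_words(tokens, n=10):
    """Return the n most frequent tokens as (word, count) pairs."""
    # Skip single-character tokens, mirroring the len(word) == 1 check in the script.
    counts = Counter(token for token in tokens if len(token) != 1)
    # most_common sorts by count in descending order; ties keep first-seen order.
    return counts.most_common(n)


# Hypothetical, already-segmented tokens standing in for jieba.lcut(txt).
sample_tokens = ["经济", "的", "发展", "经济", "，", "市场", "发展", "经济"]
print(top_words(sample_tokens, n=2))
```

The result is a list of (word, count) tuples, the same shape as sort_data in the script, so it could be passed to WordCloud.add and Pie.add in the same way.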

Running the script writes the word cloud to 1.html and the pie chart to 2.html. Both display the ten most frequent words extracted from the article, providing a quick visual insight into its content.
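
A pie chart sizes each slice by its value relative to the sum of the values it receives, so the slices here reflect each word's share among the top ten words only, not among all words in the article. The sketch below makes that arithmetic explicit; the function slice_shares and the sample pairs are hypothetical and only illustrate the calculation.

```python
def slice_shares(pairs):
    """Map each word to its percentage of the total count in pairs."""
    total = sum(count for _, count in pairs)
    if total == 0:
        # No data: nothing to divide by, so return an empty mapping.
        return {}
    return {word: round(100 * count / total, 1) for word, count in pairs}


# Hypothetical (word, count) pairs in the same shape as sort_data.
sample_pairs = [("经济", 6), ("发展", 3), ("市场", 1)]
print(slice_shares(sample_pairs))
```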

Tags: Data Visualization, Web Scraping, pyecharts, jieba, text analysis, word cloud
