Word Frequency Statistics and Word Cloud Generation with Python
This tutorial explains how to count word frequencies using Python's collections.Counter from static, random, or file‑based text sources and then visualize the results with the WordCloud library, covering code examples, parameter settings, and available colormaps.
This article demonstrates how to perform word‑frequency statistics in Python using the collections.Counter class and how to visualize the frequencies with the WordCloud library.
Three methods for obtaining text are presented: (1) a predefined list of words, (2) randomly generated characters, and (3) reading a text file. Short code examples for the first two follow.
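For the third method, reading a text file, a minimal sketch might look like the following. The function name count_words_in_file, the UTF-8 default, and the \w+ tokenizing rule are assumptions made for illustration, not code from the original article.

```python
import re
from collections import Counter


def count_words_in_file(path, encoding="utf-8"):
    """Return a Counter of the lowercase words found in the text file at `path`."""
    with open(path, encoding=encoding) as handle:
        text = handle.read()
    # \w+ matches runs of letters, digits, and underscores, so punctuation is dropped.
    words = re.findall(r"\w+", text.lower())
    return Counter(words)
```

Calling count_words_in_file("sample.txt").most_common(10) would then list the ten most frequent words, where sample.txt is a hypothetical file name. Text written without spaces between words, such as Chinese, needs a separate word-segmentation step before counting.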
Example of counting frequencies in a fixed list:
from collections import Counter
# Tally each word as it is encountered
cnt = Counter()
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1
# List (word, count) pairs from most to least common
print(cnt.most_common())
Example of generating a random character list and counting its frequencies:
import collections
import random
import string
# ascii_letters contains both lowercase and uppercase letters
str1 = string.ascii_letters
# Draw 100 random letters
strlist = [random.choice(str1) for _ in range(100)]
strcount = collections.Counter(strlist)
# Print the 10 most common letters with their counts
for key, value in strcount.most_common(10):
    print(key, value)
The article then introduces the WordCloud library for creating word‑cloud images and refers to an external blog for further details. It shows several example images and explains how to adjust parameters such as font_path, which selects the font file used for rendering (e.g., msyh.ttf, msyhbd.ttf, simsun.ttc, simhei.ttf), and colormap, which controls the color scheme of the cloud.
Common colormap options are listed with brief descriptions. The colormap value is a Matplotlib colormap name: for example, autumn transitions from red through orange to yellow, cool moves from cyan to magenta, and jet runs from blue through green, yellow, and orange to red. Others include gray, hot, pink, and prism.
These settings allow users to customize the appearance of the generated word cloud to match their visual preferences or branding requirements.
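As a rough illustration of how the word counts and these parameters fit together, the sketch below passes a frequency mapping to WordCloud with font_path and colormap set. The helper names build_frequencies and render_cloud, the image size, and the white background are assumptions chosen for the example, and the third-party wordcloud package must be installed for render_cloud to run.

```python
from collections import Counter


def build_frequencies(words):
    """Map each word to its count, the kind of mapping generate_from_frequencies accepts."""
    return dict(Counter(words))


def render_cloud(frequencies, out_path, font_path=None, colormap="autumn"):
    """Draw a word cloud from a {word: count} mapping and save it as an image file."""
    # Imported here so the counting helper still works without the third-party package.
    from wordcloud import WordCloud

    cloud = WordCloud(
        font_path=font_path,  # e.g. "simhei.ttf" for Chinese text; None uses the default font
        colormap=colormap,    # a Matplotlib colormap name such as "autumn", "cool", or "jet"
        background_color="white",
        width=800,
        height=400,
    )
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
    return out_path
```

For instance, render_cloud(build_frequencies(['red', 'blue', 'red']), "cloud.png", colormap="cool") would write the image to cloud.png, a hypothetical output path. Switching the colormap argument is enough to compare the color schemes described above.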
Python Programming Learning Circle