How to Scrape and Visualize 20 Years of TIOBE Language Rankings with Python
This tutorial walks you through using Python to crawl the TIOBE index from 2001 to the present, clean the data into a nested dictionary, and create dynamic visualizations that reveal how programming language popularity has shifted over two decades.
In the world of programming languages, debates about the "best" language never cease, but the TIOBE Programming Language Index provides a data‑driven way to track language popularity over time.
This article scrapes the TIOBE rankings from May 2001 onward, cleans the data, and animates it to show how language popularity has changed over nearly twenty years.
01. Data Acquisition
We start by opening the TIOBE index page at https://www.tiobe.com/tiobe-index/. The required information is embedded directly in the page's source code, so it can be read straight from the downloaded HTML.
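As a rough sketch of the download step, the snippet below fetches the page with the standard library. The helper names `build_request` and `fetch_html`, the User-Agent string, and the timeout are illustrative assumptions, not taken from the original article.

```python
import urllib.request

TIOBE_URL = "https://www.tiobe.com/tiobe-index/"


def build_request(url: str = TIOBE_URL) -> urllib.request.Request:
    """Build a GET request with a browser-like User-Agent.

    Assumption: some sites reject urllib's default agent string.
    """
    headers = {"User-Agent": "Mozilla/5.0 (compatible; tiobe-tutorial)"}
    return urllib.request.Request(url, headers=headers)


def fetch_html(url: str = TIOBE_URL, timeout: float = 30.0) -> str:
    """Download the page and return its HTML as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch_html()` returns the page source as one string, which the cleaning step can then search.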
02. Data Cleaning
After fetching the HTML, we use regular expressions to extract language names and their popularity scores for each month, then store the results in a nested dictionary such as:
{"2020-01-12": {"Java": 16, "C++": 14, "Python": 10, ...}, "2020-02-13": {"Java": 16.3, "C++": 15.6, ...}, ...}
03. Setting Bar Colors
To distinguish each language in the bar chart, we assign a unique color to every bar and sort the nested dictionary chronologically.
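The cleaning step from section 02 and the color assignment described here can be sketched together. This is a minimal example under one loud assumption: that the page embeds its chart data as JavaScript series of the form `{name : 'Python', data : [[Date.UTC(2001, 5, 30), 1.25], ...]}`. If the real markup differs, the two regular expressions need adjusting. The function names and the hue-based palette are hypothetical choices.

```python
import colorsys
import re

# Assumed shape of the embedded chart data (not confirmed by this article):
#   {name : 'Python', data : [[Date.UTC(2001, 5, 30), 1.25], ...]}
SERIES_RE = re.compile(
    r"\{\s*name\s*:\s*'([^']+)'\s*,\s*data\s*:\s*\[(.*?)\]\s*\}", re.S
)
POINT_RE = re.compile(
    r"\[\s*Date\.UTC\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*,\s*([\d.]+)\s*\]"
)


def parse_series(html):
    """Return {date: {language: score}} with dates in chronological order."""
    table = {}
    for name, points in SERIES_RE.findall(html):
        for year, month, day, score in POINT_RE.findall(points):
            # JavaScript's Date.UTC counts months from 0, so add 1.
            key = f"{int(year):04d}-{int(month) + 1:02d}-{int(day):02d}"
            table.setdefault(key, {})[name] = float(score)
    # Zero-padded ISO-style keys sort chronologically as plain strings.
    return dict(sorted(table.items()))


def assign_colors(table):
    """Give every language its own RGB color by spacing hues evenly."""
    languages = sorted({lang for scores in table.values() for lang in scores})
    count = max(len(languages), 1)
    return {
        lang: colorsys.hsv_to_rgb(index / count, 0.65, 0.9)
        for index, lang in enumerate(languages)
    }
```

The colors are `(r, g, b)` tuples with components between 0 and 1, a form that plotting libraries such as matplotlib accept directly.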
04. Final Visualization
We clear the figure’s active axes, sort languages by their monthly scores, and map each language to its assigned color before rendering the chart.
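One way to draw a single month is sketched below. It assumes `ax` is a matplotlib `Axes` object (for example from `fig, ax = plt.subplots()`), that `scores` is one inner dictionary of the nested structure, and that `colors` maps each language to a color. The function names and the `top_n` cutoff are illustrative, not the original author's code.

```python
def top_languages(scores, top_n=10):
    """Return the top_n (language, score) pairs, lowest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # barh draws the first item at the bottom, so reverse to put the
    # highest-scoring language at the top of the chart.
    return ranked[:top_n][::-1]


def draw_frame(ax, date, scores, colors, top_n=10):
    """Clear the axes and draw one month's ranking as horizontal bars."""
    ax.clear()
    ranked = top_languages(scores, top_n)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]
    # Look up each bar's color so a language keeps its color across frames.
    ax.barh(names, values, color=[colors[name] for name in names])
    ax.set_xlabel("TIOBE rating (%)")
    ax.set_title(f"TIOBE index, {date}")
```

Passing the axes in explicitly, rather than relying on the current figure, keeps the function easy to reuse inside an animation loop.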
05. Dynamic Display
The chart is redrawn in a loop with a short pause between frames, producing an animated bar chart that advances month by month.
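The loop itself can be as small as the sketch below. It assumes `table` is the nested dictionary keyed by date string and that `draw` is any callable that renders one month. By default it pauses with matplotlib's `plt.pause`; the `animate` name, the `delay` value, and the injectable `pause` parameter are assumptions made for illustration.

```python
def animate(table, draw, delay=0.3, pause=None):
    """Call draw(date, scores) for each month in order, pausing in between."""
    if pause is None:
        # Imported lazily so the loop can be exercised without a display.
        import matplotlib.pyplot as plt

        pause = plt.pause
    for date in sorted(table):
        draw(date, table[date])
        # plt.pause also redraws the figure, which is what makes the
        # bars appear to move from one month to the next.
        pause(delay)
```

A smaller `delay` speeds the animation up; with roughly twenty years of monthly data, 0.3 seconds per frame runs for a little over a minute.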
06. More Fancy Animations
If a more polished look is desired, the same data can be rendered with JavaScript or with a tool such as Flourish.
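External tools usually want the data as a file rather than a Python dictionary. The sketch below writes the nested dictionary as a wide CSV with one row per language and one column per date. That layout is an assumption about what a bar chart race template expects, so check the chosen tool's documentation; the function name and header label are also illustrative.

```python
import csv


def write_wide_csv(table, out):
    """Write {date: {language: score}} as one row per language."""
    dates = sorted(table)
    languages = sorted({lang for scores in table.values() for lang in scores})
    writer = csv.writer(out)
    writer.writerow(["Language", *dates])
    for lang in languages:
        # Leave the cell empty for months in which a language has no score.
        writer.writerow([lang, *(table[date].get(lang, "") for date in dates)])
```

To save a file, open it with `newline=""` as the csv module recommends, for example `with open("tiobe_wide.csv", "w", newline="") as f: write_wide_csv(table, f)`.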
Conclusion
The animated visualizations reveal that Java and C have consistently held the top two spots, while Python, driven by the AI boom, surged from the bottom to third place, widening the gap with the fourth‑ranked language. Conversely, C++ and PHP have seen declining popularity. Despite fluctuations, every language retains value, and mastering any of them can be beneficial.
