How to Count Teachers by Country Using pandas merge and join in Python
This article walks through a Python data‑analysis task where a user wants to count how many teachers come from each country, demonstrating two solutions with pandas—using the merge() function and the join() method—complete with code snippets and step‑by‑step explanations.
Introduction
A follower asked how to determine the number of teachers from each country using Python for data analysis. The goal is to count teachers by their associated country, such as how many are from the United States.
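The initial idea was to simply count the occurrences of each country, but the teacher table records only each teacher's school, not the country, so the two tables have to be combined first.
The sample data consists of two tables: one mapping each school to its country, and one listing each teacher with their school.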
Implementation
Method 1: merge() function
The first solution uses pandas.merge() to join the teacher table with the school‑country table.
import pandas as pd
# Column names are in Chinese: "学校" = school, "国家" = country, "老师" = teacher
data1 = {"学校": ["哈佛", "MIT", "清华", "早稻田"], "国家": ["美国", "美国", "中国", "日本"]}
data2 = {"学校": ["哈佛", "MIT", "MIT", "清华", "清华", "早稻田"], "老师": ["John", "Mike", "Jason", "李明", "韩磊", "武田康福"]}
data1 = pd.DataFrame(data1)
data2 = pd.DataFrame(data2)
print(data1)
print(data2)
# Count teachers per country
result = data2.merge(data1, how='left').value_counts('国家')
print(result)
# Full merged table
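print(data2.merge(data1, how='left'))
This code merges the two DataFrames on the "学校" (school) column, which merge() picks automatically because it is the only column the two tables share, and then counts the occurrences of each "国家" (country) value, satisfying the original request.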
When the follower later asked for the raw merged table without the counts, the fix was simply to drop the value_counts() call, as in the last line of the code above.
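The merge-then-count step can also be written with the join key spelled out. The following is a minimal sketch rather than the author's code: the variable names schools and teachers, the explicit on= and validate= arguments, and the groupby() count are illustrative choices that are not in the original.

```python
import pandas as pd

# Same sample data as above; "学校" = school, "国家" = country, "老师" = teacher.
# The names `schools` and `teachers` are illustrative, not from the original.
schools = pd.DataFrame({
    "学校": ["哈佛", "MIT", "清华", "早稻田"],
    "国家": ["美国", "美国", "中国", "日本"],
})
teachers = pd.DataFrame({
    "学校": ["哈佛", "MIT", "MIT", "清华", "清华", "早稻田"],
    "老师": ["John", "Mike", "Jason", "李明", "韩磊", "武田康福"],
})

# Name the key explicitly instead of relying on the shared-column default.
# validate="many_to_one" makes pandas raise an error if a school is listed
# more than once in `schools`, which would silently duplicate teachers.
merged = teachers.merge(schools, on="学校", how="left", validate="many_to_one")

# Count teachers per country; equivalent to value_counts on the "国家" column,
# except that the result is ordered by country label rather than by count.
counts = merged.groupby("国家")["老师"].count()
print(counts)
```

Naming the key keeps the merge from changing behavior if another shared column is added to either table later.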
Method 2: join() method
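The second solution uses the DataFrame.join() method, which by default matches rows against the index of the other DataFrame rather than a shared column. The school column therefore has to serve as the index of the school‑country table (or be named through the on parameter) before joining.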
Because join() aligns on the index, it produces the same merged table as merge() once the school‑country table is indexed by school, fulfilling the follower’s requirement.
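The original join() code was shown only as an image and is not reproduced here, so the following is a sketch of one way to do it and may differ from the author's version. It assumes the same data1 and data2 tables as in Method 1; the names joined and counts are illustrative.

```python
import pandas as pd

# Same sample data as in Method 1
data1 = pd.DataFrame({
    "学校": ["哈佛", "MIT", "清华", "早稻田"],
    "国家": ["美国", "美国", "中国", "日本"],
})
data2 = pd.DataFrame({
    "学校": ["哈佛", "MIT", "MIT", "清华", "清华", "早稻田"],
    "老师": ["John", "Mike", "Jason", "李明", "韩磊", "武田康福"],
})

# join() matches the caller's `on` column against the other frame's index,
# so the school-country table is indexed by school first.
# The default how="left" keeps every teacher row.
joined = data2.join(data1.set_index("学校"), on="学校")
print(joined)

# Counting teachers per country works the same way as after merge()
counts = joined["国家"].value_counts()
print(counts)
```

An alternative under the same assumptions is to call set_index("学校") on both tables and join them without the on argument, which leaves the school as the index of the result.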
Conclusion
This article showed how to count teachers per country by first combining the teacher table with the school‑country table, using either merge() or join() in pandas, and then counting the rows for each country.
Python Crawling & Data Mining
