How to Group Connected People Using Pandas and NetworkX in Python
An experienced Python user demonstrates how to group related individuals into connected components using pandas for data manipulation and networkx for graph analysis, providing complete code examples, visualizations, and step-by-step explanations to help readers solve similar connectivity problems.
1. Introduction
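Hello, I'm Pi Pi. A group member asked how to use ChatGPT to solve a data analysis problem: given a list of sender-receiver pairs, put everyone who is directly or indirectly connected into the same group.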
2. Implementation
The first solution uses pandas to assign a group number to each person based on connections. The following code demonstrates the process:
import pandas as pd

data = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃']
]
df = pd.DataFrame(data, columns=['发起', '接收'])

# Create an empty dictionary to store the name-to-group mapping
groups = {}

# Iterate over each row of the DataFrame
for _, row in df.iterrows():
    sender = row['发起']
    receiver = row['接收']
    # If the sender is not yet in the mapping, assign a new group
    if sender not in groups:
        group = max(groups.values()) + 1 if groups else 1
        groups[sender] = group
    # If the receiver is not yet in the mapping, assign the same group as the sender
    if receiver not in groups:
        group = groups[sender]
        groups[receiver] = group

# Add the group column to the DataFrame
df['组别'] = df['发起'].map(groups)
print(df)

# Output the groups as a dictionary
result = {}
for k, v in groups.items():
    if v not in result:
        result[v] = k
    else:
        result[v] += "," + k
print(result)

Running the script produces a DataFrame with a new "组别" (group) column and a dictionary that maps each group number to a comma-separated string of its members.
The loop relies on row order: a sender seen for the first time always starts a new group, even if its receiver already belongs to one, and two existing groups are never merged. It gives the right answer for this data, but not for arbitrary input.
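For input where a later row links two groups that were numbered separately, a disjoint-set (union-find) structure is one common alternative. The following is a minimal sketch in plain Python, not the method from the original discussion; the function name `group_connected` and the variable `pairs` are hypothetical names chosen for illustration.

```python
def group_connected(pairs):
    """Return a dict mapping each name to a group number.

    Names that are directly or indirectly connected share a number.
    """
    parent = {}

    def find(name):
        # Register unseen names as their own root, then walk up to the root,
        # shortening the path along the way.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for sender, receiver in pairs:
        root_s, root_r = find(sender), find(receiver)
        if root_s != root_r:
            parent[root_r] = root_s  # merge the receiver's group into the sender's

    # Number the groups 1, 2, 3, ... in order of first appearance.
    group_ids = {}
    groups = {}
    for name in list(parent):
        root = find(name)
        groups[name] = group_ids.setdefault(root, len(group_ids) + 1)
    return groups


pairs = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃'],
]
groups = group_connected(pairs)
print(groups)
```

The returned dictionary can be passed to `df['发起'].map(...)` in the same way as the `groups` dictionary in the pandas code.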
Another approach leverages the networkx library to find connected components directly:
import networkx as nx

g = nx.Graph()
data = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃']
]
g.add_edges_from(data)

for sub_g in nx.connected_components(g):
    g_node = g.subgraph(sub_g).nodes()
    print(g_node)

The output lists the nodes of each connected component of the graph.
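The two approaches can be combined: the components found by networkx can be numbered and mapped back onto the DataFrame as a "组别" column. This is a hedged sketch rather than part of the original answer; the dictionary name `name_to_group` is a hypothetical name, and the sketch assumes pandas and networkx are both installed.

```python
import networkx as nx
import pandas as pd

data = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃'],
]
df = pd.DataFrame(data, columns=['发起', '接收'])

# Build an undirected graph with one edge per row.
g = nx.Graph()
g.add_edges_from(data)

# Give every connected component a number, starting from 1.
name_to_group = {}
for group_id, component in enumerate(nx.connected_components(g), start=1):
    for name in component:
        name_to_group[name] = group_id

# Both ends of an edge are in the same component, so mapping the sender is enough.
df['组别'] = df['发起'].map(name_to_group)
print(df)
```

Because connected components are computed on the whole graph, the result does not depend on the order of the rows.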
NetworkX can also draw the graph:
from matplotlib import pyplot as plt
import networkx as nx

# Use a font that contains Chinese glyphs so the node labels render correctly
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

g = nx.Graph()
data = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃']
]
g.add_edges_from(data)
nx.draw_networkx(g)
plt.show()  # needed to display the figure when run as a script

The resulting plot visualizes the relationships among the individuals.
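To make the groups easier to see in the plot, each node can be colored by the component it belongs to. The sketch below is an illustration under several assumptions that are not in the original: it renders with the non-interactive `Agg` backend and saves to a hypothetical file name `groups.png`, it fixes the layout with `seed=42`, and it uses the `Set2` colormap. If the SimHei font is not installed, matplotlib falls back to another font and the Chinese labels may not render.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: render to a file without opening a window
from matplotlib import pyplot as plt
import networkx as nx

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

data = [
    ['刘备', '关羽'], ['刘备', '张飞'],
    ['曹操', '夏侯'], ['张飞', '诸葛'],
    ['夏侯', '荀彧'], ['孙权', '鲁肃'],
]
g = nx.Graph()
g.add_edges_from(data)

# Map every node to the index of its connected component.
component_of = {}
for index, component in enumerate(nx.connected_components(g)):
    for name in component:
        component_of[name] = index

# One numeric value per node, in the same order as g.nodes().
node_colors = [component_of[name] for name in g.nodes()]

pos = nx.spring_layout(g, seed=42)  # fixed seed so the layout is repeatable
nx.draw_networkx(g, pos=pos, node_color=node_colors, cmap=plt.cm.Set2)
plt.axis('off')
plt.savefig('groups.png', dpi=150)  # hypothetical output file name
```

For interactive use, the `matplotlib.use('Agg')` line can be dropped and `plt.savefig(...)` replaced with `plt.show()`.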
3. Summary
This article showcases a typical graph connectivity problem in Python, offering both pandas and networkx solutions with complete code and visual results.
Python Crawling & Data Mining