Python Implementation of DBSCAN and KMeans for Point Cloud Clustering and Tracking with Hungarian Matching

This article presents a Python project that reads point‑cloud data from CSV files, applies DBSCAN and KMeans clustering, extracts cluster features, and uses the Hungarian algorithm to match clusters across frames for tracking, complete with full source code and result visualization.

Python Programming Learning Circle

May 5, 2024

The project processes point‑cloud data collected by a sensor, where each frame contains a variable number of (x, y, z) points. It groups the points with unsupervised clustering (DBSCAN and KMeans) and then tracks the resulting clusters over time using Hungarian matching.

Data is stored in a CSV file with the columns Frame #, X, Y, and Z. Each row is a single point belonging to the frame given in the first column, so a frame spans as many rows as it has points.
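As a minimal sketch of that layout, the snippet below parses a few rows with pandas and pulls out the points of one frame. The sample rows are invented purely for illustration (they are not taken from the project's data), and the variable names are hypothetical; only the column names come from the description above.

```python
import io

import pandas as pd

# Hypothetical sample rows, invented for illustration only.
# The column names match those described in the article.
sample_csv = io.StringIO(
    "Frame #,X,Y,Z\n"
    "1,0.10,0.20,0.30\n"
    "1,0.12,0.21,0.29\n"
    "2,1.00,1.10,0.90\n"
    "2,1.05,1.08,0.95\n"
    "2,4.00,-2.00,0.50\n"
)
data = pd.read_csv(sample_csv)

# Number of points recorded in each frame (it varies from frame to frame).
points_per_frame = data.groupby("Frame #").size()

# All points of frame 2 as an (N, 3) NumPy array.
frame_2 = data.loc[data["Frame #"] == 2, ["X", "Y", "Z"]].to_numpy()

print(points_per_frame)
print(frame_2.shape)
```

With a real file, `io.StringIO(...)` would be replaced by the path to the CSV.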

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

def adaption_frame(data, frame_start, frame_end, num_threshold=1000):
    # Accumulate the points of frames frame_start .. frame_end - 1, stopping
    # early once more than num_threshold points have been collected.
    data_x = np.empty(0)
    data_y = np.empty(0)
    data_z = np.empty(0)
    for target_frame in range(frame_start, frame_end):
        table_data = data[data['Frame #'] == target_frame]
        data_x = np.concatenate((data_x, table_data['X'].values))
        data_y = np.concatenate((data_y, table_data['Y'].values))
        data_z = np.concatenate((data_z, table_data['Z'].values))
        if data_x.shape[0] > num_threshold:
            break
    return data_x, data_y, data_z

def valid_data(data_x, data_y, data_z):
    # Keep only points whose X, Y and Z coordinates all lie within [-5, 5].
    condition = (data_x >= -5) & (data_x <= 5) & (data_y >= -5) & (data_y <= 5) & (data_z >= -5) & (data_z <= 5)
    return data_x[condition], data_y[condition], data_z[condition]

def draw_data_origin(data_x, data_y, data_z):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(data_x, data_y, data_z, s=0.1)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    # The points passed in may span several frames, so the title is generic.
    ax.set_title('Point Cloud')
    plt.show()

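def dbscan(data_x, data_y, data_z):
    # Stack the coordinates into an (N, 3) array and standardize each axis,
    # so eps=0.3 is measured in standardized units, not raw sensor units.
    data_input = np.column_stack((data_x, data_y, data_z))
    data_scaled = StandardScaler().fit_transform(data_input)
    model = DBSCAN(eps=0.3, min_samples=5)
    labels = model.fit_predict(data_scaled)
    # DBSCAN labels noise points as -1; do not count noise as a cluster.
    num_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    return num_clusters, labels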

from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

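def cluster_kmeans(value, data_x, data_y, data_z):
    # value is the number of clusters; points is an (N, 3) array.
    points = np.column_stack((data_x, data_y, data_z))
    # n_init is set explicitly because its default differs between scikit-learn versions.
    kmeans = KMeans(n_clusters=value, n_init=10, random_state=0).fit(points)
    return kmeans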

def extract_feature(K, labels_order, data_x, data_y, data_z):
    # labels_order[i] is used as an index array selecting the points of
    # cluster i (how it is built is not shown in this article).
    features = []
    for i in range(K):
        x_mean = np.mean(data_x[labels_order[i]])
        y_mean = np.mean(data_y[labels_order[i]])
        z_mean = np.mean(data_z[labels_order[i]])
        cluster_mean = np.hstack((x_mean, y_mean, z_mean))  # centroid (x, y, z)
        cluster_points_size = labels_order[i].shape[0]      # number of points
        features.append([cluster_mean, cluster_points_size, i])
    return features

def hungarian_match(features_last, features_now):
    # Each feature is [cluster_mean, cluster_points_size, cluster_index].
    centers_last = np.array([f[0] for f in features_last])
    counts_last = np.array([f[1] for f in features_last])
    centers_now = np.array([f[0] for f in features_now])
    counts_now = np.array([f[1] for f in features_now])
    # Pairwise Euclidean distances between previous and current centroids.
    distance_matrix = cdist(centers_last, centers_now)
    # Cost = absolute difference in point counts + 10 x centroid distance.
    cost_matrix = np.abs(counts_last[:, np.newaxis] - counts_now) + distance_matrix * 10
    # Minimum-cost one-to-one assignment (Hungarian matching).
    row_ind, col_ind = linear_sum_assignment(cost_matrix)
    matches = [(features_last[r], features_now[c]) for r, c in zip(row_ind, col_ind)]
    return matches
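
To make the cost used in hungarian_match concrete, here is a small worked example. The centroids and point counts are hypothetical values chosen for illustration: two clusters move slightly between frames and appear in the opposite order in the second frame. The weight of 10 on the distance term is the one used in the function above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Hypothetical centroids and point counts for two clusters in the previous frame.
centers_last = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 0.0]])
counts_last = np.array([100, 40])

# The same two clusters in the current frame, slightly moved and listed
# in the opposite order.
centers_now = np.array([[2.1, 2.0, 0.0], [0.1, 0.0, 0.0]])
counts_now = np.array([42, 95])

# Pairwise Euclidean distances, shape (2, 2): rows = previous, columns = current.
distance_matrix = cdist(centers_last, centers_now)

# Absolute difference in point counts, also shape (2, 2) via broadcasting.
size_diff = np.abs(counts_last[:, np.newaxis] - counts_now)

# Combined cost, with the distance term weighted by 10 as in hungarian_match.
cost_matrix = size_diff + 10 * distance_matrix

row_ind, col_ind = linear_sum_assignment(cost_matrix)
print(cost_matrix.round(2))
print(list(zip(row_ind.tolist(), col_ind.tolist())))
```

Here the cross pairing costs 6 + 3 = 9, while pairing the clusters in their listed order would cost roughly 87 + 82.6, so the assignment is [(0, 1), (1, 0)]. Because the count difference and the distance are on different scales, the weight on the distance term is a tuning choice that depends on the data.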

The main script reads the CSV and loads frames iteratively. For each batch of frames it filters out points outside the valid range, runs DBSCAN to obtain an initial cluster count, and then switches to KMeans with a fixed number of clusters. From each cluster it extracts the mean position and the point count, and it applies the Hungarian algorithm to associate clusters between consecutive frames. Finally, it visualizes the tracked clusters in a 3D plot.
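
The main script itself is not reproduced here, so the following is only a sketch of how such a loop might be wired together, not the author's code. It runs on synthetic data instead of the CSV, and the helper names (make_frame, estimate_k, cluster_features, match) are hypothetical. It also assumes that the cluster count found by DBSCAN on the first frame is reused as the fixed K for KMeans.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)


def make_frame(shift):
    """Synthetic stand-in for one batch of sensor points: two blobs."""
    blob_a = rng.normal([-2.0, -2.0, -1.0], 0.1, size=(120, 3)) + shift
    blob_b = rng.normal([2.0, 2.0, 1.0], 0.1, size=(60, 3)) + shift
    return np.vstack((blob_a, blob_b))


def estimate_k(points, eps=0.3, min_samples=5):
    """Count DBSCAN clusters on standardized points, ignoring noise (-1)."""
    scaled = StandardScaler().fit_transform(points)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scaled)
    return len(set(labels)) - (1 if -1 in labels else 0)


def cluster_features(points, k):
    """Run KMeans and return (centroids, point counts) for the k clusters."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    centers = np.array([points[labels == i].mean(axis=0) for i in range(k)])
    sizes = np.array([(labels == i).sum() for i in range(k)])
    return centers, sizes


def match(prev, now, weight=10.0):
    """Hungarian matching on count difference plus weighted centroid distance."""
    cost = np.abs(prev[1][:, np.newaxis] - now[1]) + weight * cdist(prev[0], now[0])
    return linear_sum_assignment(cost)


# Three synthetic frames in which both blobs drift slowly along X.
frames = [make_frame(np.array([0.05 * t, 0.0, 0.0])) for t in range(3)]

k = estimate_k(frames[0])             # initial cluster count from DBSCAN
prev = cluster_features(frames[0], k)

history = []  # (frame, previous index, current index, previous size, current size)
for t, pts in enumerate(frames[1:], start=1):
    now = cluster_features(pts, k)
    rows, cols = match(prev, now)
    for r, c in zip(rows, cols):
        history.append((t, int(r), int(c), int(prev[1][r]), int(now[1][c])))
        print(f"frame {t}: previous cluster {r} -> current cluster {c}")
    prev = now
```

Matching on centroids and sizes rather than on KMeans label numbers matters because KMeans gives no guarantee that the same physical cluster receives the same label in two different frames.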

Result images show the matched clusters (e.g., red and green) over time, illustrating how the algorithm aligns point‑cloud groups across frames for tracking purposes.
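
The original images are not included here. As a rough sketch of how such a figure could be drawn, the snippet below colors each matched pair of clusters the same way across two frames. The function name plot_matches, its arguments, and the demo points are all hypothetical, not part of the project's code.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_matches(matched_points, colors=("red", "green", "blue", "orange")):
    """Draw matched clusters; each pair (previous, current) shares one color.

    matched_points is a list of (points_last, points_now) tuples, where each
    element is an (N, 3) array of X, Y, Z coordinates.
    """
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    for idx, (pts_last, pts_now) in enumerate(matched_points):
        color = colors[idx % len(colors)]
        # Previous frame drawn faintly, current frame drawn solid.
        ax.scatter(*pts_last.T, s=2, color=color, alpha=0.3,
                   label=f"track {idx} (previous)")
        ax.scatter(*pts_now.T, s=2, color=color,
                   label=f"track {idx} (current)")
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    ax.set_zlabel("Z")
    ax.set_title("Matched clusters across two frames")
    ax.legend()
    return fig


# Hypothetical demo data: two clusters that each move a little between frames.
rng = np.random.default_rng(1)
cluster_a = rng.normal([-2.0, -2.0, 0.0], 0.1, size=(50, 3))
cluster_b = rng.normal([2.0, 2.0, 0.0], 0.1, size=(30, 3))
demo = [
    (cluster_a, cluster_a + np.array([0.3, 0.0, 0.0])),
    (cluster_b, cluster_b + np.array([0.0, 0.3, 0.0])),
]
fig = plot_matches(demo)
```

Calling `plt.show()` afterwards displays the figure in an interactive session.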
