DBSCAN

Python Implementation

Ray Hsu
Geek Culture

--

Intro

DBSCAN is a machine-learning clustering algorithm. It is also a great algorithm to practice and enhance our programming skills.

This article mainly focuses on Python implementation. Please jump to the last paragraph if you only need the code.

DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. — wiki DBSCAN

It is a density-based clustering non-parametric algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). DBSCAN is one of the most common, and most commonly cited, clustering algorithms. — wiki DBSCAN

DBSCAN can cluster density base data. Take the following picture, for example.

If we use the Kmean algorithm to cluster this plot, The result can be like this.

--

--