Mean-Shift Clustering#

Mean-shift clustering is a powerful and flexible non-parametric clustering algorithm that has found applications in various domains such as computer vision, image processing, and bioinformatics. The algorithm was introduced by Funkhouser and Belongie in 2003 and has since become popular due to its ability to find clusters in an unsupervised manner.

How Does Mean-Shift Clustering Work?#

Mean-shift clustering operates by shifting the mean of a set of points in the feature space until it converges to a dense region, called a mode. Each mode found by the algorithm is considered to be a cluster. The algorithm is based on the idea of kernel density estimation, which is a way of estimating the probability density function of a data set.

The first step of the mean-shift algorithm is to define a kernel function, which is used to compute the probability density of the data points. A common choice for the kernel function is a Gaussian function, although other types of kernels can be used as well. The next step is to initialize the mean of the data points and start the iterative process of shifting the mean towards the dense regions in the data set.

At each iteration, the algorithm computes the probability density of the data points given the current mean. Then, the mean is shifted to the weighted average of the data points, where the weights are determined by the probability density. This process is repeated until the mean converges or a maximum number of iterations is reached. The final mean is assigned to a cluster, and the data points that are closest to the mean are assigned to that cluster as well.

Once all the clusters have been found, the algorithm can be used to classify new data points by assigning them to the nearest cluster. The mean-shift algorithm is particularly useful when the data set contains clusters of different shapes and sizes, as it is able to find clusters of arbitrary shapes and sizes. Additionally, the mean-shift algorithm does not require the number of clusters to be specified in advance, making it a flexible and powerful tool for unsupervised machine learning.


One of the main advantages of the mean-shift algorithm is that it does not require the number of clusters to be specified beforehand, unlike other algorithms such as K-Means. This makes it an attractive option when the number of clusters is unknown or difficult to determine. Another advantage is that the mean-shift algorithm can handle non-linearly separated data, which is not the case for algorithms such as K-Means.


However, one disadvantage of the mean-shift algorithm is that it can be computationally expensive compared to other clustering algorithms. This is because the algorithm requires multiple iterations to converge to the modes. Additionally, the algorithm is sensitive to the choice of the bandwidth parameter, which controls the size of the neighborhood considered for mean shifting. A smaller bandwidth may result in over-segmentation, while a larger bandwidth may result in under-segmentation.

Example Code#

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift
from sklearn.datasets import make_blobs

# Generating sample data for clustering
X, y = make_blobs(n_samples=200, centers=3, random_state=0)

# Fitting the data to the Mean-Shift algorithm
ms = MeanShift()
labels = ms.labels_
cluster_centers = ms.cluster_centers_

# Plotting the results
colors = ['r', 'g', 'b']
for i in range(len(np.unique(labels))):
    plt.scatter(X[labels == i, 0], X[labels == i, 1], s=30, c=colors[i], label='Cluster ' + str(i))
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], s=100, c='black', label='Centroids')


In conclusion, mean-shift clustering is a valuable tool for clustering in situations where the number of clusters is unknown or the data is non-linearly separated. The algorithm is flexible and can handle complex data distributions. However, it is computationally expensive and requires careful tuning of the bandwidth parameter to obtain good results.