Unsupervised Image Classification using KMeans Classification in QGIS


Unsupervised classification is driven entirely by software analysis. The computer determines which pixels are related and groups them into classes, without any training samples supplied by the user. In this post, we perform unsupervised image classification using KMeansClassification in QGIS.

Before doing unsupervised image classification it is very important to learn and understand the K-Means clustering algorithm.

Introduction to K-Means Clustering

The first property of a cluster is that the points within it should be similar to each other.

K-Means Clustering is an algorithm that tries to minimize the distance between the points in a cluster and their centroid. It is a simple yet powerful algorithm in data science.

The main objective of the K-Means algorithm is to minimize the sum of squared distances between the points and their respective cluster centroids.
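To make the objective concrete, here is a minimal sketch in plain Python. K-means is usually described as minimizing the sum of squared Euclidean distances between each point and the centroid of its cluster. The function name and the toy data are made up for illustration.

```python
from math import dist


def within_cluster_sum_of_squares(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))


# Hypothetical toy data: four 2-D points split into two clusters.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]                      # cluster index of each point
centroids = [(0.0, 1.0), (10.0, 1.0)]      # one centroid per cluster

print(within_cluster_sum_of_squares(points, labels, centroids))
```

A lower value means the points sit closer to their centroids. Assigning the same points to the wrong centroids would make this number much larger.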

Let’s walk through a basic example to see how the K-Means clustering algorithm works:

We have 8 points and we want to apply k-means to create clusters for these points. Let’s see how we can do it.

Step 1: Pick the number of clusters k

The first step in k-means is to choose the number of clusters, k.

Step 2: Select k random points from the data as centroids

Next, we randomly select the centroid for each cluster. Let’s say we want to have 2 clusters, so k is equal to 2 here. We then randomly select the centroid:

Here, the red and green circles represent the centroid for these clusters.
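Step 2 can be sketched as follows, assuming the centroids are drawn from the data points themselves. The eight coordinates and the function name are hypothetical; the seed is only there to make the draw repeatable.

```python
import random


def pick_initial_centroids(points, k, seed=None):
    """Pick k distinct data points at random to serve as the starting centroids."""
    rng = random.Random(seed)
    return rng.sample(points, k)


# Eight hypothetical 2-D points, standing in for the points in the example.
points = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]

centroids = pick_initial_centroids(points, k=2, seed=42)
print(centroids)
```

Different seeds give different starting centroids, which is why two runs of k-means on the same data can end up with different clusters.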

Step 3: Assign all the points to the closest cluster centroid

Once we have initialized the centroids, we assign each point to the closest cluster centroid:

Here you'll see that the points which are closer to the red point are assigned to the red cluster whereas the points which are closer to the green point are assigned to the green cluster.
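The assignment step can be sketched like this, assuming Euclidean distance. The points and the two starting centroids are hypothetical values chosen so the result is easy to check by eye.

```python
from math import dist


def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
        for p in points
    ]


# Hypothetical data: four points near (1, 1) and four points near (9, 9).
points = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
centroids = [(1, 1), (9, 9)]   # index 0 plays the "red" centroid, index 1 the "green" one

labels = assign_to_nearest(points, centroids)
print(labels)
```

If a point is exactly the same distance from two centroids, this sketch assigns it to the one with the lower index.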

Step 4: Recompute the centroids of newly formed clusters

Now, once we have assigned all of the points to one cluster or the other, the next step is to compute the centroids of the newly formed clusters:

Here, the red and green crosses are the new centroids.
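Recomputing the centroids amounts to taking the mean of the points in each cluster. Here is a minimal sketch with hypothetical data; keeping the old centroid when a cluster ends up empty is an assumption of this sketch, not something the steps above prescribe.

```python
def recompute_centroids(points, labels, old_centroids):
    """Return the mean of each cluster; an empty cluster keeps its old centroid."""
    new_centroids = []
    for j, old in enumerate(old_centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            # Average each coordinate (x, y, ...) over the cluster members.
            new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


# Hypothetical data: the first four points belong to cluster 0, the rest to cluster 1.
points = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

new_centroids = recompute_centroids(points, labels, old_centroids=[(1, 1), (9, 9)])
print(new_centroids)
```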

Step 5: Repeat steps 3 and 4

We then repeat steps 3 and 4:

Computing the centroids and assigning all the points to a cluster based on their distance from the centroid makes up a single iteration. But wait, when should we stop this process? It can’t run till eternity, right?

Stopping Criteria for K-Means Clustering

There are essentially three stopping criteria that can be adopted to stop the K-means algorithm:

1. Centroids of newly formed clusters don't change

2. Points remain within the same cluster

3. Maximum number of iterations is reached

We can stop the algorithm if the centroids of newly formed clusters aren't changing. If we keep getting the same centroids for all the clusters even after multiple iterations, we can say that the algorithm isn't learning any new pattern, and that is a sign to stop the training.

Another clear sign that we should stop the training process is when the points remain in the same cluster even after training the algorithm for multiple iterations.

Finally, we can stop the training if the maximum number of iterations is reached. Suppose we have set the number of iterations to 100. The process will then repeat for at most 100 iterations before stopping.
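Putting the steps and the three stopping criteria together, a k-means loop might look like the sketch below. It is an illustration in plain Python, not the code any particular library runs. The function name, the `tol` threshold, the `init` argument (used here to make the example repeatable), and the data are all assumptions.

```python
import random
from math import dist


def kmeans(points, k, max_iter=100, tol=1e-9, seed=None, init=None):
    """Return (centroids, labels, iterations_run, stop_reason)."""
    rng = random.Random(seed)
    # Step 2: start from the given centroids, or from k random data points.
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for iteration in range(1, max_iter + 1):
        # Step 3: assign every point to its closest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Step 4: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep the old centroid
        labels_stable = new_labels == labels
        centroids_stable = all(dist(a, b) <= tol for a, b in zip(centroids, new_centroids))
        centroids, labels = new_centroids, new_labels
        # Criteria 1 and 2 usually trigger together; either one is enough to stop.
        if labels_stable:
            return centroids, labels, iteration, "points stayed in the same clusters"
        if centroids_stable:
            return centroids, labels, iteration, "centroids did not change"
    # Criterion 3: give up after max_iter iterations.
    return centroids, labels, max_iter, "maximum number of iterations reached"


# Hypothetical data: four points near (1, 1) and four points near (9, 9).
points = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
centroids, labels, iterations, reason = kmeans(points, k=2, init=[(1, 1), (9, 9)])
print(centroids, labels, iterations, reason)
```

With a random start (`init=None`), k-means can settle on a different and sometimes worse set of clusters, so it is common to run it several times and keep the best result.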

Unsupervised classification using KMeansClassification in QGIS

Now we will go through the steps for unsupervised classification in QGIS. Let's follow the steps.

  • Add a raster layer to the project: Layer >> Add Layer >> Add Raster Layer.

  • From the menu bar, open Processing >> Toolbox, then go to OTB >> Learning >> KMeansClassification.

  • Select the input image. Set the number of classes to 20 (the default is 5). Set the training size to 10000.

  • Set the output pixel type to uint8 (you can skip this).
  • For the output image, choose Save to File.

  • Click on Run (it will take a little time). Once it completes, a new output layer will be added to the layer list.

  • Close the dialog.
  • To change the symbology, open the Properties of the output image.

  • Under Symbology, select the render type Singleband Pseudocolor.

  • Scroll down and click on Classify. Set Mode to Equal Interval and Classes to 20.

  • Click Apply and then click on OK.
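To connect these steps back to the algorithm, the sketch below applies k-means to the pixels of a tiny made-up image in plain Python. It is not OTB's implementation and does not call QGIS. It assumes each pixel is a tuple of band values, and that the training size setting limits how many pixels are sampled to fit the centroids before every pixel is labelled. All names and values are hypothetical.

```python
import random
from math import dist


def nearest(pixel, centroids):
    """Index of the centroid closest to this pixel's band values."""
    return min(range(len(centroids)), key=lambda j: dist(pixel, centroids[j]))


def fit_centroids(samples, k, max_iter=100, seed=0):
    """Run k-means on the sampled pixels and return the final centroids."""
    centroids = random.Random(seed).sample(samples, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in samples]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(samples, labels) if lab == j]
            if members:
                updated.append(tuple(sum(band) / len(members) for band in zip(*members)))
            else:
                updated.append(centroids[j])  # empty cluster: keep the old centroid
        if updated == centroids:  # centroids stopped moving
            break
        centroids = updated
    return centroids


def classify_image(image, k, training_size, seed=0):
    """Fit centroids on a random sample of pixels, then label every pixel."""
    pixels = [px for row in image for px in row]
    samples = random.Random(seed).sample(pixels, min(training_size, len(pixels)))
    centroids = fit_centroids(samples, k, seed=seed)
    return [[nearest(px, centroids) for px in row] for row in image]


# Hypothetical 2 x 4 image with three bands: dark pixels on the left, bright on the right.
image = [
    [(10, 12, 9), (11, 13, 10), (200, 190, 180), (205, 195, 185)],
    [(12, 11, 8), (9, 14, 11), (198, 192, 178), (202, 188, 182)],
]

for row in classify_image(image, k=2, training_size=6):
    print(row)
```

The output is a grid of class numbers with the same shape as the image, which is the kind of single-band result the Singleband Pseudocolor renderer then colors class by class.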

That’s it.

We have now performed unsupervised classification using KMeansClassification in QGIS. If you face any problems with these steps, leave a comment below. Download links for the raster image and the output file are available here.


Raster Image > Area(Tufanganj)> Click Here

Output Image > Click Here
