K-Means Clustering Algorithm

Clustering is the task of dividing a set of data points into groups such that points in the same group are more similar to each other than to points in other groups.




Basically, K-Means clustering has 3 steps:
  • 1. Take the mean of each cluster (on the first pass, use given or randomly chosen starting means)
  • 2. Assign each data point to the cluster whose mean is nearest
  • 3. Repeat steps 1 and 2 until the means no longer change
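
The three steps above can be sketched in a few lines of Python. This is only a minimal sketch for one-dimensional data, not a definitive implementation: the function name `kmeans_1d`, the `max_iter` safety cap, the rule that a tie goes to the first cluster, and the choice to let an empty cluster keep its old mean are all assumptions made for illustration.

```python
def kmeans_1d(points, means, max_iter=100):
    """Cluster 1-D numbers around the given starting means (illustrative sketch)."""
    means = list(means)
    clusters = [[] for _ in means]
    for _ in range(max_iter):
        # Step 2: put each point in the cluster whose mean is nearest
        # (on a tie, min() picks the cluster with the lower index)
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        # Step 1 for the next pass: take the mean of each cluster
        # (assumption: an empty cluster keeps its old mean)
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        # Step 3: stop once the means no longer change
        if new_means == means:
            break
        means = new_means
    return clusters, means


# Tiny made-up example: two obvious groups
clusters, means = kmeans_1d([1, 2, 9, 10], [1, 10])
print(clusters)  # [[1, 2], [9, 10]]
print(means)     # [1.5, 9.5]
```

The starting means are passed in by the caller, which matches the "given or random" choice in step 1.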

Let's do some math.
The dataset:
data = {2, 3, 4, 10, 11, 12, 20, 25, 30}
and we want to make 2 clusters:
k = 2

1. Take the mean value:
The starting means may be given. If not, let's pick them at random:
m1 = 4 and m2 = 12

2. Find the nearest mean for each number and put it in that cluster:
so, from the dataset,

k1 = {2, 3, 4} and k2 = {10, 11, 12, 20, 25, 30}
new means:
m1 = (2+3+4)/3 = 3 and m2 = (10+11+12+20+25+30)/6 = 18
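
This first pass can be checked with a short Python sketch. The helper names `assign` and `take_means` are made up for illustration, and the code assumes one-dimensional data with no empty clusters.

```python
data = [2, 3, 4, 10, 11, 12, 20, 25, 30]

def assign(points, means):
    """Put each point in the cluster whose mean is nearest."""
    clusters = [[] for _ in means]
    for p in points:
        nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
        clusters[nearest].append(p)
    return clusters

def take_means(clusters):
    """Average each cluster (assumes no cluster is empty)."""
    return [sum(c) / len(c) for c in clusters]

k1, k2 = assign(data, [4, 12])   # starting means m1 = 4, m2 = 12
m1, m2 = take_means([k1, k2])
print(k1, k2)  # [2, 3, 4] [10, 11, 12, 20, 25, 30]
print(m1, m2)  # 3.0 18.0
```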

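Now assign again with the new means. 10 is closer to m1 = 3 (distance 7) than to m2 = 18 (distance 8), so it moves to k1:
k1 = {2, 3, 4, 10} and k2 = {11, 12, 20, 25, 30}
again the same process,
m1 = 19/4 = 4.75 ~ 5 and m2 = 98/5 = 19.6 ~ 20
(the rounding is only for convenience; the exact means give the same clusters here)
again, 11 and 12 are now closer to m1 than to m2:
k1 = {2, 3, 4, 10, 11, 12} and k2 = {20, 25, 30}
and the means are
m1 = 42/6 = 7 and m2 = 75/3 = 25
again,
k1 = {2, 3, 4, 10, 11, 12} and k2 = {20, 25, 30}
it's the same as the previous clustering,
so m1 and m2 are again 7 and 25.
Stop the process here.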

and the final clusters are,

k1 = {2, 3, 4, 10, 11, 12} and k2 = {20, 25, 30}
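
To double-check the whole walk-through, the Python sketch below repeats the two steps with the exact (unrounded) means and records the means after each pass. The variable names are illustrative, and sending ties to the first cluster is an assumption (no ties actually occur with this data).

```python
data = [2, 3, 4, 10, 11, 12, 20, 25, 30]
means = [4, 12]          # starting means m1 and m2 from the example
history = []             # means recorded after each pass

while True:
    # Assign every number to the cluster with the nearest mean
    clusters = [[], []]
    for x in data:
        nearest = 0 if abs(x - means[0]) <= abs(x - means[1]) else 1
        clusters[nearest].append(x)
    # Take the mean of each cluster (neither is empty for this data)
    new_means = [sum(c) / len(c) for c in clusters]
    if new_means == means:
        break            # same means as before, so stop
    means = new_means
    history.append(means)

print(history)   # [[3.0, 18.0], [4.75, 19.6], [7.0, 25.0]]
print(clusters)  # [[2, 3, 4, 10, 11, 12], [20, 25, 30]]
```

The means settle at 7 and 25 after three updates, and the fourth pass changes nothing.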
