Data Generation

Why are we generating the samples for our clusters from normally distributed data?

Hello @Shubhankit-Tiwari-1738649599558824,

Many natural phenomena and raw datasets are approximately normally distributed, or at least show many of the characteristics of a normal distribution. In a normal distribution, values are concentrated around the mean, and their frequency decreases as we move away from it. For example, in a dataset of students' marks in an exam, many students will score close to the average, and very few will score very low or very high.

The normal distribution is a very well understood probability distribution, and it resembles many real-world datasets. That is why we generate the points for our clusters from a normal distribution.
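
If it helps to see this in code, here is a minimal sketch of the idea using only Python's standard library. The helper `generate_cluster`, the cluster centers, and the standard deviation of 1.0 are all made up for illustration and are not taken from the course material.

```python
import random
import statistics


def generate_cluster(center, std_dev, n_points, seed=None):
    """Draw n_points 2-D points whose x and y are normally distributed around center."""
    rng = random.Random(seed)  # separate generator so each cluster is reproducible
    cx, cy = center
    return [(rng.gauss(cx, std_dev), rng.gauss(cy, std_dev)) for _ in range(n_points)]


# Hypothetical cluster centers, chosen only for this example.
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 8.0)]
clusters = [
    generate_cluster(c, std_dev=1.0, n_points=500, seed=i)
    for i, c in enumerate(centers)
]

for center, points in zip(centers, clusters):
    mean_x = statistics.mean(p[0] for p in points)
    mean_y = statistics.mean(p[1] for p in points)
    # Share of points whose x lies within one standard deviation (1.0) of the center.
    near = sum(abs(p[0] - center[0]) <= 1.0 for p in points) / len(points)
    print(center, round(mean_x, 2), round(mean_y, 2), round(near, 2))
```

Each cluster should come out dense near its center and sparse further away: the sample means land close to the chosen centers, and roughly two thirds of the points fall within one standard deviation along each axis, which is the behaviour described above.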

Happy Learning :blush:
