Probability distribution formula

mananaroramail · April 19, 2020, 12:58pm

In the formula, Prateek Bhaiya used a matrix X that denotes m features [x1, x2, …, xm]. But shouldn't X be an "n × m" matrix, since when collecting data we will have n vectors (one per example), each having m features?
And if that's the case, how can one vector represent all the features?
Also, if X is a single vector, how can we generate U, i.e. the mean matrix, from it, given that we are generating U as a vector/matrix of dimension "1 × m"?

S18CRX0120 · April 19, 2020, 1:38pm

Hey @mananaroramail, X is indeed an n × m matrix, and the reasoning you gave is correct.

Since there are n vectors, think of each row as the information for a different monkey, so all of that monkey's features sit in that one row.
The mean U can then be calculated by summing the values in each column individually and then dividing by the number of examples, n.

Think of X as a spreadsheet of students' marks in a class: one row per student, one column per subject.
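
As a rough illustration of the column-wise mean described above, here is a minimal NumPy sketch. The numbers in X are made up for the example (n = 4 rows, m = 3 features); only the shapes matter.

```python
import numpy as np

# Hypothetical data: n = 4 examples (rows), m = 3 features (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
])

n, m = X.shape

# Sum every column individually, then divide by the number of examples.
# keepdims=True keeps the result as a 1 x m row instead of a flat vector.
U = X.sum(axis=0, keepdims=True) / n

print(U.shape)  # (1, 3)
print(U)        # [[2.5 5.  7.5]]
```

The same result comes from `X.mean(axis=0, keepdims=True)`; the explicit sum is written out here only to mirror the steps in the explanation.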

Hope this resolves your doubt.
Remember to mark the doubt as resolved.

mananaroramail · April 19, 2020, 1:50pm

Okay, so the P(x) formula actually represents the "frequency" of that vector along the z axis (of course).
So the formula is really a function that takes one specific data point (a vector) from the bigger data set.
Is this conclusion correct?

S18CRX0120 · April 19, 2020, 2:10pm

Hey @mananaroramail, yes, it seems you've got it right now.
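
To make the conclusion above concrete, here is a small sketch of P(x) as a function of one row vector. The thread never states which formula is meant, so this assumes it is the multivariate Gaussian density; the function name `gaussian_density`, the covariance matrix `Sigma`, and the data are all made up for illustration. Note that the returned value is a density (the height of the surface at x), not a literal count.

```python
import numpy as np

def gaussian_density(x, U, Sigma):
    """Multivariate Gaussian density at a single vector x (assumed formula)."""
    x = np.asarray(x, dtype=float).ravel()   # one example, m features
    U = np.asarray(U, dtype=float).ravel()   # mean, m features
    m = x.shape[0]
    diff = x - U
    # Normalising constant: sqrt((2*pi)^m * det(Sigma))
    norm = np.sqrt((2.0 * np.pi) ** m * np.linalg.det(Sigma))
    # Quadratic form diff^T Sigma^{-1} diff, computed without an explicit inverse
    exponent = -0.5 * diff @ np.linalg.solve(Sigma, diff)
    return float(np.exp(exponent) / norm)

# Hypothetical data set: n = 4 examples, m = 2 features.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
])

U = X.mean(axis=0, keepdims=True)   # 1 x m mean, estimated from all n rows
Sigma = np.cov(X, rowvar=False)     # m x m covariance, estimated from all n rows

p_row = gaussian_density(X[0], U, Sigma)   # density at one row of X
p_mean = gaussian_density(U, U, Sigma)     # density at the mean (the peak)

print(p_row, p_mean)
```

U and Sigma are estimated from the whole n × m matrix, while P(x) is evaluated on one m-dimensional vector at a time, which is why the formula itself is written in terms of a single x.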