I need help understanding why you took the transpose here: X.T(y_-y). Why X transpose? Is it just to make sure the dimensions are correct? Is it possible to do this without taking the transpose?
Can you please explain or share docs?
hey @gauthampkrishnan,
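It is done because of the shapes involved. Before calculating the gradient, X is a matrix with one row per record and one column per feature, while the error (y_ - y) is a vector with one entry per record. X cannot be multiplied by the error directly, because matrix multiplication requires the number of columns on the left to equal the number of rows on the right. Hence we have to use the transpose of X. It is not only a dimension fix, though: each entry of X.T(y_-y) is the sum over all records of (feature value * error), which is exactly the term the gradient formula needs for that feature's weight.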
Considering n features in X with m records (y_ - y has the same shape as Y):
shape of X = (m, n) , shape of Y = (m,)
#after transpose
shape of X.T = (n, m) , shape of Y = (m,)
Hence we can now multiply X.T by (y_ - y): the inner dimensions (m and m) match, and the result has shape (n,), one gradient term per feature, as the given formula requires.
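To check this yourself, here is a small NumPy sketch. The sizes, the random data, and all variable names (`error`, `grad_loop`, and so on) are made up for illustration and are not from the course code; `y_` here is just a stand-in for the model's predictions. It compares X.T @ (y_ - y) with an explicit loop, and also shows two ways to get the same numbers without writing a transpose, which should answer the second part of the question.

```python
import numpy as np

# Hypothetical toy data: m records, n features.
rng = np.random.default_rng(0)
m, n = 5, 3
X = rng.normal(size=(m, n))   # shape (m, n)
y = rng.normal(size=m)        # true targets, shape (m,)
y_ = rng.normal(size=m)       # stand-in predictions, shape (m,)

error = y_ - y                # shape (m,)

# Version with the transpose: (n, m) @ (m,) -> (n,)
grad_transpose = X.T @ error

# Same thing written as explicit sums: for each feature j,
# add up X[i, j] * error[i] over all records i.
grad_loop = np.zeros(n)
for j in range(n):
    for i in range(m):
        grad_loop[j] += X[i, j] * error[i]

# Without writing a transpose: a 1-D vector on the left
# is treated as a row, so (m,) @ (m, n) -> (n,)
grad_no_transpose = error @ X

# Also without a transpose: scale each row of X by its error,
# then sum down the records axis.
grad_broadcast = (X * error[:, None]).sum(axis=0)

print(grad_transpose.shape)
print(np.allclose(grad_transpose, grad_loop))
print(np.allclose(grad_transpose, grad_no_transpose))
print(np.allclose(grad_transpose, grad_broadcast))
```

So the transpose is not strictly required, but X.T @ error is the shortest way to write "sum over records of feature times error". Any constant factor in front (for example 1/m) depends on the loss being used and is left out here.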
I hope this resolves your doubt.
Thank You.