Why do we use random.randn instead of random.rand?
Hey @personifier997, numpy.random.randn
generates samples from the standard normal distribution, while numpy.random.rand
samples from a uniform distribution over [0, 1).
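To see the difference directly, here is a quick sketch (the seed and sample size are arbitrary, chosen just for the demo):

```python
import numpy as np

np.random.seed(0)
u = np.random.rand(1000)   # uniform in [0, 1): always non-negative
n = np.random.randn(1000)  # standard normal: mean 0, std 1, can be negative

print(u.min() >= 0, u.max() < 1)  # uniform samples stay inside [0, 1)
print(abs(n.mean()) < 0.2)        # normal samples are centered near 0
```

The key difference for weight initialization: rand gives only positive values, while randn is centered at zero.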
So why doesn't a uniform distribution work well here?
The main reason is the activation function, in this case the sigmoid, whose S-shaped curve is steep near 0 and nearly flat in both tails.
If the input to the sigmoid is far from 0, its slope falls off quickly, so you get a tiny gradient and a tiny weight update. With many layers, these gradients get multiplied together in the backward pass, so even reasonable gradients shrink toward zero and stop having any influence (the vanishing-gradient problem). If many of your weights push the pre-activations into those flat regions, the network becomes nearly untrainable. That is why it is common practice to initialize weights around zero: it keeps the pre-activations near 0, where the sigmoid's gradient is largest, so you get usable gradients to train your net.
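A small sketch to make this concrete. The 100-input unit, the all-ones input, and the 0.01 scale are made-up demo values, not anything prescribed:

```python
import numpy as np

np.random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # derivative peaks at 0.25 when z == 0

# the slope collapses quickly as the input moves away from 0
for z in [0.0, 2.0, 5.0, 10.0]:
    print(z, sigmoid_grad(z))

# toy pre-activation for one unit with 100 inputs (all ones, for illustration)
x = np.ones(100)
w_uniform = np.random.rand(100)         # all-positive weights: pre-activation sums to ~50
w_normal = 0.01 * np.random.randn(100)  # zero-centered, small weights: pre-activation near 0

print(sigmoid_grad(w_uniform @ x))  # saturated: gradient is effectively zero
print(sigmoid_grad(w_normal @ x))   # near 0.25: plenty of gradient to learn from
```

With uniform weights every term in the dot product is positive, so the pre-activation lands deep in the sigmoid's flat tail; zero-centered weights let positive and negative terms cancel, keeping the unit in the steep region.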
I hope this clears your doubt!
Please mark the doubt as resolved in your doubts section!
Happy Learning!
Please rate your experience here. Your feedback is very important; it helps us improve our platform and provide the learning experience you deserve.
If you still have questions or don't find the answer satisfactory, you may reopen the doubt.