Solution for the vanishing gradient problem

Why do small weights with both +ve and -ve values help?

Hey @Lakshita, if you look at the graphs of the sigmoid and tanh activations, you will notice that their gradients are largest for inputs close to 0 and flatten out (saturate) for large positive or negative inputs, which is where gradients vanish. Initializing weights with small values keeps the pre-activations (weighted sums) close to 0, so the gradients stay healthy; drawing both positive and negative values keeps the weights zero-centered and breaks the symmetry between neurons. That is why we initialize weights with small positive and negative values.
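Here is a minimal NumPy sketch of the idea (the layer width `fan_in` and the two scales are just illustrative numbers, not anything specific from the course): the derivatives of sigmoid and tanh peak at input 0, and a small zero-centered init keeps the pre-activation near 0 while a large init pushes it into the saturated region.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1 - s)          # max value 0.25, reached at x = 0

def d_tanh(x):
    return 1 - np.tanh(x) ** 2  # max value 1, reached at x = 0

# Gradients are healthy near 0 but shrink fast once the input saturates
for x in [0.0, 2.0, 5.0]:
    print(f"x={x}: sigmoid'={d_sigmoid(x):.4f}, tanh'={d_tanh(x):.4f}")

# Small zero-centered weights keep the pre-activation z = w . x near 0
rng = np.random.default_rng(0)
fan_in = 100                                   # illustrative layer width
x = rng.normal(size=fan_in)                    # some input vector
w_small = rng.normal(scale=0.01, size=fan_in)  # small +ve/-ve weights
w_large = rng.normal(scale=1.0, size=fan_in)   # large weights
print("z with small init:", w_small @ x)       # stays close to 0
print("z with large init:", w_large @ x)       # likely deep in saturation
```

Running it, you should see the derivatives drop by orders of magnitude between x=0 and x=5, and the pre-activation with the large init land far from 0, where both activations are nearly flat.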

Hope this resolved your doubt.
Please mark the doubt as resolved in the "My Doubts" section :blush:
