What exactly is the cause of vanishing or exploding gradients? Is it solely related to the activation function we use? And, is Relu activation free of this problem?
Cause of vanishing and exploding gradients
hey @ambika11,
The cause is a combination of how the model's weights are initialized and which activation function is used. During backpropagation, the gradient at each layer is obtained by repeatedly multiplying by the activation function's derivative and the layer's weight matrix. If those factors are consistently smaller than 1, the gradient shrinks toward zero by the time it reaches the early layers (vanishing gradients); if they are consistently larger than 1, it blows up (exploding gradients). So no, it is not solely the activation function: a poor choice of activation function or a poor weight initialization can each push you toward vanishing or exploding gradients, and the effect gets worse the deeper the network is. The sketch below illustrates this.
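To make this concrete, here is a minimal NumPy sketch (a hypothetical 50-layer toy setup, not code from the course) that pushes a gradient backward through a deep stack and compares sigmoid with ReLU derivatives under He-style initialization. Exact numbers will vary from run to run, but the sigmoid gradient typically collapses toward zero while the ReLU gradient stays roughly the same size.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width = 50, 64   # hypothetical toy network: 50 layers, 64 units each

def grad_norm_through_layers(activation_derivative):
    """Backpropagate a gradient through `depth` layers: grad <- W^T (f'(z) * grad)."""
    grad = np.ones(width)
    for _ in range(depth):
        # He-style initialization; pre-activations z are random stand-ins here,
        # rather than coming from a real forward pass (this is only a sketch).
        W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))
        z = rng.normal(size=width)
        grad = W.T @ (activation_derivative(z) * grad)
    return np.linalg.norm(grad)

def sigmoid_deriv(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)          # at most 0.25, so it keeps shrinking the gradient

def relu_deriv(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs, 0 otherwise

print("sigmoid:", grad_norm_through_layers(sigmoid_deriv))  # typically collapses toward 0
print("relu:   ", grad_norm_through_layers(relu_deriv))     # typically stays roughly stable
```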
ReLU reduces the vanishing-gradient problem to a great extent, because its derivative is exactly 1 for positive inputs, so gradients are not repeatedly shrunk as they pass backward through ReLU layers. It is not completely free of problems, though: a unit can "die" (always output 0, so no gradient flows through it), and exploding gradients can still occur if the weights are initialized or grow too large. Improved variants such as Leaky ReLU and ELU address the dying-ReLU issue. So yes, ReLU is an activation function that is almost, but not entirely, free of this problem.
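Here is a small illustrative sketch (my own example, not from the course material) of why a ReLU unit can "die" and how Leaky ReLU avoids it: for negative pre-activations ReLU's gradient is exactly 0, while Leaky ReLU keeps a small non-zero slope so some gradient still flows.

```python
import numpy as np

def relu_grad(x):
    return (x > 0).astype(float)

def leaky_relu_grad(x, alpha=0.01):       # alpha is the usual small negative-side slope
    return np.where(x > 0, 1.0, alpha)

z = np.array([-2.0, -0.5, 0.5, 2.0])      # example pre-activations
print(relu_grad(z))        # [0.   0.   1.   1.  ] -> zero gradient for negative inputs
print(leaky_relu_grad(z))  # [0.01 0.01 1.   1.  ] -> a small gradient still flows
```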
I hope this helps you understand better.
Thank you and happy learning!