Noisy Updates in Mini-Batch Gradient Descent

How do the noisy updates in mini-batch gradient descent help avoid local minima and help gradient descent converge to the global minimum?

Hey @Spartan_online,
Mini-batch gradient descent works the same way as plain gradient descent; the only difference is that each update is computed on a small random subset (a mini-batch) of the training samples instead of the full dataset. Because each mini-batch is just a random sample, the gradient it produces is a noisy estimate of the true gradient, and it is exactly this noise that helps the model move out of a local minimum and keep heading toward a better (possibly global) one. Whether a given step escapes depends on factors such as the batch size, the learning rate, and the shape of the loss surface. You can see the noise directly in the sketch below.
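Here is a minimal sketch of the "noisy estimate" part (not from any real project; the data, the least-squares loss, and the batch size of 32 are all made up for illustration). Mini-batch gradients average to the full-batch gradient but scatter around it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for a least-squares problem: loss(w) = mean((X @ w - y) ** 2)
n, d = 1000, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=n)

def gradient(w, idx):
    """Gradient of the mean squared error over the samples in idx."""
    Xb, yb = X[idx], y[idx]
    return 2 * Xb.T @ (Xb @ w - yb) / len(idx)

w = np.zeros(d)
full = gradient(w, np.arange(n))                  # exact (full-batch) gradient
batches = [gradient(w, rng.choice(n, size=32))    # 32-sample mini-batch estimates
           for _ in range(200)]

print("full-batch gradient:   ", np.round(full, 2))
print("mini-batch mean:       ", np.round(np.mean(batches, axis=0), 2))
print("mini-batch std (noise):", np.round(np.std(batches, axis=0), 2))
```

The mini-batch mean matches the full-batch gradient, while the nonzero standard deviation is the per-step noise that the rest of this answer talks about.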
So the model can still get stuck in a local minimum temporarily, but the noisy updates give it a real chance of getting out: near a shallow minimum the true gradient is (close to) zero, yet a noisy mini-batch gradient usually is not, so an occasional "wrong" step can carry the parameters over the surrounding barrier.
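And here is a toy sketch of that escape effect. This is not a real training run: the tilted double-well function f and all its constants are invented, and Gaussian noise added to the exact gradient stands in for the sampling noise of a mini-batch estimate:

```python
import numpy as np

def f(x):
    # Invented 1-D loss: a tilted double well with a shallow local minimum
    # near x = +0.96 and a deeper global minimum near x = -1.04.
    return (x**2 - 1) ** 2 + 0.3 * x

def grad(x):
    return 4 * x * (x**2 - 1) + 0.3

def descend(noise_std, steps=3000, lr=0.1, x0=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(steps):
        # noise_std > 0 mimics the noise of a mini-batch gradient estimate
        x -= lr * (grad(x) + noise_std * rng.normal())
    for _ in range(200):
        x -= lr * grad(x)  # a few exact steps to settle into the final basin
    return x

print("noise-free GD ends at x =", round(descend(noise_std=0.0), 2))  # ~ +0.96 (local)
print("noisy GD ends at x =", round(descend(noise_std=2.0), 2))       # ~ -1.04 (global)
```

Starting from x = 1.5, the noise-free run rolls into the nearest (local) minimum and stays there, while the noisy run typically gets kicked over the barrier and ends in the deeper well. Since the escape is random, the exact step at which it happens varies with the seed.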

That said, this is largely an empirical observation: there is no complete theoretical explanation of exactly why it works as well as it does, and no guarantee that the noise always leads to the global minimum. Across a large number of experiments, though, mini-batch (stochastic) training has consistently been found to reach good minima.
Have a look here, hope it helps you further.

Thank You :slightly_smiling_face:.