Implementation of backprop

In the video on implementing backpropagation, while calculating db1, db2, and db3, Prateek bhaiya wrote this:
db3 = np.sum(delta3, axis=0) / float(m)

and in the video "NN - Training your model", it suddenly changed to
db3 = np.sum(delta3, axis=0)
for all the db's, without the change being shown in the lecture.

My question is: why did he remove float(m) from the denominator? In the theory part he taught that we take the column-wise sum and divide by m (i.e., take the average), and even the formula he wrote divides by m. Then why is the / m removed in the implementation?
I tried dividing by m there, but then the loss behaves very unusually; it does not decrease steadily. What could be the reason for this?
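Here is a minimal sketch of what I mean (the batch size and delta3 values are made up just to show the shapes; the variable names follow the snippets above):

import numpy as np

m = 4                                 # assumed batch size
delta3 = np.random.randn(m, 3)        # assumed output-layer error term, shape (m, units)

db3_avg = np.sum(delta3, axis=0) / float(m)  # version from the backprop video (average)
db3_sum = np.sum(delta3, axis=0)             # version from the training video (plain sum)

print(np.allclose(db3_sum, m * db3_avg))     # True: they differ only by the factor m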

Hi @mohituniyal2010,
Good observation :wink:
The logic is simple: dividing the sum by m makes the gradient m times smaller, so you have to keep the learning rate higher to compensate for the division.
If you don't divide db3 by m, you can keep the learning rate smaller.

So it's just a matter of preference: if we divide by m, the learning rate should be higher;
if we don't divide db3 by m, the learning rate should be smaller compared to the first case.
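To make this concrete, here is a minimal sketch (the learning rate values are illustrative, not from the lecture) showing that the two conventions give exactly the same update step once the learning rate is rescaled by m:

import numpy as np

m = 4                           # assumed batch size
delta3 = np.random.randn(m, 3)  # assumed output-layer error term

lr_avg = 0.1         # learning rate used with the averaged gradient
lr_sum = lr_avg / m  # equivalent learning rate for the plain summed gradient

step_avg = lr_avg * (np.sum(delta3, axis=0) / float(m))  # update with division by m
step_sum = lr_sum * np.sum(delta3, axis=0)               # update without division by m

print(np.allclose(step_avg, step_sum))  # True: same bias update either way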

Yes, you got the point exactly.
:+1: