Here is the code:
https://colab.research.google.com/drive/12JEBq8ggjOcmW5FehJFp5pdtumbmXgJ7?usp=sharing
Ques1.) (a.) What parameters does the Batch Normalization layer learn?
(b.) Aren’t there just two parameters, gamma and beta, which need to be learned in the same way as the other weights, through backpropagation during training?
(c.) And how do I calculate the number of parameters that the Batch Normalization layer reports? (It appears to be 4 × the number of channels in the previous layer.)
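To make (c.) concrete, here is a minimal sketch (the Conv2D shape and filter count are just assumptions for illustration, not my actual model) showing how the 4 × channels figure breaks down into trainable and non-trainable weights of a BatchNormalization layer:

```python
import tensorflow as tf

# Hypothetical setup: a Conv2D with 8 filters followed by BatchNormalization,
# so the BN layer normalizes over 8 channels.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, padding="same"),
    tf.keras.layers.BatchNormalization(),
])

bn = model.layers[-1]
# gamma and beta are trainable; moving_mean and moving_variance are not.
trainable = sum(int(tf.size(w)) for w in bn.trainable_weights)        # 2 * 8 = 16
non_trainable = sum(int(tf.size(w)) for w in bn.non_trainable_weights)  # 2 * 8 = 16
print(trainable, non_trainable)  # total is 4 * 8, matching the pattern I observed
```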
Ques2.) My model shows “Non-trainable params: 21,136”. What are these non-trainable parameters, and which layers do they come from (i.e., how are they calculated)? I know the definition of a non-trainable parameter, but aren’t all the weights in a Keras model trainable by default?
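As a sketch of what I think might be going on (assumed toy architecture, not my real model): if the only source of non-trainable weights is BatchNormalization, then the “Non-trainable params” count should equal 2 × channels summed over every BN layer (the moving_mean and moving_variance buffers):

```python
import tensorflow as tf

# Hypothetical model with two BN layers, to check where non-trainable params come from.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(16, 3),
    tf.keras.layers.BatchNormalization(),  # moving stats: 2 * 16 = 32
    tf.keras.layers.Conv2D(32, 3),
    tf.keras.layers.BatchNormalization(),  # moving stats: 2 * 32 = 64
])

non_trainable = sum(int(tf.size(w)) for w in model.non_trainable_weights)
print(non_trainable)  # 32 + 64 = 96, the number model.summary() reports as non-trainable
```

Is this the right way to account for the 21,136 in my own model, or are there other layers that contribute non-trainable weights?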