NN Implementation doubts
At the moment we use the derivative of the tanh activation function in the intermediate layer of the neural network; however, why are we not using the derivative of the softmax activation function in the final layer?
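On the softmax question: a common reason the explicit softmax derivative is skipped is that, when the final layer uses softmax together with a cross-entropy loss, the chain-rule product of the two derivatives simplifies to `predictions - targets`, so the softmax Jacobian never needs to be computed explicitly. A minimal NumPy sketch of this identity (all variable names are illustrative, not from the course code):

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, y):
    # y is a one-hot target vector
    return -np.sum(y * np.log(p))

# logits from a hypothetical final layer, and a one-hot target
z = np.array([1.0, 2.0, 0.5])
y = np.array([0.0, 1.0, 0.0])

p = softmax(z)

# Full chain rule: dL/dz = J_softmax @ dL/dp
jacobian = np.diag(p) - np.outer(p, p)  # softmax Jacobian (symmetric)
dL_dp = -y / p                          # cross-entropy gradient w.r.t. p
grad_full = jacobian @ dL_dp

# Simplified form: the two derivatives cancel to p - y
grad_simple = p - y

print(np.allclose(grad_full, grad_simple))  # True
```

Because the simplified gradient `p - y` is what gets backpropagated, many implementations fold softmax into the loss and never call a standalone softmax derivative.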
I think you are right: the derivative of the sigmoid component is missing in the video you mentioned. I'll check with the mentor; I believe it was rectified in the later videos.
Will let you know once this is resolved!