Please could you explain clearly, step by step, how a CNN makes predictions?
CNN vs Neural Network
Hey @Bhawna,
Watch this video; it should clear your doubts about how a CNN works.
You may ask if you still have doubts, or mark the doubt as resolved.
I got how a CNN works, but I didn't understand how the weights of the filters and the other layers are decided internally.
For this, I referred to this video, but I got confused in the middle of it.
Nice question,
First of all, the filters are initialized randomly; these weights are then fine-tuned by backpropagation.
Apart from this, the number of hidden layers and fully connected layers are hyperparameters, and you have to tune them in order to get the best accuracy.
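To make the first point concrete, here is a minimal NumPy sketch of how a filter starts out; the shapes here are hypothetical, and real frameworks use smarter schemes such as Xavier/He initialization:

```python
import numpy as np

# A 3x3 filter over 3 input channels, filled with small random values.
# At this point it detects nothing useful; backpropagation will gradually
# shape it into an edge/curve detector (or whatever the data demands).
rng = np.random.default_rng(seed=0)
conv_filter = rng.standard_normal((3, 3, 3)) * 0.01  # height x width x channels

print(conv_filter[:, :, 0])  # just noise for now
```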
We used binary cross-entropy to calculate the derivative of the loss.
We first calculated the loss at the last layer and then backpropagated it through the hidden layers of the neural network.
But I am not getting how the weights of the filters are decided.
Please could you give me a clear idea?
I think I have not understood it completely, and now I am very confused.
Let me explain to you how the loss backpropagates in a CNN.
Before we get into backpropagation, we must first take a step back and talk about what a neural network needs in order to work. At the moment we all were born, our minds were fresh. We didn’t know what a cat or dog or bird was. In a similar sort of way, before the CNN starts, the weights or filter values are randomized. The filters don’t know to look for edges and curves. The filters in the higher layers don’t know to look for paws and beaks. As we grew older, however, our parents and teachers showed us different pictures and images and gave us a corresponding label. This idea of being given an image and a label is the training process that CNNs go through. Before getting too into it, let’s just say that we have a training set that has thousands of images of dogs, cats, and birds, and each of the images has a label of what animal that picture is.
Back to backprop:
So backpropagation can be separated into 4 distinct sections: the forward pass, the loss function, the backward pass, and the weight update.
During the forward pass, you take a training image, which let’s assume is a 32 x 32 x 3 array of numbers, and pass it through the whole network. On our first training example, since all of the weights or filter values were randomly initialized, the output will probably be something like [.1 .1 .1], basically an output that doesn’t give preference to any class in particular. The network, with its current weights, isn’t able to look for those low-level features and thus isn’t able to make any reasonable conclusion about what the classification might be. This leads to the loss function part of backpropagation. Remember that what we are using right now is training data. This data has both an image and a label. Let’s say, for example, that the first training image inputted was of a dog. The label for the image would be [1 0 0]. A loss function can be defined in many different ways, but a common one is MSE (mean squared error), which is ½ × (actual − predicted)², summed over the outputs.
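To make that concrete, here is a minimal NumPy sketch using the hypothetical numbers above:

```python
import numpy as np

# Hypothetical first forward pass: near-uniform scores for [dog, cat, bird]
predicted = np.array([0.1, 0.1, 0.1])
actual    = np.array([1.0, 0.0, 0.0])   # training label: this image is a dog

# MSE-style loss from the explanation above: sum of 1/2 * (actual - predicted)^2
L = np.sum(0.5 * (actual - predicted) ** 2)
print(L)  # 0.415 -- high, because the random weights know nothing yet
```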
Let’s say the variable L is equal to that value. As you can imagine, the loss will be extremely high for the first couple of training images. Now, let’s just think about this intuitively. We want to get to a point where the predicted label (the output of the ConvNet) is the same as the training label (this means that our network got its prediction right). In order to get there, we want to minimize the amount of loss we have. Visualizing this as just an optimization problem in calculus, we want to find out which inputs (weights in our case) most directly contributed to the loss (or error) of the network.
This is the mathematical equivalent of dL/dW, where W are the weights at a particular layer. Now, what we want to do is perform a backward pass through the network, which means determining which weights contributed most to the loss and finding ways to adjust them so that the loss decreases. Once we compute this derivative, we then go to the last step, which is the weight update. This is where we take all the weights of the filters and update them so that they change in the opposite direction of the gradient.
The learning rate is a parameter that is chosen by the programmer. A high learning rate means that bigger steps are taken in the weight updates and thus, it may take less time for the model to converge on an optimal set of weights. However, a learning rate that is too high could result in jumps that are too large and not precise enough to reach the optimal point.
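As a sketch, a single weight update looks like this; the gradient here is a made-up placeholder, since in a real network dL/dW comes out of the backward pass via the chain rule:

```python
import numpy as np

learning_rate = 0.01                  # chosen by the programmer, as described above
rng = np.random.default_rng(seed=1)
W = rng.standard_normal((3, 3))       # some layer's current weights
dL_dW = np.full_like(W, 0.5)          # placeholder gradient, for illustration only

# Move the weights in the opposite direction of the gradient
W = W - learning_rate * dL_dW
```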
The process of forward pass, loss function, backward pass, and parameter update is one training iteration. The program will repeat this process for a fixed number of iterations for each set of training images (commonly called a batch). Once you finish the parameter update on the last training example, hopefully the network should be trained well enough so that the weights of the layers are tuned correctly.
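Putting the four steps together, here is a toy, fully runnable sketch of that loop. It uses a single linear layer and MSE instead of a full CNN, purely to show the shape of one training iteration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.standard_normal((8, 4))                 # a "batch" of 8 fake inputs
true_W = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_W                                  # fake targets from a known rule

W = rng.standard_normal(4)                      # weights start out random
learning_rate = 0.1

for iteration in range(500):                    # fixed number of iterations
    pred = X @ W                                # 1. forward pass
    loss = np.sum(0.5 * (y - pred) ** 2) / len(X)   # 2. loss function (MSE)
    dL_dW = -(y - pred) @ X / len(X)                # 3. backward pass: dL/dW
    W = W - learning_rate * dL_dW                   # 4. weight update

print(np.round(W, 2))  # converges toward [1, -2, 0.5, 3]
```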
Thank you so much, now I am getting it.
But Sir, the doubt support period is for 6 months.
It is February (the 6th month), and I still have not completed my course.
So Sir, please reply as early as possible if you can.
Also, Sir, if I have doubts after February, how can I ask them?
Sir, please extend doubt support for at least this course.
I am pretty new to these topics, and even after watching the videos, I am not understanding everything deeply.
For extension-related queries, mail your query to [email protected]; they will help you out.
Thank you!