Hey @Royal_Yashasvi,
Yeah, your approach is correct.
The thing is that ResNet is a pretrained classifier model, trained on the ImageNet data.
Now, since that dataset is so big, it isn't practical to create our own model and train it from scratch. So we take the pretrained weights of the ResNet model and use what it has already learned to predict outputs / extract features for our custom images.
Generating/extracting features means this: if you check the ResNet model, it would look something like
input > processing in network > 1024 features > 784 features > … > … > 1000 classes
Not exactly those figures, but something similar.
The steps where you see 1024 features and 784 features are actually layers of the model that act as feature-extractor layers. So we take a sub-model up to that layer (which is already trained and has its weights intact), and when we pass our image through it, we get an array of 1024 values.
After this, we pass these values to our custom model (a feed-forward network), and that's it: your classification model is ready, with ResNet as the base model / feature extractor.
I hope this helps.