Trouble training on the Tiny ImageNet-200 dataset

I am trying to apply the AlexNet CNN architecture to the Tiny ImageNet-200 dataset, which Prateek bhaiya gave as an assignment. I have implemented it, but I am not getting good accuracy. I guess I am making some mistakes in the code, but I am not able to pinpoint the problems. Please, if you can, check my code and tell me what changes I should make to train my model on this dataset efficiently.
code link: https://drive.google.com/file/d/18qWBdmyCQlv3JbfUgc1OjEAcA-2xipo5/view?usp=sharing

It would be even more beneficial for me if you could clear my doubts through a phone call or a WhatsApp chat, as I want to resolve all my doubts regarding this AlexNet implementation on this dataset in an interactive manner, if possible…

Hey @saksham_thukral,
Kindly provide access to your code link; I am not able to view it.

Sorry for the trouble. This is the link; hope it will open now:

Hey @saksham_thukral,
Here are some points I found that should be corrected to improve accuracy; in my experience they have been useful fixes as well.

  1. You are applying a large number of augmentation transformations; you can tone them down:
    a. Generally, the value for rotation_range is between 10 and 30.
    b. width_shift_range and height_shift_range have always led to overfitting for me; try smaller values if you want to use them.
    c. Use a zoom_range between 0.05 and 0.2, as increasing it can crop out useful information that might be present at the corners of the image.
  2. Start with a small batch_size like 32 or 64 first; a large batch_size means you are feeding the model a large number of similar samples per update, so it will learn very slowly.
  3. Your train_gen shows 200 classes but your test_gen shows only 1 class. They should match; kindly arrange your test directory in the same class-subfolder format as the train set.
  4. Instead of creating a separate load_validation function, just use the validation_split parameter of the Keras data generators; you can take reference from here.
  5. Now comes the main model:
    a. Don't use such big values for kernel_size; recommended values are in [1, 4], sometimes 5.
    b. Reduce the number of filters; you can progress like 16, 32, 64, 128, … This way you will also have fewer parameters, and the model may learn more effectively.
    c. Instead of Flatten, try GlobalAvgPool (GlobalAveragePooling2D in Keras). First understand how it works, then use it.
    d. In the feed-forward part of the model, try fewer nodes in the Dense layers, like 1024 -> 256 -> 200.
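Points 1, 2 and 4 above could be combined into one data-pipeline sketch like the following. Note this is only a sketch under assumptions: the directory path `tiny-imagenet-200/train`, the `target_size=(64, 64)` and the 10% split are my guesses, not taken from your code, so adjust them to your setup.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Moderate augmentation (point 1) plus a built-in validation split (point 4),
# instead of a separate load_validation function.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,        # within the 10-30 range suggested above
    width_shift_range=0.05,   # keep shifts small
    height_shift_range=0.05,
    zoom_range=0.1,           # within 0.05-0.2
    validation_split=0.1,     # assumed 10% split; tune as needed
)

train_gen = datagen.flow_from_directory(
    "tiny-imagenet-200/train",  # placeholder path
    target_size=(64, 64),       # Tiny ImageNet images are 64x64
    batch_size=32,              # start small, as in point 2
    class_mode="categorical",
    subset="training",
)
val_gen = datagen.flow_from_directory(
    "tiny-imagenet-200/train",
    target_size=(64, 64),
    batch_size=32,
    class_mode="categorical",
    subset="validation",
)
```

Both generators read from the same directory; the `subset` argument selects which side of the `validation_split` each one yields.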

I know the last reply has gone quite long, but I hope it will help you.
See, tuning deep learning models is really a big task, so be ready for it.
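To see why points 5b-5d matter, here is a rough parameter count comparing a standard AlexNet-style classifier head (Flatten over a 6x6x256 feature map, then 4096 -> 4096 -> 200) with the smaller head suggested above (GlobalAvgPool, then 1024 -> 256 -> 200). The 6x6x256 feature-map shape is my assumption about a typical AlexNet final conv output, not something taken from the posted code:

```python
def dense_params(n_in, n_out):
    # weights plus biases for one fully connected (Dense) layer
    return n_in * n_out + n_out

# AlexNet-style head: Flatten(6*6*256) -> 4096 -> 4096 -> 200
flat = 6 * 6 * 256  # 9216 features after Flatten
alexnet_head = (dense_params(flat, 4096)
                + dense_params(4096, 4096)
                + dense_params(4096, 200))

# Suggested head: GlobalAvgPool -> 1024 -> 256 -> 200
gap = 256  # GlobalAveragePooling2D keeps only the channel dimension
small_head = (dense_params(gap, 1024)
              + dense_params(1024, 256)
              + dense_params(256, 200))

print(alexnet_head)  # 55353544 parameters
print(small_head)    # 576968 parameters
```

The suggested head has roughly 1% of the parameters of the AlexNet-style one, which is why it tends to overfit less on a small dataset like Tiny ImageNet.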