AlexNet - CIFAR 100

manmeetkaur0175 · November 27, 2020, 6:10pm

Hi!
I hope you’re doing well.
I was trying to train AlexNet on CIFAR-100 from scratch using TensorFlow, but even after 100+ epochs I am only able to get a test accuracy of approx. 42%. I am confused as to why it is behaving this way, with a batch size of 64 and a learning rate of 0.001. Could you please help me clear this doubt?
Thanks!

prashant_ml · November 28, 2020, 6:58am

Hey @manmeetkaur0175,
CIFAR-100 is a challenging dataset: it has 100 classes but only 500 training images per class.
So you need to work a lot on this to get good results.
Coming to the model: you are working with AlexNet, that's nice.
Though if you are building it from scratch, you need to keep a lot of things in mind.

First, what kind of input are you providing? For example, are you augmenting the data or not, and would augmentation be useful here?
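As a rough illustration of what augmenting could look like for 32x32 CIFAR images, here is a minimal pure-Python sketch of two common choices: a random horizontal flip, and a random crop after zero-padding. The function names, the 4-pixel padding, and the single-channel nested-list image format are all assumptions made for illustration; in a TensorFlow pipeline you would more likely use built-in ops such as `tf.image.random_flip_left_right`.

```python
import random

def random_flip(image, rng):
    """Mirror the image left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]

def pad_and_random_crop(image, pad, rng):
    """Zero-pad `pad` pixels on every side, then crop back to the original size."""
    height, width = len(image), len(image[0])
    blank_row = [0] * (width + 2 * pad)
    padded = [blank_row[:] for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [blank_row[:] for _ in range(pad)]
    # Pick the top-left corner of the crop window at random.
    top = rng.randint(0, 2 * pad)
    left = rng.randint(0, 2 * pad)
    return [row[left:left + width] for row in padded[top:top + height]]

def augment(image, rng, pad=4):
    """Apply flip then pad-and-crop; the output has the same shape as the input."""
    return pad_and_random_crop(random_flip(image, rng), pad, rng)

# Hypothetical single-channel 32x32 "image" used only to exercise the functions.
rng = random.Random(0)
image = [[(r * 32 + c) % 256 for c in range(32)] for r in range(32)]
augmented = augment(image, rng)
print(len(augmented), len(augmented[0]))
```

Augmentation like this is applied only to the training images, with a fresh random draw each epoch; the test images stay unchanged.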

Once you are done with the input data, the forward modelling part comes into play.
Since your model structure is fixed (the AlexNet structure), it is your choice whether to tune the number of nodes in each layer. That is also part of the experiment.

Now let's assume that you have the full model with an optimized number of nodes in each layer.
The time comes to actually train it.

The most important things to remember now are learning_rate, optimizer, loss function and batch_size.
The number of epochs also matters, but not as much.
Now,
a. As it is a multiclass classification problem, categorical_crossentropy would work very well as the loss function (with one-hot labels; sparse_categorical_crossentropy is the equivalent for integer labels).
b. For the optimizer, Adam is generally said to work best, but it is not a rule to use only this.
You can try RMSprop, Adagrad, etc., which also work well.
c. Learning rate: I guess the value you used is quite large, so I would suggest lowering it.
d. batch_size: most of the time, if the data is very big, a value of 256-1024 is used. But you need to try several values and pick the one that suits best.
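One quick check on the loss side: for a 100-class problem, categorical cross-entropy for a model that predicts every class as equally likely should be about ln(100) ≈ 4.6, so a starting loss far from that can hint at a label or preprocessing problem. The sketch below computes softmax and cross-entropy by hand in plain Python; the helper names and the example logits are made up for illustration, and this is not how Keras implements the loss internally.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

num_classes = 100
true_class = 7  # arbitrary class index chosen for the example
one_hot = [1.0 if i == true_class else 0.0 for i in range(num_classes)]

# An untrained model that outputs equal logits is at chance level.
uniform_probs = softmax([0.0] * num_classes)
chance_loss = categorical_crossentropy(one_hot, uniform_probs)

# A model that strongly favours the correct class has a much lower loss.
confident_logits = [10.0 if i == true_class else 0.0 for i in range(num_classes)]
confident_loss = categorical_crossentropy(one_hot, softmax(confident_logits))

print(round(chance_loss, 3))  # ln(100), about 4.605
print(confident_loss < chance_loss)
```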

Extra points.

Although the AlexNet model is really good, there are many other models providing state-of-the-art results on this problem. You can have a look at them, or even try improving the model structure by adding more layers, optimizing the number of layers, optimizing the number of nodes, etc.
Try using cross-validation; it gives you a simple estimate of how predictions will behave on unseen data.
Try using callbacks in training, like EarlyStopping, ModelCheckpoint and ReduceLROnPlateau; these help improve the model while it is training. I have found them very useful when training a model on a large dataset.
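To make the callback idea concrete, here is a simplified pure-Python model of the logic behind early stopping, best-checkpoint tracking, and reducing the learning rate on a plateau, run over a made-up list of validation losses. It is a sketch of the idea only: the function name, the patience values, and the loss numbers are assumptions, and the real Keras callbacks (`EarlyStopping`, `ModelCheckpoint`, `ReduceLROnPlateau`) have more options than shown here.

```python
def simulate_training(val_losses, lr=1e-3, lr_patience=2, lr_factor=0.5,
                      stop_patience=4):
    """Replay validation losses and apply simplified callback rules.

    Returns (best_epoch, best_loss, final_lr, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_since_best = 0   # drives early stopping
    lr_wait = 0             # drives the learning-rate reduction
    epochs_run = 0
    for epoch, loss in enumerate(val_losses):
        epochs_run += 1
        if loss < best_loss:
            # "Checkpoint": remember the best epoch seen so far.
            best_loss, best_epoch = loss, epoch
            epochs_since_best = 0
            lr_wait = 0
        else:
            epochs_since_best += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= lr_factor  # plateau detected: shrink the learning rate
                lr_wait = 0
        if epochs_since_best >= stop_patience:
            break  # no improvement for too long: stop early
    return best_epoch, best_loss, lr, epochs_run

# Hypothetical validation losses: improvement, then a plateau.
val_losses = [4.6, 3.9, 3.4, 3.1, 3.2, 3.3, 3.15, 3.4, 3.5, 3.6]
best_epoch, best_loss, final_lr, epochs_run = simulate_training(val_losses)
print(best_epoch, best_loss, final_lr, epochs_run)
```

In Keras itself you would not write this loop; the callback objects are passed to `model.fit` through its `callbacks` argument and apply similar rules for you.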

I hope these points help you a bit.
Thank you and happy learning.

manmeetkaur0175 · November 28, 2020, 7:22am

Thanks a lot for providing a detailed overview of the parameters that can be tweaked…

manmeetkaur0175 · November 28, 2020, 7:23am

I will try experimenting with them, and improve upon my testing accuracy.

manmeetkaur0175 · November 28, 2020, 7:24am

Thanks for sharing!

prashant_ml · November 28, 2020, 7:33am

It's good to hear that.

I would request you to kindly mark this doubt as resolved and raise a new doubt for any specific issue you get stuck on while experimenting with the topics above.

Thank you and happy learning.
