In the image classification using SVM project, I used GridSearchCV to find the best parameters, but it seems to be giving wrong results: there was a parameter combination that gave a better score, yet GridSearchCV did not choose it.
GridSearchCV giving wrong results
Hey @D17LP0096, that should not be the case. Please share your .ipynb by uploading it to Google Drive and posting the link here.
Hey @D17LP0096, in grid search, 5-fold cross-validation is applied by default, which means the dataset is divided into 5 parts; the model is trained on 4 of them and validated on the remaining 1, repeating so that each part serves as the validation fold once. But when you applied the linear kernel with C = 1.0 and scored it on the data it was trained on, you actually computed the training score, not the cross-validation score. That is why it comes out higher.
So you are comparing two quantities that are not comparable, and your grid search is working perfectly fine.
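Here is a minimal sketch of the difference (assuming a feature matrix X and label vector y are already prepared; the parameter grid is only illustrative):

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, cross_val_score

params = {'kernel': ['linear', 'rbf'], 'C': [0.1, 1.0, 10.0]}
gs = GridSearchCV(SVC(), params, cv=5)
gs.fit(X, y)
print(gs.best_params_)   # parameters with the best mean 5-fold CV score
print(gs.best_score_)    # mean accuracy on held-out folds, NOT training accuracy

# Fitting a linear kernel on the full data and scoring it on the same data
# gives the training accuracy, which is usually higher.
clf = SVC(kernel='linear', C=1.0).fit(X, y)
print(clf.score(X, y))   # training accuracy - not comparable with best_score_

# The fair comparison is the cross-validated score of the linear kernel:
print(cross_val_score(SVC(kernel='linear', C=1.0), X, y, cv=5).mean())
```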
Hope this resolves your doubt.
Please mark the doubt as resolved in my doubts section.
I am not getting your point. I used the score function for both of them. And about comparing two different things: I tried using the best_params_ given by grid search, computed the score the same way I did for the linear kernel, and its score matches the one given by gs.score().
Hey @D17LP0096, our aim is not to score higher on the training data; our aim is to score higher on the validation data.
A model with low training error but high validation error is not a good model.
You are comparing the training errors of different models. That should not be the case; we need to compare the testing/validation error. Grid search does exactly that, via k-fold cross-validation with k = 5.
First of all, go through the concept of k-fold cross-validation if you are not comfortable with it yet.
Also, there is some randomness involved while performing grid search, so occasionally (though rarely) the result may differ slightly between runs, but it will be close enough.
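If you want to see the validation scores that grid search actually compares, you can inspect cv_results_. A small sketch (again assuming X and y are your features and labels):

```python
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

gs = GridSearchCV(SVC(), {'kernel': ['linear', 'rbf'], 'C': [0.1, 1.0, 10.0]}, cv=5)
gs.fit(X, y)

# Each row is one parameter combination; mean_test_score is the average
# accuracy over the 5 held-out folds, which is what grid search ranks by.
results = pd.DataFrame(gs.cv_results_)
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']])
```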
I hope I've cleared your doubt. I ask you to please rate your experience here. Your feedback is very important; it helps us improve our platform and hence provide you the learning experience you deserve. On the off chance you still have some questions or did not find the answers satisfactory, you may reopen the doubt.
Yeah, we should compare validation error, but if the model isn't able to get a low training error we call it an under-trained model; if the training error is bad, I guess the classifier isn't good enough. And about your point that there's a little randomness: I sorted all the data by class, like… the data is now arranged class by class, i.e., not shuffled, and when I used the grid search it chose completely different parameters and got a much, much higher accuracy.
Hey @D17LP0096, the best way is to use train_test_split to divide your data into 80% training and 20% testing, and then check the accuracies of the different models on the test split.
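A minimal sketch of that approach (assuming X and y are your image features and labels; the kernel/C combinations are only examples):

```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# 80% train / 20% test split (shuffled by default)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

for kernel, C in [('linear', 1.0), ('rbf', 1.0), ('rbf', 10.0)]:
    clf = SVC(kernel=kernel, C=C).fit(X_train, y_train)
    print(kernel, C, clf.score(X_test, y_test))   # accuracy on unseen data
```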