Scoring on same values

divij26 · August 9, 2019, 8:26am

In the Gaussian NB video, we used the same values for training and for calculating the score. So why does the score come out to only 0.9 (90% accuracy)?

mohituniyal2010 · August 9, 2019, 9:46am

Hi @divij26,
So you are asking why the training score is only 90% and not 100%, given that we calculate the score on the same data the model was fitted on? If that is your question, I would like you to recall the concepts of overfitting, underfitting, and a generalizing model.

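If you plot the data, you will see that it is not linearly separable, and it is not completely separable by a non-linear hypothesis (a curve as the decision boundary) either. GaussianNB only fits one Gaussian per feature for each class, so training points that fall where the classes overlap get misclassified even though the model has seen them.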
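To make this concrete, here is a minimal sketch of a Gaussian naive Bayes classifier written in plain Python (not scikit-learn's GaussianNB, so it runs without any installs). The data is hypothetical, not the dataset from the video: a single feature drawn from two overlapping normal distributions. The helper names `fit_gaussian_nb` and `predict` are made up for this illustration. The model is scored on the same points it was fitted on, and the accuracy still stays below 100%.

```python
import math
import random

random.seed(0)

# Hypothetical 1-D data: two classes whose values overlap
# (an assumption for illustration, not the dataset from the video).
X = [random.gauss(0.0, 1.0) for _ in range(200)] + \
    [random.gauss(2.0, 1.0) for _ in range(200)]
y = [0] * 200 + [1] * 200


def fit_gaussian_nb(X, y):
    """Estimate the prior, mean and variance of the feature for each class."""
    params = {}
    for label in set(y):
        values = [x for x, t in zip(X, y) if t == label]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[label] = (len(values) / len(X), mean, var)
    return params


def predict(params, x):
    """Pick the class with the highest log prior + Gaussian log-likelihood."""
    def log_posterior(label):
        prior, mean, var = params[label]
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))
    return max(params, key=log_posterior)


params = fit_gaussian_nb(X, y)

# Score on the very same data the model was fitted on.
train_accuracy = sum(predict(params, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {train_accuracy:.3f}")
```

The points of class 0 that happen to lie deep inside class 1's range (and vice versa) are the ones the model gets wrong, no matter that it was trained on them.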

Overfitting means fitting the training data so closely that the training accuracy comes near 100%. An overfitted model usually performs poorly on test data. We would rather have a generalized model, which might not give 100% training accuracy but performs well on test data. So we don’t judge a model by its accuracy on the training set; we look at its accuracy on the test set.

So it’s fine to get a training accuracy below 100%, as long as the model does well on the test set.
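The difference between training and test accuracy can be sketched as follows. This is again plain Python on hypothetical overlapping data, and both models are stand-ins chosen for illustration: a 1-nearest-neighbour rule that simply memorizes the training points, and a simple threshold halfway between the two class means. The names `make_data`, `predict_1nn`, `predict_simple` and `accuracy` are made up for this example.

```python
import random

random.seed(1)


def make_data(n):
    """Hypothetical overlapping two-class data as (x, label) pairs."""
    data = [(random.gauss(0.0, 1.0), 0) for _ in range(n)]
    data += [(random.gauss(2.0, 1.0), 1) for _ in range(n)]
    return data


train = make_data(100)
test = make_data(500)


def predict_1nn(x):
    """Memorizing model: copy the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


# Simple model: one threshold halfway between the two class means.
mean0 = sum(x for x, t in train if t == 0) / 100
mean1 = sum(x for x, t in train if t == 1) / 100
threshold = (mean0 + mean1) / 2


def predict_simple(x):
    """Predict class 1 above the threshold, class 0 otherwise."""
    return 1 if x > threshold else 0


def accuracy(predict, data):
    """Fraction of (x, label) pairs that the model labels correctly."""
    return sum(predict(x) == t for x, t in data) / len(data)


for name, model in [("1-NN (memorizes)", predict_1nn),
                    ("threshold (simple)", predict_simple)]:
    print(f"{name}: train={accuracy(model, train):.3f}  "
          f"test={accuracy(model, test):.3f}")
```

The memorizing model scores perfectly on the training set because every training point is its own nearest neighbour, but that score says nothing about new data. Comparing the two test columns is what tells you which model to prefer.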

If you still want a higher training accuracy, you can let the model overfit more, but that is not advisable.
And since GaussianNB is a simple model, 90% accuracy is not bad. You will study more non-linear classifiers later that perform even better.

I hope you got my point.

mohituniyal2010 · August 29, 2019, 8:28am

I hope I’ve cleared your doubt. Please rate your experience here.
Your feedback is very important. It helps us improve our platform and provide you
the learning experience you deserve.

If you still have questions or do not find the answers satisfactory, you may reopen
the doubt.