I didn't understand how to calculate accuracy over the test data.
Hey @vineetchanana,
Once you have the centroid of each cluster, take each point in query_X, compute its distance to every centroid, and assign it to the cluster whose centroid is closest.
Do this for every point, submit the resulting list of cluster assignments, and check your score on the website.
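To make the nearest-centroid step concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance and that the centroids have already been computed; the function names and the toy data are made up for illustration and are not from the course material.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def assign_clusters(query_X, centroids):
    """Assign every query point to the index of its closest centroid."""
    return [nearest_centroid(p, centroids) for p in query_X]

# Hypothetical centroids and query points, only to show the call
centroids = [(0.0, 0.0), (10.0, 10.0)]
query_X = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]
labels = assign_clusters(query_X, centroids)
print(labels)
```

The returned list holds one cluster index per query point, in the same order as query_X.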
I hope this helps.
Sorry, but in the KNN lecture we didn’t talk about finding clusters. I think you have mistaken this doubt for K-Means clustering.
Oh sorry, my bad.
For the validation part:
In KNN, once you have the predicted class for each query point in the validation set, compare those predictions with the actual labels and compute
acc = (number of records that match) / (total number of records)
acc = acc * 100
This acc is your accuracy, as a percentage, over the validation data.
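As a rough sketch, the accuracy calculation could look like this in plain Python. The function name and the example labels are invented for illustration; only the formula (matches divided by total, times 100) comes from the explanation above.

```python
def accuracy(predicted, actual):
    """Percentage of predictions that equal the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(actual)

# Made-up validation labels and predictions: 4 of the 5 agree
y_val = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
acc = accuracy(y_pred, y_val)
print(acc)
```

If your predictions and labels are NumPy arrays, the same idea is usually written as (y_pred == y_val).mean() * 100.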
Similarly, predict for the test set, create a CSV file in the same format as the provided sample submission, and submit it on the website to see what score you achieved.
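Writing the submission file could be sketched as below with Python's csv module. The column names "id" and "label" and the row numbering are only placeholders, since the real layout depends on the sample submission file; please copy its header and id column exactly.

```python
import csv
import io

def write_submission(predictions, out_file, header=("id", "label")):
    """Write one row per prediction; `header` is a placeholder, not the real format."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(header)
    for row_id, label in enumerate(predictions):
        writer.writerow([row_id, label])

# Demo with an in-memory buffer; the predictions are made up
buffer = io.StringIO()
write_submission([1, 0, 1], buffer)
print(buffer.getvalue())
```

To write a real file, pass a file opened with open("submission.csv", "w", newline="") instead of the in-memory buffer.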
I hope this helps.
Thank You
What is a validation set?
When working on a machine learning task, how do you know beforehand how the model will perform on unseen data, and that it is not overfitted?
To answer that, we hold out a small fraction of our data as a validation set and use the rest for training.
For example, we can split the dataset 80:20 into training and validation data.
We train the model only on the training data and check its results on the validation set.
Once we get our best results on the validation set, we use that model to predict on the test data.
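Here is a small sketch of that workflow for KNN in plain Python: split 80:20, score a few values of k on the validation set, and keep the best one. The helper names, the toy data, the candidate values of k, and the use of Euclidean distance with majority voting are all assumptions for illustration, not the course's reference solution.

```python
import math
import random
from collections import Counter

def train_val_split(X, y, val_fraction=0.2, seed=0):
    """Shuffle the indices and hold out `val_fraction` of the data for validation."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_val = int(len(X) * val_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in val_idx], [y[i] for i in val_idx])

def knn_predict(X_train, y_train, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], query))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def validation_accuracy(X_train, y_train, X_val, y_val, k):
    """Accuracy (in percent) of KNN with the given k on the validation set."""
    preds = [knn_predict(X_train, y_train, q, k) for q in X_val]
    return 100.0 * sum(p == a for p, a in zip(preds, y_val)) / len(y_val)

# Toy data: two well-separated groups of 10 points each
X = [(float(i), float(i)) for i in range(10)]
X += [(float(i) + 50.0, float(i) + 50.0) for i in range(10)]
y = [0] * 10 + [1] * 10

X_train, y_train, X_val, y_val = train_val_split(X, y)
scores = {k: validation_accuracy(X_train, y_train, X_val, y_val, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The test data is never touched while choosing k; it is only used at the end, with the k that did best on the validation set.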
I hope this helps you understand it.