I am not able to understand the predict function part. Moreover, I think y_train contains the labels from the dataset, so why are we passing both y_train and label to the prior_prob function? Aren't they the same thing?
Doubt regarding some concepts
Hello Kushal,
In the predict function, we simply pass our dataset into the conditional probability function and the prior probability function. A conditional probability is a single term of the likelihood, i.e., one of the terms we take the product of:

$$P(x \mid y = c) = \prod_{i=1}^{n} P(x_i \mid y = c)$$

The full product is the likelihood, while each individual term $P(x_i \mid y = c)$ on the right-hand side is a conditional probability.
Thus, in the predict function, we iterate over each class, find the prior probability for that class, find the likelihood (by taking the product of all the conditional probabilities), and then multiply the likelihood by the prior to obtain the (unnormalized) probability of that class.
Then we take the argmax over these probabilities: the class with the maximum probability becomes our predicted class.
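To make the steps concrete, here is a minimal sketch in plain Python. It is not the course's code: the signatures of `prior_prob`, `cond_prob` and `predict`, the counting-based estimate for categorical features, and the toy data are all assumptions made for illustration.

```python
def prior_prob(y_train, label):
    # P(y = label): share of training examples that carry this label
    return sum(1 for y in y_train if y == label) / len(y_train)


def cond_prob(x_train, y_train, feature_col, feature_val, label):
    # P(x[feature_col] = feature_val | y = label), estimated by counting
    # (assumed signature; assumes categorical features)
    rows = [x for x, y in zip(x_train, y_train) if y == label]
    matches = sum(1 for x in rows if x[feature_col] == feature_val)
    return matches / len(rows)


def predict(x_train, y_train, x_test):
    classes = sorted(set(y_train))
    scores = []
    for label in classes:
        # Likelihood: product of one conditional probability per feature
        likelihood = 1.0
        for col, val in enumerate(x_test):
            likelihood *= cond_prob(x_train, y_train, col, val, label)
        # Unnormalized posterior: likelihood times prior
        scores.append(likelihood * prior_prob(y_train, label))
    # argmax: index of the largest score, mapped back to its class
    best = max(range(len(classes)), key=lambda i: scores[i])
    return classes[best]


# Toy data (made up): two binary features, two classes
x_train = [[0, 1], [0, 0], [1, 1], [1, 0], [1, 1]]
y_train = [0, 0, 1, 1, 1]
print(predict(x_train, y_train, [1, 1]))  # prints 1
```

For the test point `[1, 1]`, class 0 gets a score of 0 (no class-0 row has a 1 in the first feature), while class 1 gets (1 × 2/3) × 3/5 = 0.4, so class 1 is predicted.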
Now,
y_train contains the full set of labels, one per training example, whereas label is a single class value. In the prior probability function we compute the probability of that one label within y_train: we divide the number of times the label occurs in y_train by the total number of examples (the number of rows). So the two arguments are not the same thing: label stands for one class out of the many present in the output column, and y_train is the column itself.
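A small worked example may help show why both arguments are needed. This is only a sketch: the `prior_prob` signature and the spam/ham labels are assumed for illustration.

```python
from collections import Counter


def prior_prob(y_train, label):
    # Occurrences of this one label divided by the total number of examples
    return Counter(y_train)[label] / len(y_train)


# y_train is the whole label column; label is one class taken from it
y_train = ["spam", "ham", "spam", "spam", "ham"]
priors = {label: prior_prob(y_train, label) for label in sorted(set(y_train))}
print(priors)  # {'ham': 0.4, 'spam': 0.6}
```

Calling the function once per class gives one prior per class, and the priors sum to 1.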
I hope this helps you strengthen your concept.