NLP Project - Cannot Understand use of Laplace smoothing in this case

When we use CountVectorizer and call cv.transform(x_test), we get a vector. We then iterate through the vector and calculate the probability. While iterating through the vector there is no case where we would encounter a new word, so there seems to be no point in doing Laplace smoothing.

You are right about the statement, "When we iterate through the vector there will be no case where we see a new word", but the rest is wrong.

When you calculate the conditional probability of a particular word occurring in a particular class, that probability may be zero: the word exists in the dictionary, but it never occurred in the documents of that class. Its conditional probability for that class would therefore be zero, which makes the total conditional probability zero as well. To prevent this, we need Laplace smoothing.

Hope this cleared your doubt.

Understood sir, amazing explanation. But then the way sir explained it in the lecture was wrong: he said that suppose we find a new word, "overjoyed", that has never been seen before, and that is why we use Laplace smoothing.
