What does the sparse matrix represent when we call cv.fit_transform(corpus)?
Bag of words - Constructing vocab
cv.fit_transform converts each sentence in the corpus list into a vector of integer counts. It assigns a fixed position (index) to each word in the entire vocabulary, then transforms every sentence into a vector whose length equals the vocabulary size; the number at a given position is the number of times the word represented by that index occurs in the sentence. The sparse matrix stacks these vectors, one row per sentence and one column per vocabulary word, and it is stored in sparse form because most of the counts are zero.
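To make the counting concrete, here is a rough pure-Python sketch of the same idea. It is not scikit-learn's actual implementation: the toy corpus, the whitespace tokenizer, and the names `tokenized`, `vocabulary`, and `rows` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat", "the dog sat on the mat"]

# Naive whitespace tokenization (an assumption; the real tokenizer is more involved).
tokenized = [sentence.lower().split() for sentence in corpus]

# Give every distinct word a fixed column index, in alphabetical order.
all_words = sorted({word for doc in tokenized for word in doc})
vocabulary = {word: index for index, word in enumerate(all_words)}

# Build one count vector per sentence, each of length len(vocabulary).
rows = []
for doc in tokenized:
    counts = Counter(doc)
    rows.append([counts.get(word, 0) for word in all_words])

print(vocabulary)  # {'cat': 0, 'dog': 1, 'mat': 2, 'on': 3, 'sat': 4, 'the': 5}
print(rows)        # [[1, 0, 0, 0, 1, 1], [0, 1, 1, 1, 1, 2]]
```

In the second row the last entry is 2 because "the" (index 5) appears twice in that sentence.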
To get a better understanding, try inspecting the cv.vocabulary_ attribute; it is a dictionary that maps every word in the vocabulary to the index that represents it.
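For example, a small sketch with scikit-learn itself, assuming `cv` is a `CountVectorizer` created with default settings and using a made-up two-sentence corpus (the corpus and the name `X` are assumptions for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat", "the dog sat on the mat"]

cv = CountVectorizer()          # default settings (an assumption about your setup)
X = cv.fit_transform(corpus)    # SciPy sparse matrix: one row per sentence

print(X.shape)          # (number of sentences, vocabulary size)
print(cv.vocabulary_)   # dictionary mapping each word to its column index
print(X.toarray())      # dense view of the counts, fine for small corpora only
```

Calling `X.toarray()` on a large corpus can use a lot of memory, which is why the result is kept sparse by default.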
I hope this clears your doubt.