Text Classification Vectorization

vaishnavhv0 · March 31, 2019, 1:25pm

xt_vec = cv.transform(xt_clean).toarray()
print(xt_vec)
print(xt_vec.shape)
Doubt: How does xt_vec get 18 features even though test_x does not have that many features? I am not able to understand fit_transform and transform. How does it get the same features as x_clean?

yash97 · March 31, 2019, 5:33pm

fit_transform does two things: first it fits on the data and learns the vocabulary, and then it builds a vector for each review.

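So when we call vectorizer.transform("list of cleaned test reviews"), it only transforms the test reviews into one vector per review, using the vocabulary already learned from the training data. It does not fit the vectorizer, that is, it does not create a vocabulary or add words to it. That is why xt_vec has the same number of features as the training vectors, and any test word that is not in the vocabulary is simply ignored.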
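Here is a small sketch of the difference, assuming cv is a scikit-learn CountVectorizer with default settings. The toy reviews below are made up for illustration and are not the data from the course notebook, so the vocabulary here has 7 words instead of 18.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data standing in for x_clean (train) and xt_clean (test)
x_clean = ["good movie great acting", "bad movie poor plot"]
xt_clean = ["great plot", "terrible unseen words"]

cv = CountVectorizer()

# fit_transform: learns the vocabulary from the training reviews,
# then turns each training review into a count vector
x_vec = cv.fit_transform(x_clean).toarray()

# transform: reuses the learned vocabulary, nothing new is added
xt_vec = cv.transform(xt_clean).toarray()

print(sorted(cv.vocabulary_))  # 7 words, all from the training reviews
print(x_vec.shape)             # (2, 7)
print(xt_vec.shape)            # (2, 7): same columns as the training vectors
print(xt_vec)                  # second row is all zeros (no known words)
```

The number of columns is fixed by the vocabulary learned in fit_transform, so the test matrix always has the same width as the train matrix, no matter how few words the test reviews contain.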

Sanket-Singh-1962326997353458 · April 7, 2019, 5:37am

Hey Harshal, as you are not responding to this thread, I am marking your doubt as Resolved for now. Re-open it if required.

Please mark your doubts as resolved in your course’s “Ask Doubt” section when your doubt is resolved.