The movie review classifier TLE

Even after using a generator to iterate over all the reviews and clean them, the program never finishes (it keeps running seemingly forever).
What's the way out?

hey @muditarya31,
The way you are doing this is wrong. It will indeed take time, because you are first collecting the generator's output into a tuple, and that materialization step is what is slow.
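
For reference, CountVectorizer will consume a generator of documents lazily, so the cleaned reviews never need to be collected into a tuple first. A rough sketch (with `reviews` and `clean_review` as made-up stand-ins for your raw data and cleaning function):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical placeholders for your data and your cleaning step.
reviews = ["An AMAZING movie!!", "Worst film I have ever seen..."]

def clean_review(text):
    # whatever cleaning you already do (lowercasing, removing punctuation, ...)
    return text.lower()

# Generator expression: reviews are cleaned one at a time, never stored as a tuple.
cleaned = (clean_review(r) for r in reviews)

cv = CountVectorizer()
X = cv.fit_transform(cleaned)   # fit_transform happily consumes the generator
print(X.shape)
```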

Can you please let me know how you plan to use the cleaned reviews afterwards? Based on your approach going forward, I can tell you what you need to do.

Thank you :slightly_smiling_face:.

https://drive.google.com/file/d/1_qT3Kb4o9P3jj_0ecOWoxFB8Dxe7WApu/view?usp=sharing
This is my code. If I'm wrong, please tell me how to use generators correctly to solve this problem.

hey @muditarya31,
Can you let me know roughly how long, on average, it is taking to finish?
I just tried it, and for the 40,000 records it took me about 2 minutes to get everything done.


This is the problem I'm facing. Not sure if there will be more such errors in the program after this one gets resolved.

Try changing `review` to `[review]`.
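
The reason: `transform` expects an iterable of documents, not a single bare string, so one review has to be wrapped in a list. A minimal sketch with made-up data:

```python
from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer()
cv.fit(["this movie was great", "this movie was terrible"])  # made-up corpus

review = "this movie was great"
vec = cv.transform([review])   # wrap the single review in a list of one document
print(vec.shape)               # (1, vocabulary size)
```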

I hope it helps you.

Even after fixing this issue by writing `[review]` instead of `review`, I'm getting vectors of different sizes (lengths). How do I resolve this? Is there something wrong with the way I have used the generator? Kindly help.

This is happening because you are fitting your CountVectorizer on each sentence individually; instead, you need to fit the CountVectorizer on all sentences at once.
So just do:

```python
from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer(ngram_range=(1, 1))
genvec = (i for i in Xclean)  # generator for vectors
Xtrain = []
vec = cv.fit_transform(Xclean)
vec
```

I hope this will help you :slightly_smiling_face:.

Problem still not solved.

Don't use `toarray()`, as it uses a lot of memory to store the data.
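
For context: `fit_transform` returns a scipy sparse matrix, and calling `.toarray()` on it turns that into a huge dense array. A small illustration, using a made-up corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["good movie", "bad movie", "great film", "awful film"]  # made-up data
cv = CountVectorizer()

X = cv.fit_transform(corpus)   # compressed sparse row matrix
print(type(X), X.shape)

# X.toarray() would allocate rows * vocabulary_size numbers, almost all zeros.
# For ~40,000 reviews with a large vocabulary that can run into gigabytes,
# so keep X sparse; sklearn classifiers accept it as-is.
```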

Now there's a new error/warning that has occurred, preventing the program from running.


Does someone have a solution for this, a complete solution? I have been trying to solve this assignment for days. Please help.

I have already shared the complete code. Sharing it again…
https://drive.google.com/file/d/11_WLIasMP2ZYVVfMqVdF9t51s_lgFOIn/view?usp=sharing

This is the link to your modified code:
https://colab.research.google.com/drive/11DohoTJU76cT-0VO78EhzX5Yc5thRuLI

Although, you will need to try something else, as this approach takes a very long time to complete.

The access is denied for this link.
Also, I don't know any other way of doing this. I have done it according to what we were taught in the videos. Can you provide me the ideal solution that was expected for this assignment?

I have provided the access.
By another technique I mean you can try another algorithm, like Naive Bayes from sklearn, or you need to optimize your code,
because with your current approach it is taking a lot of time.
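
A minimal sketch of that Naive Bayes route, with `Xclean` and `y` as stand-ins for your cleaned reviews and their 0/1 labels; it keeps everything sparse and should run quickly even on the full 40,000 reviews:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholders for your cleaned reviews and their sentiment labels.
Xclean = ["good movie", "bad movie", "great film", "awful film"] * 100
y = [1, 0, 1, 0] * 100

X_train, X_test, y_train, y_test = train_test_split(
    Xclean, y, test_size=0.2, random_state=0
)

cv = CountVectorizer(ngram_range=(1, 1))
X_train_vec = cv.fit_transform(X_train)   # fit vocabulary on training data only
X_test_vec = cv.transform(X_test)         # reuse the same vocabulary

clf = MultinomialNB()
clf.fit(X_train_vec, y_train)             # trains directly on the sparse matrix
print(accuracy_score(y_test, clf.predict(X_test_vec)))
```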