Showing error "array size is too big" during vectorization

Here, X is an array of sentences which have been cleaned (stopwords removed and stemming done).

Hey @SanchitSayala, this error occurs because you are trying to store a lot of data in a single variable. I would recommend running this code on Google Colaboratory if possible, which provides more RAM and computing power.

Also, you can try this: do not call .toarray() on the CountVectorizer output; leave it as a sparse matrix. This also worked for me.
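A minimal sketch of the suggestion above, using a few made-up sentences in place of your cleaned X (the variable names here are just for illustration):

```python
from scipy.sparse import issparse
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for the cleaned sentences in X
X = ["movie great", "movie bad", "great acting bad plot"]

vectorizer = CountVectorizer()

# fit_transform returns a scipy sparse matrix, which stores only the
# nonzero counts -- this is what keeps memory usage small
X_counts = vectorizer.fit_transform(X)
print(issparse(X_counts))  # True: no .toarray() call, so it stays sparse
print(X_counts.shape)      # (number of sentences, vocabulary size)
```

Most scikit-learn estimators (e.g. MultinomialNB) accept this sparse matrix directly, so converting it to a dense array is usually unnecessary.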

I hope this works ! :slight_smile:
Please mark the doubt as resolved in your doubts section ! :+1:
Happy Learning !

I’m sure it’s not an issue with the RAM, because I think I have enough, at least more than what Google Colab provides for free…
Anyway, it’s a good idea not to convert the sparse matrix to a dense array :+1:

It’s great that your problem has been solved. But Google Colaboratory provides 24GB of RAM for free, so it most likely has more RAM than what you have on your laptop.

Please mark the doubt as resolved in your doubts section ! :+1:
Happy Learning ! :slight_smile: