.toarray() function fails on training set

In the movie rating prediction challenge, I am not able to call:
X_vect = cv.fit_transform(x_clean).toarray() because it raises a memory error; the training set has 40,000 rows. What can be the solution to this?

The error is: MemoryError: Unable to allocate array with shape (40000, 65742) and data type int64
Because of this, I am not able to call mnb.predict() on the test set to produce the predictions.

Hi, there are two options:

  1. You don't need to call .toarray() at all. Leave the result of fit_transform() as a compressed sparse matrix; MultinomialNB can train and predict on sparse input directly. A dense (40000, 65742) int64 array needs roughly 21 GB of RAM, while the sparse form stores only the non-zero counts.
  2. You can use the del keyword to free memory. There are many RAM-heavy objects loaded in your session: for example, until training is finished you won't need the test data, so use del test_data to free that memory. You can also delete the previous Y that you converted into the categorical y_train.
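A minimal sketch of option 1 on a toy corpus (the variable names x_clean, cv, mnb follow the question; the sample sentences and labels are made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-ins for the cleaned review text and labels
x_clean = ["a great movie", "terrible plot", "great acting", "terrible movie"]
y_train = [1, 0, 1, 0]
x_test_clean = ["great plot", "terrible acting"]

cv = CountVectorizer()
X_vect = cv.fit_transform(x_clean)        # stays a scipy sparse matrix
# No .toarray() here: the sparse matrix keeps only non-zero counts,
# so a 40000 x 65742 vocabulary fits in memory comfortably.

mnb = MultinomialNB()
mnb.fit(X_vect, y_train)                  # MultinomialNB accepts sparse input

X_test_vect = cv.transform(x_test_clean)  # transform, not fit_transform
preds = mnb.predict(X_test_vect)

# Option 2: once an object is no longer needed, free its memory
del X_vect
```

Keeping everything sparse end to end is usually enough on its own; del is just a second line of defence when the notebook has accumulated large intermediate objects.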

I hope I've cleared your doubt. Please rate your experience here.
Your feedback is very important: it helps us improve our platform and provide you
the learning experience you deserve.

If you still have questions or don't find the answer satisfactory, you may reopen
the doubt.