Movie review project

Ashu1318 · March 8, 2021, 4:16pm

Sir, it's a huge dataset, around 40K documents. How can we use CountVectorizer on it? It takes a lot of time, and my laptop freezes for a while when I run the code.

prashant_ml · March 8, 2021, 6:16pm

Hey @Ashu1318,
for a dataset this large, you need to use something else to understand the reviews.

But can you please let me know whether you are calling toarray() anywhere? If yes, remove that first and try again.

Ashu1318 · March 9, 2021, 9:12am

Yes, I'm using it. But after removing it, how can we access a vector from the sparse matrix? What's the alternative?

Ashu1318 · March 9, 2021, 9:13am

Should I try it on Google Colab?

prashant_ml · March 9, 2021, 2:13pm

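No alternative is needed.
CountVectorizer on its own returns a sparse matrix that stores only the nonzero counts, but toarray() builds the full dense array in memory,
and that is the point where it crashes.

So you can still access the vectors from the sparse matrix.

The same error will occur on Google Colab too.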
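
To make the point about skipping toarray() concrete, here is a minimal sketch. The toy reviews and labels are hypothetical stand-ins for the real dataset, and MultinomialNB is only an assumption based on the "Naive Bayes" folder name in the linked notebook; it may not be what the notebook actually does.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the ~40K reviews.
reviews = [
    "a great movie with a great cast",
    "terrible plot and terrible acting",
    "great fun",
    "boring and terrible",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)  # sparse matrix; no toarray() call

# Look at one review's vector by slicing out a single row.
# Only the nonzero entries of that row are held in .indices and .data.
index_to_term = {j: term for term, j in vectorizer.vocabulary_.items()}
row = X[0:1]
first_review_counts = {
    index_to_term[int(j)]: int(count) for j, count in zip(row.indices, row.data)
}

# The classifier accepts the sparse matrix directly.
clf = MultinomialNB()
clf.fit(X, labels)
prediction = clf.predict(vectorizer.transform(["great cast"]))
```

If a dense view is really needed for inspection, calling toarray() on a small slice such as X[0:1] keeps the memory cost to one row instead of the whole matrix.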

Ashu1318 · March 11, 2021, 10:36am

github.com

0013am/data-science/blob/main/Naive Bayes/MovieReviewProject.ipynb

(Notebook preview, truncated: the first code cell imports pandas as pd and numpy as np.)

Ashu1318 · March 11, 2021, 10:37am

It's only 50% accurate. Can you please check it?

Ashu1318 · March 11, 2021, 10:45am

I tried bigrams and trigrams as well.

prashant_ml · March 12, 2021, 4:24pm

Hey @Ashu1318,
there are many things that you need to learn about cleaning the data.

You do not always have to remove the extra characters; sometimes, in some places, they might be useful.
Dealing with URLs, HTML tags, etc.: everything matters.

So you need to experiment with that a lot.
Just using bigrams or trigrams will not make it work on its own.
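
As a rough illustration of the kind of cleaning mentioned above, here is a minimal sketch using only the standard library. The function name clean_review and the exact steps are assumptions for illustration, not the notebook's method. It unescapes HTML entities, drops HTML tags, replaces URLs with a placeholder token, and deliberately leaves punctuation alone, since extra characters can sometimes be useful.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")                 # HTML tags such as <br />
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # bare URLs
SPACE_RE = re.compile(r"\s+")                   # runs of whitespace


def clean_review(text: str) -> str:
    """Hypothetical cleaning step; adjust to what the data actually needs."""
    text = html.unescape(text)         # "&amp;" -> "&"
    text = TAG_RE.sub(" ", text)       # remove tags, keep the surrounding words
    text = URL_RE.sub(" url ", text)   # keep a marker that a link was present
    text = SPACE_RE.sub(" ", text)     # collapse whitespace left by the steps above
    return text.strip().lower()


sample = "Great film!<br /><br />See https://example.com/review for more &amp; enjoy."
cleaned = clean_review(sample)
```

A function like this can be passed to CountVectorizer through its preprocessor parameter, or applied to the review column before vectorizing. Whether each step helps should be checked against validation accuracy rather than assumed.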

Ashu1318 · March 15, 2021, 4:15pm

Sir, please don't close this chat. I will ask more doubts later.

prashant_ml · March 16, 2021, 5:41am

No problem, buddy.
You can raise them as new doubts.
There won't be any problem with that.

prashant_ml · March 20, 2021, 8:13am

I hope I’ve cleared your doubt. Please rate your experience here.
Your feedback is very important. It helps us improve our platform and provide you with
the learning experience you deserve.

On the off chance that you still have questions or do not find the answers satisfactory, you may reopen
the doubt.