Doubt about the Embedding layer

hey @S18CRX0120,@Aayushkh_333
how do I solve the problem where pad_sequences gives an all-zero output?
and this problem
ValueError: setting an array element with a sequence.
https://colab.research.google.com/drive/1aqDbwgWWiwXqNR8ejLz_MmBD2lJNpAS7?usp=sharing

hey @yashaswiupmon ,
the problem is at the line where you apply your CountVectorizer. The CountVectorizer produces an output of more than 50,000 features, and if you plot it you get this: count_features

From this we can see that the first non-zero value appears only around index 9,000, while you are padding only to length 500. That is why your array is all zeros.

To solve it, either clean your data further so the feature space becomes smaller, or increase the padding size from 500 to at least 20,000 so that some information from your data is retained.
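To make the failure mode concrete, here is a minimal pure-Python sketch. The 50,000-feature vector and the first nonzero entry sitting near index 9,000 are assumptions taken from the plot described above, not from the actual notebook:

```python
# Hypothetical bag-of-words row: 50,000 features, with the first
# nonzero count appearing only around index 9,000 (as in the plot).
row = [0] * 50000
row[9000] = 3  # first nonzero feature

# Padding/truncating to length 500 keeps only the first 500 entries,
# which are all zero -- the truncated row carries no information.
short = row[:500]
print(any(short))   # False

# Truncating to at least 20,000 entries retains the nonzero feature.
longer = row[:20000]
print(any(longer))  # True
```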

I hope this helps with your doubt.
Thank You and Happy Coding :slightly_smiling_face:.

hey @yashaswiupmon ,
the error is that you are fitting your data on the train_self array; instead, you need to fit it on the train_s array.

refer to this: https://colab.research.google.com/drive/1SzTUGAIycldd4142azCn40dX2CCLneaB?usp=sharing

I have just updated the last few code snippets; have a look, and if there is something you can't understand, you can surely ask.

Thank You.


hey @yashaswiupmon ,
It looks like your issue is now resolved. I would request you to kindly mark this doubt as resolved in your course doubt section and provide your valuable feedback, as it helps us improve this platform and give you a better learning experience.

Thank You and Happy Coding :slightly_smiling_face: .

hey @prashant_ml
I have one doubt: how can I predict on the test data? Whenever I try, the RAM crashes. Please help me.
Thank you,
yashaswi upmon

hey @yashaswiupmon ,
Since you are initially storing a very large array in memory, the RAM will crash every time you run it this way.

To lower the memory usage :

  1. You can create custom data generators, which produce data in small batches and hence require much less memory.
  2. After you have trained your model, save it, restart your runtime, recreate only the test data (not the training data), and get the results.

I would suggest you go with option 1, but you can also go with option 2 if it works the way you want and fulfils your requirement.
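Option 1 can be sketched as a plain Python generator. This is a minimal illustration of the idea, not the notebook's actual code; the function name and batch size are my own:

```python
def batch_generator(X, y, batch_size=32):
    """Yield (X_batch, y_batch) slices one batch at a time, so the
    whole dataset never has to sit in memory as one huge array.
    Loops forever, Keras-generator style; pass
    steps_per_epoch = len(X) // batch_size when using model.fit."""
    n = len(X)
    while True:
        for start in range(0, n, batch_size):
            yield X[start:start + batch_size], y[start:start + batch_size]

# Usage sketch with toy data:
gen = batch_generator(list(range(100)), list(range(100)), batch_size=32)
X_batch, y_batch = next(gen)
print(len(X_batch))  # 32
```

The same pattern works for generating test batches, so prediction can also be done batch by batch instead of on one giant array.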

hey @prashant_ml
I got only 46% accuracy using this method. How can I increase the accuracy?

hey @yashaswiupmon ,
Improving model performance is something you really need to work on a lot.
Some ways:

  1. Choosing the correct learning rate.
  2. Early stopping to keep the model from over-fitting.
  3. Regularization between layers.
  4. Bidirectional LSTMs to understand the text more accurately.
  5. Choosing the correct optimizer.
  6. Choosing the correct number of layers, number of nodes per layer, etc.
  7. Processing the input; for this you can use pre-trained word embeddings like GloVe, FastText, etc. They will really help you improve your model.

The above are some ways to improve a deep learning model; you need to work on them and combine them to achieve better results.
-> Just try them, starting with pre-trained word embeddings; if there is a problem while working on them, you can surely ask me.
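For the pre-trained embedding suggestion, here is a sketch of how an embedding matrix is typically built from pretrained vectors. The names `word_index` and `pretrained` are placeholders for your tokenizer's vocabulary and your loaded GloVe file (this is a generic pattern, not the course notebook's code); only numpy is assumed:

```python
import numpy as np

def build_embedding_matrix(word_index, pretrained, dim):
    """Map each word in the tokenizer's word_index to its pretrained
    vector; words missing from the pretrained vocab stay all-zero.
    The result can be passed to keras.layers.Embedding via the
    weights argument (usually with trainable=False)."""
    matrix = np.zeros((len(word_index) + 1, dim))  # row 0 = padding index
    for word, idx in word_index.items():
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix

# Toy usage with made-up 3-dimensional "GloVe" vectors:
word_index = {"good": 1, "movie": 2, "rarewordzz": 3}
pretrained = {"good": np.array([0.1, 0.2, 0.3]),
              "movie": np.array([0.4, 0.5, 0.6])}
matrix = build_embedding_matrix(word_index, pretrained, dim=3)
print(matrix.shape)  # (4, 3); row 3 stays all zeros
```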

I hope this would help you to get better results.
Thank You and Happy Learning :slightly_smiling_face:.

I hope I’ve cleared your doubt. Please rate your experience here.
Your feedback is very important; it helps us improve our platform and provide you
the learning experience you deserve.

If you still have questions or do not find the answers satisfactory, you may reopen
the doubt.