Error loading in files

Jan19LPN0013 · July 1, 2020, 5:19pm

I am not able to load the file shown in the videos. Here is a screenshot of the error:

prashant_ml · July 1, 2020, 6:00pm

Hey @Jan19LPN0013,
Please download the file again in the proper .bin format. Passing binary=True tells the loader that the file is in binary (.bin) format; because you are passing a text file, it raises an error.
Alternatively, try reading the file like this:

from gensim.test.utils import datapath, get_tmpfile
from gensim.models import KeyedVectors
from gensim.scripts.glove2word2vec import glove2word2vec

glove_file = datapath('test_glove.txt')
tmp_file = get_tmpfile("test_word2vec.txt")
_ = glove2word2vec(glove_file, tmp_file)

model = KeyedVectors.load_word2vec_format(tmp_file)

where the files look as follows:
GloVe format (a real example can be found on the Stanford site); this is the layout of test_glove.txt:
word1 0.123 0.134 0.532 0.152
word2 0.934 0.412 0.532 0.159
word3 0.334 0.241 0.324 0.188 …
word9 0.334 0.241 0.324 0.188

Word2Vec format (a real example can be found in the old w2v repository); this is the layout of test_word2vec.txt:

9 4 # header line: the number of words, then the length of each vector
word1 0.123 0.134 0.532 0.152
word2 0.934 0.412 0.532 0.159
word3 0.334 0.241 0.324 0.188 …
word9 0.334 0.241 0.324 0.188
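
The two layouts above differ only in the header line, so the conversion step can be sketched in plain Python. This is a rough illustration of the idea, not gensim's own implementation; the function name glove_to_word2vec_text is made up for this example.

```python
def glove_to_word2vec_text(glove_path, output_path):
    """Copy a GloVe text file, prepending the 'count dims' header line."""
    with open(glove_path, encoding="utf-8") as src:
        # Keep non-empty lines only; each one is "word v1 v2 ... vN".
        rows = [line.rstrip("\n") for line in src if line.strip()]
    if not rows:
        raise ValueError("the GloVe file is empty")
    # The vector length is the number of fields after the word itself.
    dims = len(rows[0].split()) - 1
    with open(output_path, "w", encoding="utf-8") as dst:
        dst.write(f"{len(rows)} {dims}\n")
        for row in rows:
            dst.write(row + "\n")
    return len(rows), dims
```

The sketch reads the whole file into memory to count the rows before writing, which should be fine for a file the size of a 50-dimensional GloVe download but is an assumption worth checking for larger ones.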

If your file follows one of the formats above, it will load; otherwise it will not.
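
One quick way to tell which of the two text layouts a file uses is to look at its first line. The helper below is only a sketch under that assumption (the name looks_like_word2vec_text is hypothetical), and it is meant for text files, not .bin files.

```python
def looks_like_word2vec_text(path):
    """Return True if the first line is a 'count dims' header, else False."""
    with open(path, encoding="utf-8") as handle:
        first = handle.readline().split()
    # A word2vec text header has exactly two fields, both whole numbers.
    return len(first) == 2 and all(field.isdigit() for field in first)
```

If this returns False for a text file of embeddings, the file is probably in the GloVe layout and needs the header added before it can be loaded as word2vec text.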
I hope this helps.
Thank you, and happy learning.

Jan19LPN0013 · July 1, 2020, 7:15pm

I still don't understand how to use it. Please explain with code for the file that was given in the question.

prashant_ml · July 1, 2020, 7:41pm

A GloVe file is in this text format; the file in question is named glove.6B.50d.txt.
Whenever you are reading a .txt file containing word embeddings,
you read it like this:

# Import the helpers first
from gensim.test.utils import datapath, get_tmpfile
from gensim.models import KeyedVectors
from gensim.scripts.glove2word2vec import glove2word2vec

# Path to the GloVe file. datapath() looks inside gensim's bundled test data;
# for your own download, use its path directly, e.g. glove_file = 'glove.6B.50d.txt'
glove_file = datapath('test_glove.txt')

# Create a temp file to store the word2vec result in
tmp_file = get_tmpfile("test_word2vec.txt")

# Convert the GloVe file to a word2vec-compatible file
_ = glove2word2vec(glove_file, tmp_file)

# Finally, load the resulting file into a model variable
model = KeyedVectors.load_word2vec_format(tmp_file)

Now you can perform all your tasks using this model variable.
I hope this helps.

prashant_ml · July 1, 2020, 7:59pm

Other than this, you can build a dictionary
like this:
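
A minimal sketch of building such a dictionary from a GloVe text file, using only the standard library. The function name load_glove_dict is made up for this example, and it assumes every line has the form "word v1 v2 ... vN" separated by single spaces.

```python
def load_glove_dict(path):
    """Map each word in a GloVe text file to its vector (a list of floats)."""
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            # First field is the word, the rest are the vector components.
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings
```

For the file from the question this would be called as load_glove_dict("glove.6B.50d.txt"), assuming the file sits in the working directory.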

Now that you have your dictionary, you can implement your tasks on your own from scratch.

I hope this is helpful to you.