Word Embedding Challenge

import nltk

def readFile(file):
    # stopw is assumed to be defined elsewhere (e.g. a set of stopwords)
    with open(file, 'r', encoding='utf-8') as f:
        text = f.read()
    sentences = nltk.sent_tokenize(text)

    data = []
    for sent in sentences:
        words = nltk.word_tokenize(sent)
        words = [w.lower() for w in words if len(w) > 2 and w not in stopw]
        data.append(words)

    return data

text = readFile('glove.6B.50d.txt')

This gives a memory error. How can I load the file instead?

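Hey @dipansha.chhabra19, from the line text = readFile('glove.6B.50d.txt') I guess you are trying to read the GloVe embeddings file. That file is not running text: each line holds a word followed by its 50 vector values, so there is no need to pass it through nltk. If that is what you want, read it line by line like this: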

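import numpy as np

embeddings = {}

# Read one line at a time instead of loading the whole file at once
with open('./GloVE/glove.6B.50d.txt', 'r', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]  # first token is the word
        coeffs = np.array(values[1:], dtype="float32")  # the rest is its vector

        embeddings[word] = coeffs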
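
If memory is still tight after this, one option is to keep only the words that actually occur in your own data. Here is a minimal sketch of the same loop wrapped in a function; `load_glove` and its `vocab` parameter are names I made up for illustration, not part of GloVe or any library, and it assumes the usual format of one word plus space-separated values per line.

```python
import numpy as np

def load_glove(path, vocab=None):
    """Stream a GloVe-style text file and return a {word: vector} dict.

    vocab is a hypothetical optional set of words to keep; when it is
    None, every word in the file is loaded.
    """
    embeddings = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Separate the word from the rest of the line at the first space
            word, _, rest = line.rstrip().partition(" ")
            if vocab is not None and word not in vocab:
                continue  # skip words we do not need
            embeddings[word] = np.array(rest.split(), dtype="float32")
    return embeddings
```

You would call it as `load_glove('./GloVE/glove.6B.50d.txt', vocab=my_words)`, where `my_words` is a set built from your tokenized sentences.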

Hope this works and your doubt gets resolved.
Don’t forget to mark the doubt as resolved as well :blush:
