Quiz Doubt - Can you explain how the answer to the following question is 5 instead of 4?

How many trigram phrases can be generated from the following sentence after performing the following text cleaning steps:

  • Removing stopwords
  • Replacing punctuation with a single space

“#Coding-Blocks is a great source to learn @machine_learning.”

Hi Anurag,
After cleaning the sentence with these two steps, it looks like this:
" Coding Blocks great source learn machine learning "
Now take the trigrams from this sentence:

Coding Blocks great
Blocks great source
great source learn
source learn machine
learn machine learning

These are the 5 possible trigrams: the cleaned sentence has 7 words, and 7 words give 7 - 3 + 1 = 5 trigrams.
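If it helps to check the count, here is a minimal sketch that slides a window of three words over the cleaned sentence. The helper name `ngrams` is just an illustrative choice, not something from the quiz.

```python
def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple."""
    # A list of k tokens has k - n + 1 windows (none if k < n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# The cleaned sentence from the reply above.
cleaned = " Coding Blocks great source learn machine learning "
tokens = cleaned.split()  # split() drops the extra leading/trailing spaces

trigrams = ngrams(tokens, 3)
for t in trigrams:
    print(" ".join(t))
print(len(trigrams))
```

With 7 tokens the window can start at 5 different positions, which is where the answer 5 comes from.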

I hope this clears your doubt
Thanks :slight_smile:

The underscore is not considered punctuation in English; I have checked this against several sources. The question statement specifically says to put a space wherever there is punctuation, so machine_learning should have stayed as it is. In natural language processing it may well be valid and logical to treat the underscore as punctuation and remove it, but we were not told so explicitly.

The second step was meant to say that all non-alphanumeric characters should be replaced with a space.
"Replacing punctuation" here corresponds to the pattern [^a-zA-Z0-9], which matches every character that is not a letter or a digit, so the underscore is replaced too.
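To see how the two readings lead to 5 versus 4, here is a rough sketch using Python's `re` module. The stopword set below is an assumption (only the three words that matter for this sentence), and the `clean` helper is a hypothetical name; a real stopword list would be longer. The stopwords are filtered after the substitution here, which gives the same result for this sentence.

```python
import re

# Assumption: a tiny stopword set covering only this sentence.
STOPWORDS = {"is", "a", "to"}


def clean(sentence, pattern=r"[^a-zA-Z0-9]"):
    """Replace every character matching `pattern` with a space, then drop stopwords."""
    spaced = re.sub(pattern, " ", sentence)
    return [w for w in spaced.split() if w.lower() not in STOPWORDS]


sentence = "#Coding-Blocks is a great source to learn @machine_learning."

# Reading used in the quiz: everything non-alphanumeric becomes a space.
tokens_alnum = clean(sentence)
# Alternative reading: the underscore is kept, so machine_learning stays one token.
tokens_keep_underscore = clean(sentence, r"[^a-zA-Z0-9_]")

# Number of trigrams for k tokens is k - 3 + 1.
count_alnum = max(len(tokens_alnum) - 2, 0)
count_keep_underscore = max(len(tokens_keep_underscore) - 2, 0)

print(tokens_alnum, count_alnum)
print(tokens_keep_underscore, count_keep_underscore)
```

Under the quiz's pattern the sentence has 7 tokens and 5 trigrams; if the underscore is kept there are 6 tokens and 4 trigrams, which matches the answer the original poster expected.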