Split the text of a file into n-grams in Python

I have to split the words of a text file into lists of a given length, collected in a list of lists. It is probably best to show an example.

Say the text file looks like this:

"i am having a good day today"

I have to write a function that is called like this:

ngrams.makeNGrams("ngrams.txt", 2)
# since the second argument is 2, the output should look like this:

[['i', 'am'], ['am', 'having'], ['having', 'a'], ['a', 'good'], ['good', 'day'], ['day', 'today']]

If the call looked like this:

ngrams.makeNGrams("ngrams.txt", 3)

# it should return:

[['i', 'am', 'having'], ['having', 'a', 'good'], ['good', 'day', 'today']]

Does anybody know how I should best deal with this? Thanks a lot in advance.


Define:

def ngrams(text, n):
    # Split on whitespace, then slide a window of n words forward one word at a time.
    words = text.split()
    return [words[i:i + n] for i in range(len(words) - n + 1)]

And use:

s = "i am having a good day today"
print(ngrams(s, 2))
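
Two things in the question are not covered by the snippet above: the function is supposed to take a file name, and the n = 3 example keeps only every second window (sliding one word at a time would give five trigrams for the sample sentence, not three). Here is a minimal sketch that handles both. It assumes the file holds plain whitespace-separated words in UTF-8, and the `step` parameter is my own addition, not something the question asks for by name:

```python
def makeNGrams(filename, n, step=1):
    """Return the word n-grams of a text file as a list of lists.

    `step` is a hypothetical extra parameter: it is how many words the
    window moves forward between consecutive n-grams (1 = every window).
    """
    if n < 1 or step < 1:
        raise ValueError("n and step must both be at least 1")
    # Assumption: the file is UTF-8 text with whitespace-separated words.
    with open(filename, encoding="utf-8") as f:
        words = f.read().split()
    # Only start a window where n full words are still available.
    return [words[i:i + n] for i in range(0, len(words) - n + 1, step)]
```

With the sample sentence saved in ngrams.txt, makeNGrams("ngrams.txt", 2) should return the six pairs from the question, and makeNGrams("ngrams.txt", 3, step=2) should return the three triples shown there. Leaving `step` at 1 gives every overlapping trigram instead.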