WIP Wednesdays: Learning to use the NLTK

Happy from Emma by Jane Austen

Output from the Natural Language Toolkit

I’ve been learning a little Python and exploring the Natural Language Toolkit (NLTK), a collection of programs written in Python that are meant to help you process natural language data so that you can learn about it. This output is from the toolkit’s concordancing program. It has taken every instance of the word “happy” in Jane Austen’s novel Emma and given us the text that immediately precedes and immediately follows the word. Why would you want to do this? You would want to do this if you were interested in the environment the word is often found in. Maybe you have a hypothesis about “happy” and “glad”: for example, you think that they are used differently from each other. This would be a reasonable way to collect the data you’d need to test that hypothesis. You could do the same thing for “glad” and then compare the words to the left and right of “happy” with the words to the left and right of “glad” to see how similar they are. You could also run a program that counts the words to see which ones most often occur with “happy” and which most often occur with “glad”.
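To make the "happy" versus "glad" comparison concrete, here is a rough sketch in plain Python. It is not NLTK's own code (in NLTK, the concordance view comes from the `concordance` method of an `nltk.Text` object). The helper names `tokenize`, `concordance`, and `neighbor_counts` are made up for this illustration, and the sample sentences are invented rather than quoted from Emma.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, target, window=2):
    """Return a (left_words, right_words) pair for each occurrence of target."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, right))
    return hits


def neighbor_counts(tokens, target, window=2):
    """Count how often each word appears within `window` words of target."""
    counts = Counter()
    for left, right in concordance(tokens, target, window):
        counts.update(left)
        counts.update(right)
    return counts


# Invented sample text (an assumption for this sketch, NOT taken from Emma).
sample = (
    "She was very happy to see him. He was glad to hear it. "
    "They were so happy to be home. I am glad you came."
)

tokens = tokenize(sample)
happy = neighbor_counts(tokens, "happy")
glad = neighbor_counts(tokens, "glad")

# Words that show up near both "happy" and "glad".
shared = sorted(set(happy) & set(glad))

print("Near 'happy':", happy.most_common())
print("Near 'glad': ", glad.most_common())
print("Shared neighbors:", shared)
```

With a real novel you would load the full text in place of `sample` and probably use a wider window. The overlap between the two counts gives a first, crude sense of how similar the two words' environments are.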

