Created February 12, 2012 13:00
text- wordcount task
# Define print_words(filename) and print_top(filename) functions.
# You could write a helper utility function that reads a file
# and builds and returns a word/count dict for it.
# Then print_words() and print_top() can just call the utility function.
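
A minimal sketch of such a helper is shown here. The name word_count_dict and the UTF-8 encoding are assumptions made for illustration; the task itself does not specify them. Words are split on whitespace only, so punctuation stays attached to the word it touches.

```python
def word_count_dict(filename):
    """Return a dict mapping each lowercased word in the file to its count.

    Hypothetical helper: the name and the UTF-8 encoding are assumptions.
    """
    counts = {}
    with open(filename, encoding="utf-8") as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace
            for word in line.split():
                word = word.lower()  # 'The' and 'the' count as the same word
                counts[word] = counts.get(word, 0) + 1
    return counts
```

Both print_words() and print_top() could then start by calling this helper and differ only in how they order the resulting word/count pairs.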
1. For the --count flag, implement a print_words(filename) function that counts
how often each word appears in the text and prints:
word1 count1
word2 count2
...
Print the above list in order sorted by word (Python will sort punctuation to
come before letters -- that's fine). Store all the words as lowercase,
so 'The' and 'the' count as the same word.
2. For the --topcount flag, implement a print_top(filename) which is similar
to print_words() but which prints just the top 20 most common words sorted
so the most common word is first, then the next most common, and so on.
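
The two orderings described above can be sketched as follows, working on a word/count dict that has already been built from the file. The function names sorted_by_word and top_words are hypothetical, and breaking ties alphabetically in the top list is an assumption, since the task does not say how equal counts should be ordered.

```python
def sorted_by_word(counts):
    """Return (word, count) pairs ordered by word, as needed for --count."""
    # Tuples compare by their first element first, so this sorts by word.
    return sorted(counts.items())


def top_words(counts, limit=20):
    """Return the `limit` most common (word, count) pairs, as needed for --topcount."""
    # Sort by count descending; ties are broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Small hand-made dict standing in for the result of reading a file.
sample = {"the": 3, "cat": 2, "a": 2, "!": 1}

for word, count in sorted_by_word(sample):
    print(word, count)
# ! 1
# a 2
# cat 2
# the 3

for word, count in top_words(sample, limit=2):
    print(word, count)
# the 3
# a 2
```

print_words(filename) and print_top(filename) would then each build the dict from the file and print the pairs returned by the matching function.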