Do part 1 first; then do part 2.
Part 1. Write a program to count the number of words and lines in a text file.
Part 2. Collect statistics on the number and lengths of words.
Print the statistics? Plot the statistics?
How many unique words are there in the text file? For example, how many times does the word "the" appear? How many times do the other words appear?
How many sentences end with a period? A question mark? ...
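One possible starting point for both parts is sketched below. It is not a reference solution: the names (WORD_RE, SENTENCE_END_RE, count_lines_and_words, word_statistics, print_length_table) are made up for illustration, and the definitions of "word" and "sentence ending" are assumptions you may well want to change.

```python
import re
import sys
from collections import Counter

# Assumption: a word is a run of ASCII letters, optionally joined by
# apostrophes (so "don't" counts as one word). Other definitions are possible.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

# Assumption: a sentence ends at '.', '?' or '!' followed by whitespace or the
# end of the text. Abbreviations such as "Mr." will be miscounted.
SENTENCE_END_RE = re.compile(r"[.?!](?=\s|$)")


def count_lines_and_words(text):
    """Part 1: return (number of lines, number of words) in text."""
    return len(text.splitlines()), len(WORD_RE.findall(text))


def word_statistics(text):
    """Part 2: return (word frequencies, word-length counts, sentence endings)."""
    words = [w.lower() for w in WORD_RE.findall(text)]
    frequencies = Counter(words)                      # word -> occurrences
    lengths = Counter(len(w) for w in words)          # word length -> occurrences
    endings = Counter(SENTENCE_END_RE.findall(text))  # '.', '?', '!' -> occurrences
    return frequencies, lengths, endings


def print_length_table(lengths):
    """Print one row per word length: the length, then how many words have it."""
    for length in sorted(lengths):
        print(f"{length:2d} {lengths[length]:6d}")


# Runs only when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as f:
        text = f.read()
    n_lines, n_words = count_lines_and_words(text)
    frequencies, lengths, endings = word_statistics(text)
    print(f"lines: {n_lines}  words: {n_words}  unique words: {len(frequencies)}")
    print("most common:", frequencies.most_common(10))
    print("sentence endings:", dict(endings))
    print_length_table(lengths)
```

If this were saved as wordstats.py (a hypothetical file name), it could be run as "python3 wordstats.py declaration.txt", where declaration.txt stands for whichever text file you are analyzing. The length table is a plain-text stand-in for a plot; the same counts could be passed to a plotting library instead.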
Declaration of Independence
United States Constitution
Your favorite magazine story or book
This document (screen scrape the text and paste into a text file)
1. How to distinguish words in a text file?
(separated/terminated by spaces, punctuation, EOS, EOF?)
2. What to do with non-text files? How can you know it is a text file?
3. Collect statistics on sentence length?
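The three questions above each admit several reasonable answers. The sketch below shows one set of choices, offered only as an illustration: words are matched with a regular expression, a file is guessed to be text if a sample of its bytes contains no NUL byte and decodes as UTF-8, and sentences are split after '.', '?' or '!' followed by whitespace. All function names here are hypothetical, and the text-file check is a heuristic, not a guarantee.

```python
import codecs
import re
import statistics

# Assumption: words are runs of ASCII letters, optionally joined by apostrophes.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def tokenize(text):
    """Question 1: return the words in text, ignoring spaces and punctuation."""
    return WORD_RE.findall(text)


def is_probably_text(sample):
    """Question 2: guess whether a bytes sample comes from a text file."""
    if b"\x00" in sample:  # NUL bytes are rare in text and common in binaries
        return False
    try:
        # An incremental decoder tolerates a multi-byte character that was
        # cut off at the end of the sample.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_text(path, sample_size=4096):
    """Apply is_probably_text to the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        return is_probably_text(f.read(sample_size))


def sentence_lengths(text):
    """Question 3: return the number of words in each sentence of text."""
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    return [len(tokenize(s)) for s in sentences if s]


def summarize_sentence_lengths(text):
    """Return (shortest, longest, mean) sentence length, or None if no sentences."""
    lengths = sentence_lengths(text)
    if not lengths:
        return None
    return min(lengths), max(lengths), statistics.mean(lengths)
```

For example, sentence_lengths("The cat sat. Did the dog sit? Yes!") returns [3, 4, 1]. Text that is not encoded as UTF-8 (for instance Latin-1 with accented letters) would be rejected by this check, so a different test may suit your files better.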
Linux and Unix have a built-in command, "wc", which counts the words in a document. In this project you are to write a program in Python 3, but it is worth reading the "wc" documentation and trying the command out.
to display wc documentation
to run wc