Finding word frequencies - without counter

I'm a beginner learning Python 3.3 through http://GrokLearning.com

My objective is to write a Word Counter program that reads multiple lines of plain text from the user, then prints out each different word from user input with a count of how many times the word occurs. All input will be lowercase words only - no punctuation or numbers. The output list will be in alphabetical order.

The program does not accept any submission with Counter or Collections. When I submit solutions found on Stack Exchange with Counter, the editor just pretends the Counter code doesn't exist.

This is what I have so far:

all = []
count = {}
line = input("Enter line: ")
while line:
    word = line.split()
    line = input("Enter line: ")
    for w in word:
        count[w] = word.count(w)
for word in sorted(count):
    print(word, count[word])

The problem with my code: if a word is repeated on multiple lines, the code will only count occurrences on the last line the word appeared (instead of total occurrences).

> this is another test test
> test test test test test
> test test test
> 
another 1
is 1
test 3
this 1

I know I did not utilize my list "all". I had tried all.append(word) to make a list of all words the user entered, but my code counted 0 (perhaps because the last line needs to be empty to end the while loop?)

For reference, I have gone through all of the free modules, but not any of the paid ones. Forgive me: since my knowledge is limited, please explain your answer in simple terms.

Jessica asked Oct 15 '25 16:10


1 Answer

The problem is in here:

for w in word:
    count[w] = word.count(w)

In your code, you don't add to your count. Instead, you reset the count every time you encounter a word. For example, if count['this'] was 1 before, the next time you encounter it, you set the count to 1 again instead of adding 1 to it.

The second problem is with the expression word.count(w). It gives the number of times w appears on the whole line, but the loop already visits every word on that line. That means that if you correctly updated (instead of reset) your count using this expression, you would count too many.

For example, if the line contains 'test' three times, the loop would add 3 on each of its three visits to 'test', so the count would grow by 3 x 3 = 9.
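
To make the 3 x 3 arithmetic concrete, here is a small sketch that runs both ways of accumulating on a single line. The names overcounted and correct are only illustrative and are not part of the original program.

```python
# One line that contains the word 'test' three times.
word = "test test test".split()

# Accumulating with word.count(w): each of the three visits adds 3.
overcounted = {}
for w in word:
    if w in overcounted:
        overcounted[w] += word.count(w)
    else:
        overcounted[w] = word.count(w)

# Accumulating 1 per visit: each of the three visits adds 1.
correct = {}
for w in word:
    if w in correct:
        correct[w] += 1
    else:
        correct[w] = 1

print(overcounted["test"])  # 9
print(correct["test"])      # 3
```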

To fix the problem, you need to address two cases:

  • If a word is already in the count (i.e. you have seen that word before), then increase the count by 1
  • If the word is not in the count (i.e. this is the first time you see it), then set the count to 1

Here is a suggestion:

for w in word:
    if w in count:
        count[w] += 1
    else:
        count[w] = 1
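
Putting the fix into context, the following is a minimal sketch of the whole counting step. It assumes the lines have already been collected into a list; the function name count_words and the hard-coded lines list are hypothetical and stand in for the input("Enter line: ") loop from the question, using the sample input shown there.

```python
def count_words(lines):
    """Return a dict mapping each word to its total count across all lines."""
    count = {}
    for line in lines:
        for w in line.split():
            if w in count:
                count[w] += 1   # seen before: add one more occurrence
            else:
                count[w] = 1    # first time this word appears
    return count


# Sample input from the question (hard-coded here instead of using input()).
lines = [
    "this is another test test",
    "test test test test test",
    "test test test",
]

count = count_words(lines)
for word in sorted(count):      # sorted() gives the words in alphabetical order
    print(word, count[word])
# another 1
# is 1
# test 10
# this 1
```

With this change 'test' is counted across all three lines (2 + 5 + 3 = 10) rather than only on the last one.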

Hai Vu answered Oct 19 '25 13:10


