
Python - Count words in a given text

I'm new to coding, so forgive me if this has already been answered, but I did search for an answer and couldn't find one.

I have a task to count how many of the given words appear in a given text. A word can appear whole or as part of another word. Letter case does not matter. If a word appears several times in the text, it should be counted only once. So far I have come up with this:

def count_words(text, words):
    count = 0
    text = text.lower()
    for w in words:
        if w in text:
            count =+ 1

    print (count)

count_words("How aresjfhdskfhskd you?", {"how", "are", "you", "hello"})
count_words("Bananas, give me bananas!!!", {"banana", "bananas"})
count_words("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.",
                       {"sum", "hamlet", "infinity", "anything"})

With that code I get a final count of 1 for all three texts, and only the third is correct.

As I see it, my first problem is that text.lower() doesn't seem to do anything, and I thought it should convert everything to lowercase.

My second problem is that in the first case "are" isn't found in "aresjfhdskfhskd", but in the third case "sum" is found in "ipsum". Both words are part of a larger word, yet the first isn't found and the second is. Also, in the second case the result should be 2, because the text contains both banana and bananas, which are similar but different.

Thanks in advance.

asked Mar 25 '26 23:03 by DejanJovanovic


2 Answers

Using sum and a generator expression, this seems the simplest solution:

text = text.lower()
count = sum(word in text for word in words)
# bools are cast to ints (0, 1) here
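
For completeness, here is a minimal sketch of the same idea wrapped in a function and run against the three examples from the question. Returning the count instead of printing it, and lowercasing each word as well as the text, are my own assumptions and not part of the snippet above.

```python
def count_words(text, words):
    """Return how many of `words` occur anywhere in `text`, ignoring case."""
    lowered = text.lower()
    # Each membership test yields True or False; sum() adds them as 1 or 0,
    # so a word is counted once no matter how often it appears.
    # Lowercasing each word too is an assumption, in case the set has capitals.
    return sum(word.lower() in lowered for word in words)


print(count_words("How aresjfhdskfhskd you?", {"how", "are", "you", "hello"}))  # 3
print(count_words("Bananas, give me bananas!!!", {"banana", "bananas"}))  # 2
print(count_words("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.",
                  {"sum", "hamlet", "infinity", "anything"}))  # 1
```

Because the check is a plain substring test, "are" is found inside "aresjfhdskfhskd" and "sum" inside "ipsum", which is what the task asks for.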
answered Mar 27 '26 15:03 by user2390182


First, strings are immutable, so text.lower() does not change text itself; it returns a new, lowercased string. The other problem is that if a in base only checks whether the substring exists, without telling you how many times it occurs; count gives you that number.
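
A small sketch of both points, using the second text from the question. The variable names `original` and `lowered` are only for illustration.

```python
original = "Bananas, give me bananas!!!"

# Calling lower() without using the result leaves the string unchanged,
# because str objects are immutable.
original.lower()
print(original)  # Bananas, give me bananas!!!

# The lowercased text only exists once the return value is kept.
lowered = original.lower()

# `in` answers yes/no; str.count reports non-overlapping occurrences.
print("bananas" in lowered)      # True
print(lowered.count("bananas"))  # 2
```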

def count_words(text, words):
    # lower() returns a new string, so keep the result in its own name
    lower_text = text.lower()
    for w in words:
        # str.count gives the number of non-overlapping occurrences of w
        print(w + " - " + str(lower_text.count(w)))

print("1")
count_words("How aresjfhdskfhskd you?", {"how", "are", "you", "hello"})
print("2")
count_words("Bananas, give me bananas!!!", {"banana", "bananas"})
print("3")
count_words("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.",
                   {"sum", "hamlet", "infinity", "anything"})
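
One more detail in the question's loop is worth checking: it uses count =+ 1, which Python reads as count = +1, so the counter is reset to 1 on every match instead of being incremented. That would explain why all three texts give 1. A minimal sketch of the loop with += instead, returning the count rather than printing it (that change is my assumption, made so the result is easy to check):

```python
# `=+` is plain assignment of +1, not an increment.
count = 0
count =+ 1
count =+ 1
print(count)  # 1


def count_words(text, words):
    count = 0
    text = text.lower()
    for w in words:
        if w in text:
            count += 1  # augmented assignment: adds 1 to the running total
    return count


print(count_words("How aresjfhdskfhskd you?", {"how", "are", "you", "hello"}))  # 3
print(count_words("Bananas, give me bananas!!!", {"banana", "bananas"}))  # 2
```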
answered Mar 27 '26 13:03 by Michał Zaborowski


