Check if string contains list item

Tags:

python

I have the following script to check if a string contains a list item:

word = ['one',
        'two',
        'three']
string = 'my favorite number is two'
if any(word_item in string.split() for word_item in word):
    print 'string contains a word from the word list: %s' % (word_item)

This works, but I'm trying to print the list item(s) that the string contains. What am I doing wrong?

O P asked Nov 01 '25 18:11


1 Answer

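The problem is that you're using an if statement instead of a for statement, so your print runs at most once (if at least one word matches). By that point any has already finished: it stops at the first match and returns only True or False, and the loop variable word_item isn't available outside the generator expression, so there is nothing left to print.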
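To see why the original print can't name the match, here is a small sketch (assuming Python 3; the names found and leaked are only for illustration). any() reduces the generator to a single boolean, and the generator's loop variable never reaches the surrounding scope:

```python
words = ['one', 'two', 'three']
string = 'my favorite number is two'

# any() collapses the whole generator expression into one True/False value.
found = any(word_item in string.split() for word_item in words)
print(found)

# The loop variable exists only inside the generator expression,
# so looking it up afterwards raises NameError.
try:
    word_item
    leaked = True
except NameError:
    leaked = False
print(leaked)
```

So the if can tell you that something matched, but not what matched.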

This is the easiest way to do what you want:

words = ['one',
         'two',
         'three']
string = 'my favorite number is two'
for word in words:
    if word in string.split():
        print('string contains a word from the word list: %s' % (word))

If you want this to be functional for some reason, you could do it like this:

for word in filter(string.split().__contains__, words):
    print('string contains a word from the word list: %s' % (word))

Since someone is bound to answer with a performance-related answer even though this question has nothing to do with performance, it would be more efficient to split the string once, and depending on how many words you want to check, converting it to a set might also be useful.
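As a rough sketch of that idea (the names tokens and matches are mine, not from the question), you could split once, keep the pieces in a set, and collect the matching items in a list:

```python
words = ['one', 'two', 'three']
string = 'my favorite number is two'

# Split once and keep the tokens in a set for fast membership tests.
tokens = set(string.split())

# Collect every list item that appears as a whole word in the string.
matches = [word for word in words if word in tokens]

for word in matches:
    print('string contains a word from the word list: %s' % word)
```

Whether the set is worth it depends on how long the string and the word list are; for a handful of words the plain loop is fine.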


Regarding your question in the comments, if you want multi-word "words", there are two easy options: adding whitespace and then searching for the words in the full string, or regular expressions with word boundaries.

The simplest way is to add a space character before and after the text to search and then search for ' ' + word + ' ':

phrases = ['one',
           'two',
           'two words']
text = "this has two words in it"

for phrase in phrases:
    if " %s " % phrase in " %s " % text:
        print("text '%s' contains phrase '%s'" % (text, phrase))

For regular expressions, just use the \b word boundary:

import re

for phrase in phrases:
    if re.search(r"\b%s\b" % re.escape(phrase), text):
        print("text '%s' contains phrase '%s'" % (text, phrase))

Which one is "nicer" is hard to say, but the regular expression is probably significantly less efficient (if that matters to you).
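One practical difference between the two, sketched below with a sample sentence I made up: the space-padding approach only sees phrases delimited by spaces, so a phrase followed by punctuation is missed, while \b also treats punctuation as a boundary.

```python
import re

phrases = ['one', 'two', 'two words']
text = "it has two words, or maybe one."

# Space-padding: pad both the phrase and the text, then do a substring test.
padded = " %s " % text
space_hits = [p for p in phrases if " %s " % p in padded]

# Regular expression: \b matches between a word character and a non-word one.
regex_hits = [p for p in phrases
              if re.search(r"\b%s\b" % re.escape(p), text)]

print(space_hits)
print(regex_hits)
```

Here the padded search finds only 'two', because 'two words' is followed by a comma and 'one' by a period, while the regular expression finds all three. Keep in mind that \b only helps when the phrase itself starts and ends with a word character.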


And if you don't care about word boundaries, you can just do:

phrases = ['one',
           'two',
           'two words']
text = "the word 'tone' will be matched, but so will 'two words'"

for phrase in phrases:
    if phrase in text:
        print("text '%s' contains phrase '%s'" % (text, phrase))

Brendan Long answered Nov 04 '25 10:11