I have a list of strings like
urls_parts = ['week', 'weeklytop', 'week/day']
and I need to check whether any of these strings appears in my URL. This example should be triggered by the weeklytop part only:
url = 'www.mysite.com/weeklytop/2'
for part in urls_parts:
    if part in url:
        print(part)
But of course it is triggered by 'week' too. What is the right way to do this?
Oops, let me clarify my question a bit. The code must not trigger when url='www.mysite.com/week/day/2' and part='week'. For part='week', it should trigger only on URLs such as 'www.mysite.com/week/2' or 'www.mysite.com/week/2-second'.
This is how I would do it.
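import re

urls_parts = ['week', 'weeklytop', 'week/day']
# Try longer parts first so that 'week/day' wins over 'week'.
urls_parts = sorted(urls_parts, key=len, reverse=True)
# \b requires a word boundary right after the part,
# so 'week' does not match inside 'weeklytop'.
rexes = [re.compile(r'{part}\b'.format(part=part)) for part in urls_parts]
urls = ['www.mysite.com/weeklytop/2', 'www.mysite.com/week/day/2', 'www.mysite.com/week/4']

for url in urls:
    for i, rex in enumerate(rexes):
        if rex.search(url):
            print(url)
            print(urls_parts[i])
            print('')
            break  # stop at the first (longest) matching part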
OUTPUT
www.mysite.com/weeklytop/2
weeklytop

www.mysite.com/week/day/2
week/day

www.mysite.com/week/4
week
Suggestion to sort by length came from @Roman
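
One caveat with the \b approach: it only checks the boundary after the part, so a URL like 'www.mysite.com/midweek/2' would still match 'week'. If the parts are always meant to be whole path segments (that is my assumption, the question does not say so explicitly), a stricter variant could look like the sketch below. The names build_matchers and match_part are made up for illustration, and re.escape is added so that parts containing regex metacharacters are matched literally. Written for Python 3.

```python
import re


def build_matchers(parts):
    """Return (part, compiled regex) pairs, longest part first."""
    # Longest first so that 'week/day' is tried before 'week'.
    ordered = sorted(parts, key=len, reverse=True)
    matchers = []
    for part in ordered:
        # (?<![^/]) : the part must start at the string start or right after '/'
        # (?![^/])  : the part must end at the string end or right before '/'
        pattern = r'(?<![^/])' + re.escape(part) + r'(?![^/])'
        matchers.append((part, re.compile(pattern)))
    return matchers


def match_part(url, matchers):
    """Return the first (longest) part found as whole path segments, else None."""
    for part, rex in matchers:
        if rex.search(url):
            return part
    return None


matchers = build_matchers(['week', 'weeklytop', 'week/day'])
for url in ['www.mysite.com/weeklytop/2',
            'www.mysite.com/week/day/2',
            'www.mysite.com/week/2-second',
            'www.mysite.com/midweek/2']:
    print(url, '->', match_part(url, matchers))
```

With this variant the last URL gives None, because 'week' is not a complete segment there. If partial segments such as 'week-end' should count as a hit, the \b version above is the better fit.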