counting number of each substring in array python

Question

I have an array of strings, for example ['a_text', 'b_text', 'ab_text', 'a_text']. I would like to count how many items start with each prefix in ['a_', 'b_', 'ab_'], so the count for 'a_' would be 2.

So far I've been counting each prefix by filtering the array, e.g. num_a = len(list(filter(lambda x: x.startswith('a_'), array))). Since I filter the whole array once per prefix, I'm not sure whether this is slower than looping through the items once and incrementing a counter for each prefix. Are functions such as filter() faster than a for loop? With a for loop I don't need to build the filtered list at all, so that may make it faster.

Alternatively, could I use a list comprehension instead of filter() to make it faster?

user3483203 · Accepted Answer

You can use collections.Counter with a regular expression (if all of your strings have prefixes):

import re
from collections import Counter

arr = ['a_text', 'b_text', 'ab_text', 'a_text']
# Match everything up to and including the first underscore
Counter([re.match(r'^.*?_', i).group() for i in arr])

Output:

Counter({'a_': 2, 'b_': 1, 'ab_': 1})

If not all of your strings have prefixes, this will raise an AttributeError, since re.match returns None when nothing matches and None has no group() method. If that is a possibility, add a filtering step:

arr = ['a_text', 'b_text', 'ab_text', 'a_text', 'test']
matches = [re.match(r'^.*?_', i) for i in arr]
Counter([i.group() for i in matches if i])

Output:

Counter({'a_': 2, 'b_': 1, 'ab_': 1})
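On the performance part of the question, here is a minimal sketch of the single-pass alternative, assuming the prefixes are known in advance. The function name count_prefixes is hypothetical and not from the original answer. It walks the array once and increments a counter per prefix, so no intermediate filtered list is built; whether it is actually faster than the filter() approach on your data is something to measure (for example with timeit) rather than assume.

```python
from collections import Counter

def count_prefixes(items, prefixes):
    """Count how many items start with each prefix (hypothetical helper).

    Makes one pass over items and checks every known prefix per item.
    """
    # Start every prefix at 0 so prefixes with no matches still appear.
    counts = Counter({p: 0 for p in prefixes})
    for item in items:
        for p in prefixes:
            if item.startswith(p):
                counts[p] += 1
    return counts

arr = ['a_text', 'b_text', 'ab_text', 'a_text', 'test']
counts = count_prefixes(arr, ['a_', 'b_', 'ab_'])
print(counts['a_'])  # 2
```

Unlike the regex version, this counts only the prefixes you pass in, and strings such as 'test' that match none of them are simply skipped.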

Tags: performance, python, list-comprehension

mysticalstick
