Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why count() method is faster than a for loop python

Tags:

python

Here are 2 functions that do exactly the same thing, but does anyone know why the one using the count() method is much faster than the other? (I mean how does it work? How is it built?)

If possible, I'd like a more understandable answer than what's found here : Algorithm used to implement the Python str.count function or what's in the source code : https://hg.python.org/cpython/file/tip/Objects/stringlib/fastsearch.h

def scoring1(seq):
    score = 0
    for i in range(len(seq)):
       if seq[i] == '0':
           score += 1      
    return score

def scoring2(seq):
    score = 0
    score = seq.count('0') 
    return score

seq = 'AATTGGCCGGGGAG0CTTC0CTCC000TTTCCCCGGAAA'
# takes 1min15 when applied to 100 sequences larger than 100 000 characters
score1  = scoring1(seq)
# takes 10 sec when applied to 100 sequences larger than 100 000 characters
score2  = scoring2(seq)

Thanks a lot for your reply

like image 457
Natha Avatar asked Nov 16 '25 05:11

Natha


1 Answers

@CodeMonkey has already given the answer, but it is potentially interesting to note that your first function can be improved so that it runs about 20% faster:

import time, random

def scoring1(seq):
    score=0
    for i in range(len(seq)):
       if seq[i]=='0':
           score+=1      
    return score

def scoring2(seq):
    score=0
    for x in seq:
       score += (x =='0')    
    return score

def scoring3(seq):
    score = 0
    score = seq.count('0') 
    return score

def test(n):
    seq = ''.join(random.choice(['0','1']) for i in range(n))
    functions = [scoring1,scoring2,scoring3]
    for i,f in enumerate(functions):
        start = time.clock()
        s = f(seq)
        elapsed = time.clock() - start
        print('scoring' + str(i+1) + ': ' + str(s) + ' computed in ' + str(elapsed) + ' seconds')

test(10**7)       

Typical output:

scoring1: 5000742 computed in 0.9651326495293333 seconds
scoring2: 5000742 computed in 0.7998054195159483 seconds
scoring3: 5000742 computed in 0.03732172598339578 seconds

Both of the first two approaches are blown away by the built-in count().

Moral of the story: when you are not using an already optimized built-in method, you need to optimize your own code.

like image 185
John Coleman Avatar answered Nov 17 '25 18:11

John Coleman



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!