I'm trying to convert a huge number of records (a time series) to integer timestamps like this:
seconds_time = int(time.mktime(time.strptime(parts[0], '%Y%m%d %H%M%S')))
Unfortunately, this line is the code's bottleneck (it increases the runtime by a factor of about 20). Any suggestions to improve it?
Thanks in advance
Actually there's a way to drastically reduce parsing time.
import time
start = time.time()
nb_loops = 1000000
time_string = "20170101 201456"
for i in range(nb_loops):
    seconds_time = int(time.mktime(time.strptime(time_string, '%Y%m%d %H%M%S')))
print(time.time()-start)
That first loop runs in about 12 seconds on my machine. Not very good, I admit.
But since your format is fixed-width, why not convert the fields with integer slicing in a list comprehension, append zeros for the missing struct_time fields (weekday, day of year, DST flag), and pass the resulting tuple to mktime?
start = time.time()
for i in range(nb_loops):
    seconds_time = time.mktime(tuple([int(time_string[s:e]) for s,e in ((0,4),(4,6),(6,8),(9,11),(11,13),(13,15))]+[0,0,0]))
print(time.time()-start)
That runs in 3 seconds (it skips parsing the '%Y%m%d %H%M%S' format string on every call, which apparently takes most of the time).
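As a sanity check, the sliced tuple should produce the same epoch value as the full strptime parse. One subtlety: strptime leaves tm_isdst at -1 ("unknown"), so passing -1 rather than 0 as the last element keeps mktime's DST handling identical to the slow path. A minimal sketch:

```python
import time

time_string = "20170101 201456"

# Reference: the slow strptime path
ref = int(time.mktime(time.strptime(time_string, '%Y%m%d %H%M%S')))

# Fast path: slice the fixed-width fields directly
fields = [int(time_string[s:e])
          for s, e in ((0, 4), (4, 6), (6, 8), (9, 11), (11, 13), (13, 15))]
# mktime ignores tm_wday and tm_yday, so 0 is fine there;
# tm_isdst = -1 matches strptime's default ("let mktime decide")
fast = int(time.mktime(tuple(fields) + (0, 0, -1)))

print(fast == ref)
```

With isdst left at 0, the two paths can differ by an hour in timezones where the timestamp falls inside daylight saving time.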
Using a compiled regular expression is slightly faster still:
import re
r = re.compile("(....)(..)(..) (..)(..)(..)")
start = time.time()
for i in range(nb_loops):
    seconds_time = time.mktime(tuple(map(int,r.match(time_string).groups()))+(0,0,0))
print(time.time()-start)
Results (seconds for 1,000,000 conversions):
basic 14.41410493850708
string slicing 3.1356000900268555
regex 2.8703999519348145
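To drop the fastest variant into the original loop, it can be wrapped in a small helper. `to_seconds` is a hypothetical name, and `parts[0]` stands in for one raw field from the question's records:

```python
import re
import time

# Compile once at module level; matching per call is cheap
_TS = re.compile(r"(\d{4})(\d{2})(\d{2}) (\d{2})(\d{2})(\d{2})")

def to_seconds(timestamp):
    """Convert 'YYYYMMDD HHMMSS' to an integer epoch timestamp."""
    # tm_isdst = -1 lets mktime resolve DST, matching strptime's default
    return int(time.mktime(tuple(map(int, _TS.match(timestamp).groups())) + (0, 0, -1)))

print(to_seconds("20170101 201456"))
```

In the question's code this would replace the whole strptime/mktime line, e.g. `seconds_time = to_seconds(parts[0])`.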