
Sorting a list doesn't produce the right result

Python question here:

I'm running a sort function to sort some data by date, and I'm getting incorrect output. I've prepared a short version of my code with some sample data to show the error (the full code is uninteresting and the real data is proprietary).

Here is the code:

import operator

mylist = [['CustomerID_12345', 'TransactionID_1001', '12/31/2012'],
['CustomerID_12345', 'TransactionID_1002', '3/12/2013'],
['CustomerID_12345', 'TransactionID_1003', '1/7/2013'],
['CustomerID_12345', 'TransactionID_1004', '12/31/2012']]


sorted_list = sorted(mylist, key=operator.itemgetter(2))


print type(mylist)
print len(mylist)

for i in mylist:
    print i

print ""        # just for a line break for convenience

for i in sorted_list:
    print i

and the output is:

<type 'list'>
4
['CustomerID_12345', 'TransactionID_1001', '12/31/2012']
['CustomerID_12345', 'TransactionID_1002', '3/12/2013']
['CustomerID_12345', 'TransactionID_1003', '1/7/2013']
['CustomerID_12345', 'TransactionID_1004', '12/31/2012']

['CustomerID_12345', 'TransactionID_1003', '1/7/2013']
['CustomerID_12345', 'TransactionID_1001', '12/31/2012']
['CustomerID_12345', 'TransactionID_1004', '12/31/2012']
['CustomerID_12345', 'TransactionID_1002', '3/12/2013']

The first block is the original data and the second is the sorted output. Since I tried to sort by date, it's easy to see the sort didn't work properly.

Can someone help explain the error and suggest how to correct it? Thanks in advance :)

Optimesh asked Nov 28 '25 16:11


2 Answers

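This is because Python treats them as strings, not as dates.

Strings are compared character by character: '1' is less than '2', which is less than '3', and '/' is less than every digit. So '1/7/2013' sorts before '12/31/2012' (its second character, '/', is less than '2'), and both sort before '3/12/2013'.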

Instead, compare them as dates using the datetime module.

Here is a sample:

from datetime import datetime
your_date = datetime.strptime('1/1/2013', "%m/%d/%Y")
my_date = datetime.strptime('12/3/2011', "%m/%d/%Y")

print(your_date > my_date)
# Output: True
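
To see why the plain string sort produced the order in the question, here is a minimal sketch that sorts the same date strings both ways. It assumes the dates are always in month/day/year form; the names `dates` and `parse` are only for illustration.

```python
from datetime import datetime

dates = ['12/31/2012', '3/12/2013', '1/7/2013', '12/31/2012']

def parse(text):
    # Assumes month/day/year, as in the question's sample data.
    return datetime.strptime(text, '%m/%d/%Y')

# Strings compare character by character: '/' sorts before every digit,
# so '1/7/2013' comes before '12/31/2012'.
as_strings = sorted(dates)

# Parsed dates compare chronologically.
as_dates = sorted(dates, key=parse)

print(as_strings)
print(as_dates)
```

The first list matches the order the question's code produced; the second is the chronological order.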
sshashank124 answered Dec 01 '25 04:12


Sort by date:

from datetime import datetime

mylist = [['CustomerID_12345', 'TransactionID_1001', '12/31/2012'],
        ['CustomerID_12345', 'TransactionID_1002', '3/12/2013'],
        ['CustomerID_12345', 'TransactionID_1003', '1/7/2013'],
        ['CustomerID_12345', 'TransactionID_1004', '12/31/2012']]


sorted_list = sorted(mylist, key=lambda x: datetime.strptime(x[2],'%m/%d/%Y'))
for item in sorted_list:
    print(item)

Or you can store the dates as datetime objects in the first place. If they are strings for a good reason, you can first add a datetime column:

for item in mylist:
    item.append(datetime.strptime(item[2], '%m/%d/%Y'))
sorted_list = sorted(mylist, key=lambda x: x[3])
for item in sorted_list:
    print(item[:3])  # show only the original three columns
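
The suggestion to store the dates as datetime in the first place might look like the sketch below. It assumes the rows arrive as lists of strings in the question's layout; `raw_rows`, `rows`, `load_row`, and `DATE_FORMAT` are hypothetical names, not part of the original code.

```python
from datetime import datetime

DATE_FORMAT = '%m/%d/%Y'  # assumed month/day/year, as in the sample data

raw_rows = [['CustomerID_12345', 'TransactionID_1001', '12/31/2012'],
            ['CustomerID_12345', 'TransactionID_1002', '3/12/2013'],
            ['CustomerID_12345', 'TransactionID_1003', '1/7/2013'],
            ['CustomerID_12345', 'TransactionID_1004', '12/31/2012']]

def load_row(row):
    # Return a copy of the row with the date string replaced by a datetime.
    customer, transaction, date_text = row
    return [customer, transaction, datetime.strptime(date_text, DATE_FORMAT)]

rows = [load_row(row) for row in raw_rows]

# The third field is now a datetime, so it sorts chronologically.
rows.sort(key=lambda row: row[2])

for customer, transaction, when in rows:
    # Format the date back to text only when displaying it.
    print('%s %s %s' % (customer, transaction, when.strftime(DATE_FORMAT)))
```

Because Python's sort is stable, the two 12/31/2012 transactions keep their original relative order.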
perreal answered Dec 01 '25 05:12