
pandas sort dataframe by column that includes numbers and letters

Tags: python, pandas

I need to sort a dataframe by one column, which includes a combination of numbers and letters.

import pandas as pd

# Build an actual DataFrame so that sort_values is available
df = pd.DataFrame([{"user": "seth", "name": "1"},
                   {"user": "chris", "name": "10A"},
                   {"user": "aaron", "name": "4B"},
                   {"user": "dan", "name": "10B"}])

My code:

df1 = df.sort_values(by=['name'])

This gets me:

df1 = [{"user": "seth",
       "name": "1"},
     {"user" : "chris",
       "name": "10A"},
     {"user" : "dan",
       "name": "10B"},
     {"user" : "aaron",
       "name": "4B"}]

I want:

df1 =    [{"user": "seth",
           "name": "1"},
         {"user" : "aaron",
           "name": "4B"},
         {"user" : "chris",
           "name": "10A"},
         {"user" : "dan",
           "name": "10B"}]

A different question was flagged as similar to mine, so I tried the code from it:

   df.reindex(index=natsorted(df.name))

It returns a sorted dataframe, but all values have been replaced by NaNs. I also tried:

  df.iloc(natsorted(df.name))

It raises an error:

TypeError: unhashable type: 'list'
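
For context on the two attempts above, here is a minimal sketch that reproduces the behaviour with plain pandas. It assumes the default 0..3 integer index and hard-codes the naturally sorted names and positions instead of calling natsort. `reindex` treats the values it receives as index labels, and none of the name strings are labels in this index, so every row comes back as NaN. As far as I can tell, calling `iloc` with parentheses makes pandas interpret the argument as an axis, which is why a list triggers the "unhashable" error; positional selection uses square brackets.

```python
import pandas as pd

df = pd.DataFrame({"user": ["seth", "chris", "aaron", "dan"],
                   "name": ["1", "10A", "4B", "10B"]})

# reindex looks these strings up as index labels. The index here is 0..3,
# so no label matches and every cell in the result is NaN.
all_nan = df.reindex(index=["1", "4B", "10A", "10B"])
print(all_nan.isna().all().all())

# iloc selects by integer position and is indexed with square brackets.
# Positions are hard-coded here: 0 -> "1", 2 -> "4B", 1 -> "10A", 3 -> "10B".
reordered = df.iloc[[0, 2, 1, 3]]
print(reordered)
```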
FallingInForward asked Jan 17 '26 08:01



1 Answer

To slightly correct Quang's comment, this works fine:

import natsort

# index_humansorted returns the positions that put df1.name in natural order
df1.iloc[natsort.index_humansorted(df1.name)]
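
If natsort is not available, the same idea (compute a list of positions, then pass it to `iloc`) can be sketched with the standard library. This is only an illustration, not what natsort does internally: `natural_key` is a hypothetical helper that splits each name into digit and non-digit runs so that digit runs compare as integers, and it ignores locale handling.

```python
import re

import pandas as pd

df = pd.DataFrame({"user": ["seth", "chris", "aaron", "dan"],
                   "name": ["1", "10A", "4B", "10B"]})


def natural_key(text):
    # Hypothetical helper: "10A" -> [(0, 10), (1, "a")].
    # The leading 0/1 tag keeps integers and strings from being compared directly.
    parts = [p for p in re.split(r"(\d+)", text) if p]
    return [(0, int(p)) if p.isdecimal() else (1, p.lower()) for p in parts]


# Row positions ordered by the natural key of the "name" column
order = sorted(range(len(df)), key=lambda i: natural_key(df["name"].iloc[i]))

df_sorted = df.iloc[order]
print(df_sorted)
```

The original index labels are kept on the rows; calling `reset_index(drop=True)` on the result renumbers them if that is preferred.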
Igor Rivin answered Jan 19 '26 21:01



