I have a data frame with 250 names and their values, imported into Python via pandas read_csv. The data reads in like this:
| name | val1 | val2 | val3 |
|---|---|---|---|
| George | 2.5 | 1.1 | 1.0 |
| George | 3.1 | 1.4 | 0.0 |
| George | 1.1 | 0.9 | 4.1 |
| Tom | 2.1 | 1.2 | -3.0 |
| Tom | 3.0 | -1.2 | 3.5 |
| Tom | 7.3 | 5.2 | -1.2 |
| Tom | 0.1 | 0.1 | 0.1 |
| ... | ... | ... | ... |
| Sally | 6.1 | 9.1 | -5.6 |
| Sally | 5.7 | 4.7 | 9.1 |
I want to reorder the rows by name, following a particular order:
neworder = ['Sally', ..., 'George', 'Tom']
| name | val1 | val2 | val3 |
|---|---|---|---|
| Sally | 6.1 | 9.1 | -5.6 |
| Sally | 5.7 | 4.7 | 9.1 |
| ... | ... | ... | ... |
| George | 2.5 | 1.1 | 1.0 |
| George | 3.1 | 1.4 | 0.0 |
| George | 1.1 | 0.9 | 4.1 |
| Tom | 2.1 | 1.2 | -3.0 |
| Tom | 3.0 | -1.2 | 3.5 |
| Tom | 7.3 | 5.2 | -1.2 |
| Tom | 0.1 | 0.1 | 0.1 |
In IDL I would do this with some for loops, but I suspect there's a sorting function in Python that my Google skills have not been able to find.
Create a lookup dictionary that maps each name to its sort position, either numbered by hand or generated from your list:
name_order = {'Sally':1, ... , 'George':12, 'Tom':13} # hand-numbered
neworder = ['Sally', ... , 'George', 'Tom']
name_order = {nm:ix for ix,nm in enumerate(neworder)} # generated
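Then pass it, wrapped in a lambda function, to the key parameter of sort_values (the key parameter needs pandas 1.1.0 or later, and the call returns a new sorted frame rather than sorting in place):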
df.sort_values(by='name', key=lambda nm: nm.map(name_order))
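To make that concrete, here is a minimal runnable sketch. The frame is a small stand-in built from a few rows of the question's table, and neworder is shortened to three names because the real list is elided in the question. The kind="mergesort" argument is my addition: the default sort algorithm is not guaranteed to be stable, so a stable sort is what keeps the rows for each name in their original relative order.

```python
import pandas as pd

# Small stand-in for the CSV data (a few rows from the question's table).
df = pd.DataFrame({
    "name": ["George", "George", "Tom", "Tom", "Sally", "Sally"],
    "val1": [2.5, 3.1, 2.1, 3.0, 6.1, 5.7],
})

# Shortened for illustration; the full 250-name order is not given in the question.
neworder = ["Sally", "George", "Tom"]

# Map each name to its position in the desired order.
name_order = {nm: ix for ix, nm in enumerate(neworder)}

# The key function receives the whole 'name' column as a Series and must
# return a Series of the same length; here, the numeric rank of each name.
# kind="mergesort" is a stable sort, so rows within one name keep their order.
sorted_df = df.sort_values(
    by="name",
    key=lambda nm: nm.map(name_order),
    kind="mergesort",
)

print(sorted_df)
```

The original row labels are carried along with the rows; chaining .reset_index(drop=True) renumbers them if that matters.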
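I'd need to think a bit about what happens if an unexpected name appears. With a plain dict, map turns a name that isn't a key into NaN; you might be able to cope with this by making name_order a collections.defaultdict, since map uses the default from a dict subclass that defines __missing__.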
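A rough sketch of the defaultdict idea follows. The name 'Zed' and the choice to rank unknown names after every known one are assumptions made up for illustration; nothing in the question says how unexpected names should be handled.

```python
from collections import defaultdict

import pandas as pd

# 'Zed' is a hypothetical name that does not appear in neworder.
df = pd.DataFrame({
    "name": ["Tom", "Zed", "Sally", "George"],
    "val1": [2.1, 9.9, 6.1, 2.5],
})

neworder = ["Sally", "George", "Tom"]

# Known names get their position in neworder; any other name falls back to
# len(neworder), which ranks it after all known names (an assumed policy).
name_order = defaultdict(
    lambda: len(neworder),
    {nm: ix for ix, nm in enumerate(neworder)},
)

sorted_df = df.sort_values(
    by="name",
    key=lambda nm: nm.map(name_order),
    kind="mergesort",  # stable, so ties keep their original order
)

print(sorted_df)
```

One side effect worth knowing: looking up a missing key in a defaultdict stores it, so name_order will contain 'Zed' afterwards.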