I have the following dataframe:
df = pd.DataFrame({'Params': {0: (400, 30),
1: (2000, 10),
2: (1200, 10),
3: (2000, 30),
4: (1600, None)},
'mean_test_score': {0: -0.6197478578718253,
1: -0.6164605619489576,
2: -0.6229674626212879,
3: -0.7963084775995496,
4: -0.7854265341671137}})
I wish to sort it according to the first element of the tuples in the first column.
First column of the desired output:
{'Params': {0: (400, 30),
2: (1200, 10),
4: (1600, None),
1: (2000, 10),
3: (2000, 30)}}
I tried df.sort_values(by=('Params'), key=lambda x:x[0]), as I would with a list and .sort, but I get the following error:
ValueError: User-provided key function must not change the shape of the array.
I have looked at the documentation of sort_values(), but it did not explain why the lambda does not work.
EDIT: Following @DeepSpace's suggestion, I tried sorting on the column directly, but that does not work either:
df.sort_values(by='Params') gives TypeError: '<' not supported between instances of 'NoneType' and 'int'
The documentation of sort_values() says:
key should expect a Series and return a Series with the same shape as the input.
In df.sort_values(by=('Params'), key=lambda x:x[0]), x is the whole Params column (a Series), not a single tuple. x[0] therefore returns the first element of that Series, the tuple (400, 30), which does not have the same shape as the input Series. That is what raises the error.
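To make this concrete, here is a minimal sketch that rebuilds the dataframe from the question and shows what x[0] evaluates to when x is the column. The variable names first_item and error_message are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Params': {0: (400, 30),
                              1: (2000, 10),
                              2: (1200, 10),
                              3: (2000, 30),
                              4: (1600, None)},
                   'mean_test_score': {0: -0.6197478578718253,
                                       1: -0.6164605619489576,
                                       2: -0.6229674626212879,
                                       3: -0.7963084775995496,
                                       4: -0.7854265341671137}})

# On a Series, [0] is a lookup by index label: it returns the one tuple
# stored under label 0, not the first element of every tuple.
first_item = df['Params'][0]
print(first_item)

# The key therefore returns a 2-element tuple for a 5-row column,
# and pandas rejects it because the length no longer matches.
try:
    df.sort_values(by='Params', key=lambda x: x[0])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```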
If you want to sort by the first element of each tuple, make the key return a Series of those first elements:
df.sort_values(by='Params', key=lambda col: col.map(lambda x: x[0]))
# or
df.sort_values(by='Params', key=lambda col: col.str[0])
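As a rough end-to-end check, the sketch below applies the map-based key to the dataframe from the question. Passing kind='stable' is my own addition, not part of the answer above: it is assumed here only so that the two rows starting with 2000 keep their original relative order. The names first_elements and sorted_df are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Params': {0: (400, 30),
                              1: (2000, 10),
                              2: (1200, 10),
                              3: (2000, 30),
                              4: (1600, None)},
                   'mean_test_score': {0: -0.6197478578718253,
                                       1: -0.6164605619489576,
                                       2: -0.6229674626212879,
                                       3: -0.7963084775995496,
                                       4: -0.7854265341671137}})

def first_elements(col):
    # Receives the whole column and returns a Series of the same length
    # holding only the first element of each tuple.
    return col.map(lambda t: t[0])

# Only the first elements are compared, so the None in (1600, None)
# is never involved in a comparison.
sorted_df = df.sort_values(by='Params', key=first_elements, kind='stable')
print(sorted_df)
```

The original index labels are kept, so the result can be compared directly against the desired output in the question.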