I have a dataframe df like this:
| region | model | metrics |
|---|---|---|
| Tokyo | ARIMA | 0.1 |
| Tokyo | FeedForward | 0.2 |
| Tokyo | DeepAR | 0.3 |
| Osaka | ARIMA | 0.5 |
| Osaka | FeedForward | 0.2 |
| Osaka | DeepAR | 0.1 |
I want to group this by region and, for each group, return the minimum value of metrics together with the model that has that minimum.
The expected result:
| region | model | metrics |
|---|---|---|
| Tokyo | ARIMA | 0.1 |
| Osaka | DeepAR | 0.1 |
I tried to do it like below, but I'm not sure how I can complete it:
df.groupby("region").agg({'metrics':'min', ####... })
Maybe use argmin? Any help will be appreciated. Thanks!
You can find the index of the minimum metrics value in each group and then use loc with those indices to select the matching rows from the original frame:
# not sorting to keep the original order of appearance of regions
min_inds = df.groupby("region", sort=False).metrics.idxmin()
result = df.loc[min_inds]
to get
>>> result
  region   model  metrics
0  Tokyo   ARIMA      0.1
5  Osaka  DeepAR      0.1
(may the forecasting be fun!)
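For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the idxmin approach. It rebuilds the example frame from the question; the `reset_index(drop=True)` call at the end is my own optional addition, assuming you don't need the original row labels (0 and 5).

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "region": ["Tokyo", "Tokyo", "Tokyo", "Osaka", "Osaka", "Osaka"],
    "model": ["ARIMA", "FeedForward", "DeepAR", "ARIMA", "FeedForward", "DeepAR"],
    "metrics": [0.1, 0.2, 0.3, 0.5, 0.2, 0.1],
})

# Row label of the smallest metrics value within each region;
# sort=False keeps regions in their order of first appearance
min_inds = df.groupby("region", sort=False)["metrics"].idxmin()

# Select those rows from the original frame
# (optional: drop the original labels for a clean 0..n-1 index)
result = df.loc[min_inds].reset_index(drop=True)
print(result)
```

If several models tie for the minimum within a region, idxmin returns the first occurrence, so only one row per region comes back.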
How about sorting by metrics and then dropping duplicate regions, keeping only the first (smallest) row for each, like this:
df.sort_values("metrics").drop_duplicates(['region'], keep='first')
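A runnable sketch of the sort-and-deduplicate approach on the same example data might look like the following. Two things here are my assumptions rather than part of the one-liner above: `kind="stable"`, which makes ties resolve in original row order, and the trailing `sort_index()`, which puts the surviving rows back in their original order.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "region": ["Tokyo", "Tokyo", "Tokyo", "Osaka", "Osaka", "Osaka"],
    "model": ["ARIMA", "FeedForward", "DeepAR", "ARIMA", "FeedForward", "DeepAR"],
    "metrics": [0.1, 0.2, 0.3, 0.5, 0.2, 0.1],
})

best = (
    df.sort_values("metrics", kind="stable")   # smallest metrics first; stable for ties
      .drop_duplicates(["region"], keep="first")  # keep the first row seen per region
      .sort_index()                             # restore original row order
)
print(best)
```

Compared with the idxmin approach, this sorts the whole frame, so it does a bit more work, but it extends naturally if you later want to break ties on a second column by passing a list to sort_values.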