I have a polars dataframe that contain some ID, actions, and values :
Example Dataframe:
data = {
    "ID" : [1, 1, 2,2,3,3],
    "Action" : ["A", "A", "B", "B", "A", "A"],
    "Where" : ["Office", "Home", "Home", "Office", "Home", "Home"],
    "Value" : [1, 2, 3, 4, 5, 6]
}
df = pl.DataFrame(data)
I want to select for each ID and action the biggest value, so i know where he rather do the action.
I'm taking the following approach :
(
    df
    .select(
        pl.col("ID"),
        pl.col("Action"),
        pl.col("Where"),
        TOP = pl.col("Value").max().over(["ID", "Action"]))
)
After that , i sorted the values and keep the unique values (The first one) to maintain the desired info, however the input its incorrect :
(
    df
    .select(
        pl.col("ID"),
        pl.col("Action"),
        pl.col("Where"),
        TOP = pl.col("Value").max().over(["ID", "Action"]))
    .sort(
        pl.col("*"), descending =True
    )
    .unique(
        subset = ["ID", "Action"],
        maintain_order = True,
        keep = "first"
    )
)
Current Output :
shape: (3, 4)
┌─────┬────────┬────────┬─────┐
│ ID  ┆ Action ┆ Where  ┆ TOP │
│ --- ┆ ---    ┆ ---    ┆ --- │
│ i64 ┆ str    ┆ str    ┆ i64 │
╞═════╪════════╪════════╪═════╡
│ 3   ┆ A      ┆ Home   ┆ 6   │
│ 2   ┆ B      ┆ Office ┆ 4   │
│ 1   ┆ A      ┆ Office ┆ 2   │
└─────┴────────┴────────┴─────┘
Expected Output:
shape: (3, 4)
┌─────┬────────┬────────┬─────┐
│ ID  ┆ Action ┆ Where  ┆ TOP │
│ --- ┆ ---    ┆ ---    ┆ --- │
│ i64 ┆ str    ┆ str    ┆ i64 │
╞═════╪════════╪════════╪═════╡
│ 3   ┆ A      ┆ Home   ┆ 6   │
│ 2   ┆ B      ┆ Office ┆ 4   │
│ 1   ┆ A      ┆ Home   ┆ 2   │
└─────┴────────┴────────┴─────┘
Also, i think this approach its not the optimal way
The over and unique could be combined into a group_by
.arg_max() can give you the index of the max.get() will extract the corresponding values at that index(df.group_by("ID", "Action")
   .agg(
      pl.all().get(pl.col("Value").arg_max())
   )
)
shape: (3, 4)
┌─────┬────────┬────────┬───────┐
│ ID  ┆ Action ┆ Where  ┆ Value │
│ --- ┆ ---    ┆ ---    ┆ ---   │
│ i64 ┆ str    ┆ str    ┆ i64   │
╞═════╪════════╪════════╪═══════╡
│ 1   ┆ A      ┆ Home   ┆ 2     │
│ 2   ┆ B      ┆ Office ┆ 4     │
│ 3   ┆ A      ┆ Home   ┆ 6     │
└─────┴────────┴────────┴───────┘
You can use the following builtin in function in polars to do this easily.
with_columns: this provides you with with the information based on col(value) over both ID and Action taken.
sort: this just sorts in descending order
unique : this allows you to drop all duplicates
import polars as pl
data = {
    "ID": [1, 1, 2, 2, 3, 3],
    "Action": ["A", "A", "B", "B", "A", "A"],
    "Where": ["Office", "Home", "Home", "Office", "Home", "Home"],
    "Value": [1, 2, 3, 4, 5, 6]
}
df = pl.DataFrame(data)
result = (
    df
    .with_columns(
        TOP=pl.col("Value").max().over(["ID", "Action"])
    )
    .sort(by="TOP", descending=True)
    .unique(subset=["ID", "Action"], maintain_order=True, keep="first")
)
print(result)
shape: (3, 5)
┌─────┬────────┬────────┬───────┬─────┐
│ ID  ┆ Action ┆ Where  ┆ Value ┆ TOP │
│ --- ┆ ---    ┆ ---    ┆ ---   ┆ --- │
│ i64 ┆ str    ┆ str    ┆ i64   ┆ i64 │
╞═════╪════════╪════════╪═══════╪═════╡
│ 3   ┆ A      ┆ Home   ┆ 5     ┆ 6   │
│ 2   ┆ B      ┆ Home   ┆ 3     ┆ 4   │
│ 1   ┆ A      ┆ Office ┆ 1     ┆ 2   │
└─────┴────────┴────────┴───────┴─────┘
Alternatively you can go with groupby and filter which eliminates the use of unique even tho they sound the same
result = (
    df.group_by(["ID", "Action"]).agg(
        [
            pl.col("Value").max().alias("TOP"),
            pl.col("Where")
            .filter(pl.col("Value") == pl.col("Value").max())
            .first()
            .alias("Where"),
        ]
    )
    .sort("TOP", descending=True)
)
another result:
shape: (3, 4)
┌─────┬────────┬─────┬────────┐
│ ID  ┆ Action ┆ TOP ┆ Where  │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 3   ┆ A      ┆ 6   ┆ Home   │
│ 2   ┆ B      ┆ 4   ┆ Office │
│ 1   ┆ A      ┆ 2   ┆ Home   │
└─────┴────────┴─────┴────────┘
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With