I have a simple DataFrame that looks like this:
import polars as pl
df = pl.DataFrame({
    'ref': ['a', 'b', 'c', 'd', 'e', 'f'],
    'idx': [4, 3, 1, 6, 2, 5],
})
How can I create a new column, ref[idx], whose values are taken from ref at the (1-based) positions given by the idx column? The expected result is:
out = pl.DataFrame({
    'ref': ['a', 'b', 'c', 'd', 'e', 'f'],
    'idx': [4, 3, 1, 6, 2, 5],
    'ref[idx]': ['d', 'c', 'a', 'f', 'b', 'e'],
})
shape: (6, 3)
┌─────┬─────┬──────────┐
│ ref ┆ idx ┆ ref[idx] │
│ --- ┆ --- ┆ ---      │
│ str ┆ i64 ┆ str      │
╞═════╪═════╪══════════╡
│ a   ┆ 4   ┆ d        │
│ b   ┆ 3   ┆ c        │
│ c   ┆ 1   ┆ a        │
│ d   ┆ 6   ┆ f        │
│ e   ┆ 2   ┆ b        │
│ f   ┆ 5   ┆ e        │
└─────┴─────┴──────────┘
Polars has .get() / .gather() expressions for extracting values by index. These indices are 0-based, so subtract 1 from the 1-based idx column:
df.with_columns(
    pl.col("ref").get(pl.col("idx") - 1).alias("ref[idx]")
)
shape: (6, 3)
┌─────┬─────┬──────────┐
│ ref ┆ idx ┆ ref[idx] │
│ --- ┆ --- ┆ ---      │
│ str ┆ i64 ┆ str      │
╞═════╪═════╪══════════╡
│ a   ┆ 4   ┆ d        │
│ b   ┆ 3   ┆ c        │
│ c   ┆ 1   ┆ a        │
│ d   ┆ 6   ┆ f        │
│ e   ┆ 2   ┆ b        │
│ f   ┆ 5   ┆ e        │
└─────┴─────┴──────────┘