I have the following DataFrame:
import polars as pl
pl.Config(fmt_table_cell_list_len=8, fmt_str_lengths=100)
data = {
'col': [[11, 21, 31, 41, 51], [12, 22, 32, 42, 52], [13, 23, 33, 43, 53]]
}
df = pl.DataFrame(data)
shape: (3, 1)
┌──────────────────────┐
│ col │
│ --- │
│ list[i64] │
╞══════════════════════╡
│ [11, 21, 31, 41, 51] │
│ [12, 22, 32, 42, 52] │
│ [13, 23, 33, 43, 53] │
└──────────────────────┘
Starting from the first element of each list, I want to divide every second element by one number, and then, starting from the second element, divide every second element by another number. For example, if the two numbers are 5 and 10 respectively, the first list is transformed like this:
[11/5, 21/10, 31/5, 41/10, 51/5]
resulting in
[2.2, 2.1, 6.2, 4.1, 10.2]
I want to do the same transformation for all the lists of the column. How can I do that using the polars API?
┌────────────────────────────┐
│ col │
│ --- │
│ list[f64] │
╞════════════════════════════╡
│ [2.2, 2.1, 6.2, 4.1, 10.2] │
│ [2.4, 2.2, 6.4, 4.2, 10.4] │
│ [2.6, 2.3, 6.6, 4.3, 10.6] │
└────────────────────────────┘
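For reference, the intended arithmetic can be written in plain Python (this is not the Polars solution being asked for, only a sketch of the expected result). The helper name `alternate_divide` and its parameters are made up for illustration.

```python
def alternate_divide(values, first=5, second=10):
    """Divide elements at even positions by `first` and at odd positions by `second`."""
    return [v / (first if i % 2 == 0 else second) for i, v in enumerate(values)]

rows = [[11, 21, 31, 41, 51], [12, 22, 32, 42, 52], [13, 23, 33, 43, 53]]
expected = [alternate_divide(row) for row in rows]
print(expected[0])  # [2.2, 2.1, 6.2, 4.1, 10.2]
```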
Use list.eval and when/then/otherwise
import polars as pl
data = {"col": [[11, 21, 31, 41, 51], [12, 22, 32, 42, 52], [13, 23, 33, 43, 53]]}
df = pl.DataFrame(data)
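def func(x):
    # Assumes cum_count() starts at 0, as the output below implies; if your
    # Polars version counts from 1, compare against 1 instead.
    return pl.when(x.cum_count() % 2 == 0).then(x / 5).otherwise(x / 10)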
df.select(result=pl.col("col").list.eval(func(pl.element())))
shape: (3, 1)
┌────────────────────┐
│ result │
│ --- │
│ list[f64] │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
If your lists all have the same length, you can create a list of alternating 5 and 10 and divide your column by it.
np.resize() to create a list of alternating values.
import numpy as np
weights = np.resize([5, 10], 5)
df.select(pl.col.col / pl.lit(list(weights)))
shape: (3, 1)
┌────────────────────┐
│ col │
│ --- │
│ list[f64] │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
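If you would rather not pull in NumPy just for this, the same alternating list can probably be built with the standard library. This is a small sketch; `list_len` is a name chosen here for illustration, and it still assumes every list in the column has that length.

```python
from itertools import cycle, islice

list_len = 5  # assumed common length of every list in the column

# Repeat [5, 10] endlessly and keep only the first `list_len` values.
weights = list(islice(cycle([5, 10]), list_len))
print(weights)  # [5, 10, 5, 10, 5]
```

Here `weights` is already a plain Python list of ints, so it can be wrapped in pl.lit() the same way as above.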
If the lists have varying lengths, you can use the fact that version 1.10.0 (the latest at the time of writing) introduced arithmetic operations between lists and scalars. This makes it easy to create a list of alternating values.
pl.int_ranges() to create lists of integers.
pl.Expr.list.len() to use the length of the existing lists.
# lists of [0,1,0,1,..]
df.select(
result = pl.int_ranges(pl.col.col.list.len()) % 2
)
# lists of [0,5,0,5,..]
df.select(
result = 5 * (pl.int_ranges(pl.col.col.list.len()) % 2)
)
# lists of [5,10,5,10,..]
df.select(
result = 5 + 5 * (pl.int_ranges(pl.col.col.list.len()) % 2)
)
df.select(
result = pl.col.col / (5 + 5 * (pl.int_ranges(pl.col.col.list.len()) % 2))
)
shape: (3, 1)
┌────────────────────┐
│ result │
│ --- │
│ list[f64] │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
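The divisor expression is plain arithmetic on the position index, so it can be checked without Polars. The sketch below also shows one way to generalise it to any pair of divisors; the names `divisor_at`, `a` and `b` are chosen here for illustration only.

```python
def divisor_at(i, a=5, b=10):
    # Even positions (i % 2 == 0) get a; odd positions get a + (b - a) = b.
    return a + (b - a) * (i % 2)

print([divisor_at(i) for i in range(5)])        # [5, 10, 5, 10, 5]
print([divisor_at(i, 3, 7) for i in range(4)])  # [3, 7, 3, 7]
```

With a = 5 and b = 10 this reduces to the 5 + 5 * (i % 2) pattern used above.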