Divide every nth element of a Polars list with a number

I have the following DataFrame:

import polars as pl

pl.Config(fmt_table_cell_list_len=8, fmt_str_lengths=100)

data = {
    'col': [[11, 21, 31, 41, 51], [12, 22, 32, 42, 52], [13, 23, 33, 43, 53]]
}
df = pl.DataFrame(data)
shape: (3, 1)
┌──────────────────────┐
│ col                  │
│ ---                  │
│ list[i64]            │
╞══════════════════════╡
│ [11, 21, 31, 41, 51] │
│ [12, 22, 32, 42, 52] │
│ [13, 23, 33, 43, 53] │
└──────────────────────┘

Starting from the first element of each list, I want to divide every second element by one number and, starting from the second element, divide every second element by another number. For example, if these two numbers are 5 and 10 respectively, the first list will be transformed like this:

[11/5, 21/10, 31/5, 41/10, 51/5]

resulting in

[2.2, 2.1, 6.2, 4.1, 10.2]

I want to apply the same transformation to every list in the column. How can I do that using the Polars API? The expected result is:

┌────────────────────────────┐
│ col                        │
│ ---                        │
│ list[f64]                  │
╞════════════════════════════╡
│ [2.2, 2.1, 6.2, 4.1, 10.2] │
│ [2.4, 2.2, 6.4, 4.2, 10.4] │
│ [2.6, 2.3, 6.6, 4.3, 10.6] │
└────────────────────────────┘
exch_cmmnt_memb asked Oct 28 '25 11:10

2 Answers

Use list.eval together with when/then/otherwise:

import polars as pl

data = {"col": [[11, 21, 31, 41, 51], [12, 22, 32, 42, 52], [13, 23, 33, 43, 53]]}
df = pl.DataFrame(data)


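def func(x):
    # cum_count() counts from 1 in current Polars, so odd counts are the
    # 1st, 3rd, 5th, ... elements (divide by 5); the rest are divided by 10.
    return pl.when(x.cum_count() % 2 == 1).then(x / 5).otherwise(x / 10)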


df.select(result=pl.col("col").list.eval(func(pl.element())))
shape: (3, 1)
┌────────────────────┐
│ result             │
│ ---                │
│ list[f64]          │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
ignoring_gravity answered Oct 30 '25 01:10


If all your lists have the same length, you can create a list of alternating 5s and 10s and divide your column by it.

  • np.resize() to create a list of alternating values.
import numpy as np

weights = np.resize([5, 10], 5)
df.select(pl.col.col / pl.lit(list(weights)))
shape: (3, 1)
┌────────────────────┐
│ col                │
│ ---                │
│ list[f64]          │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
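
To see what np.resize() contributes here, the small sketch below prints the cycled divisors for a few target lengths. It uses only NumPy; the variable names are illustrative, and .tolist() is used so the weights come out as plain Python integers.

```python
import numpy as np

divisors = [5, 10]

# np.resize repeats the input as often as needed to reach the requested
# length and cuts off the final repetition when the length is not a
# multiple of len(divisors).
for length in (2, 5, 6):
    print(length, np.resize(divisors, length).tolist())

# Weights for lists of length 5, as in the question.
weights = np.resize(divisors, 5).tolist()
```

The weights list has to be as long as the lists in the column, which is why this variant only fits columns whose lists all share one length.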

If the lists have varying lengths, you can rely on the fact that Polars 1.10.0 (the latest version at the time of writing) introduced arithmetic operations between lists and scalars. This makes it easy to build a list of alternating values for each row.

  • pl.int_ranges() to create lists of integers.
  • pl.Expr.list.len() to get the length of each existing list.
# lists of [0, 1, 0, 1, ...] (int_ranges starts at 0)
df.select(
    result = pl.int_ranges(pl.col.col.list.len()) % 2
)

# lists of [0, 5, 0, 5, ...]
df.select(
    result = 5 * (pl.int_ranges(pl.col.col.list.len()) % 2)
)

# lists of [5, 10, 5, 10, ...]
df.select(
    result = 5 + 5 * (pl.int_ranges(pl.col.col.list.len()) % 2)
)

# divide each list element-wise by its list of alternating divisors
df.select(
    result = pl.col.col / (5 + 5 * (pl.int_ranges(pl.col.col.list.len()) % 2))
)
shape: (3, 1)
┌────────────────────┐
│ result             │
│ ---                │
│ list[f64]          │
╞════════════════════╡
│ [2.2, 2.1, … 10.2] │
│ [2.4, 2.2, … 10.4] │
│ [2.6, 2.3, … 10.6] │
└────────────────────┘
Roman Pekar answered Oct 30 '25 01:10