polars intersection of list columns in dataframe

import polars as pl

df = pl.DataFrame({'a': [[1, 2, 3], [8, 9, 4]], 'b': [[2, 3, 4], [4, 5, 6]]})

So given the dataframe df

    a           b
[1, 2, 3]   [2, 3, 4]
[8, 9, 4]   [4, 5, 6]

I would like to get a column c that is the row-wise intersection of a and b:

    a           b          c
[1, 2, 3]   [2, 3, 4]    [2, 3]
[8, 9, 4]   [4, 5, 6]     [4]

I know I can use the apply function with Python's set intersection, but I want to do it using Polars expressions.
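
As a plain-Python reference for the expected result (a sketch only, not the Polars expression being asked for; the variable names are illustrative, and the sorting is an assumption added to make the output order deterministic):

```python
# Same data as the dataframe columns a and b.
a = [[1, 2, 3], [8, 9, 4]]
b = [[2, 3, 4], [4, 5, 6]]

# Row-wise set intersection; sorted() only fixes the output order.
c = [sorted(set(x) & set(y)) for x, y in zip(a, b)]
print(c)  # [[2, 3], [4]]
```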

Vikash Balasubramanian asked Oct 12 '25 11:10


1 Answer

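Polars has dedicated set_* methods in the list namespace (list.set_intersection, list.set_difference, list.set_symmetric_difference, list.set_union), so no apply is needed.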

# Widen the table repr so the list cells below are not truncated.
pl.Config(fmt_table_cell_list_len=10, fmt_str_lengths=80)

# Each set_* method combines the lists in "a" with the lists in "b" row by row.
df.with_columns(
    intersection=pl.col("a").list.set_intersection("b"),
    difference=pl.col("a").list.set_difference("b"),
    symmetric_difference=pl.col("a").list.set_symmetric_difference("b"),
    union=pl.col("a").list.set_union("b"),
)
shape: (2, 6)
┌───────────┬───────────┬──────────────┬────────────┬──────────────────────┬─────────────────┐
│ a         ┆ b         ┆ intersection ┆ difference ┆ symmetric_difference ┆ union           │
│ ---       ┆ ---       ┆ ---          ┆ ---        ┆ ---                  ┆ ---             │
│ list[i64] ┆ list[i64] ┆ list[i64]    ┆ list[i64]  ┆ list[i64]            ┆ list[i64]       │
╞═══════════╪═══════════╪══════════════╪════════════╪══════════════════════╪═════════════════╡
│ [1, 2, 3] ┆ [2, 3, 4] ┆ [2, 3]       ┆ [1]        ┆ [1, 4]               ┆ [1, 2, 3, 4]    │
│ [8, 9, 4] ┆ [4, 5, 6] ┆ [4]          ┆ [8, 9]     ┆ [8, 9, 5, 6]         ┆ [8, 9, 4, 5, 6] │
└───────────┴───────────┴──────────────┴────────────┴──────────────────────┴─────────────────┘
