Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Repeating a date in polars and exploding it

I have a polars dataframe with two date columns that represent a start and end date and then a value that I want to repeat for all dates in between those two dates so that I can join those on other tables.

Example input is

df = pl.from_repr("""
┌─────┬─────────────────────┬─────────────────────┬───────┐
│ id  ┆ start               ┆ end                 ┆ value │
│ --- ┆ ---                 ┆ ---                 ┆ ---   │
│ str ┆ datetime[ns]        ┆ datetime[ns]        ┆ i64   │
╞═════╪═════════════════════╪═════════════════════╪═══════╡
│ 123 ┆ 2022-01-01 00:00:00 ┆ 2022-01-04 00:00:00 ┆ 10    │
│ abc ┆ 2022-03-04 00:00:00 ┆ 2022-03-04 00:00:00 ┆ 3     │
│ 456 ┆ 2022-05-11 00:00:00 ┆ 2022-05-16 00:00:00 ┆ 4     │
└─────┴─────────────────────┴─────────────────────┴───────┘
""")

and expected output is

shape: (11, 3)
┌─────┬─────────────────────┬───────┐
│ id  ┆ date                ┆ value │
│ --- ┆ ---                 ┆ ---   │
│ str ┆ datetime[ns]        ┆ i64   │
╞═════╪═════════════════════╪═══════╡
│ 123 ┆ 2022-01-01 00:00:00 ┆ 10    │
│ 123 ┆ 2022-01-02 00:00:00 ┆ 10    │
│ 123 ┆ 2022-01-03 00:00:00 ┆ 10    │
│ 123 ┆ 2022-01-04 00:00:00 ┆ 10    │
│ abc ┆ 2022-03-04 00:00:00 ┆ 3     │
│ 456 ┆ 2022-05-11 00:00:00 ┆ 4     │
│ 456 ┆ 2022-05-12 00:00:00 ┆ 4     │
│ 456 ┆ 2022-05-13 00:00:00 ┆ 4     │
│ 456 ┆ 2022-05-14 00:00:00 ┆ 4     │
│ 456 ┆ 2022-05-15 00:00:00 ┆ 4     │
│ 456 ┆ 2022-05-16 00:00:00 ┆ 4     │
└─────┴─────────────────────┴───────┘
like image 409
slugmagug Avatar asked Dec 14 '25 22:12

slugmagug


2 Answers

I managed to implement it without using map_elements:

import polars as pl
from datetime import date

df = pl.DataFrame({
    "id": ["123", "abc", "456"],
    "start": [date(2022, 1, 1), date(2022, 3, 4), date(2022, 5, 11)],
    "end": [date(2022, 1, 4), date(2022, 3, 4), date(2022, 5, 16)],
    "value": [10, 3, 4],
})

df = (
    df.with_columns(
        pl.date_ranges("start", "end", interval="1d")
          .alias("date")
    )
    .explode("date")
    .select("id", "date", "value")
)
shape: (11, 3)
┌─────┬────────────┬───────┐
│ id  ┆ date       ┆ value │
│ --- ┆ ---        ┆ ---   │
│ str ┆ date       ┆ i64   │
╞═════╪════════════╪═══════╡
│ 123 ┆ 2022-01-01 ┆ 10    │
│ 123 ┆ 2022-01-02 ┆ 10    │
│ 123 ┆ 2022-01-03 ┆ 10    │
│ 123 ┆ 2022-01-04 ┆ 10    │
│ abc ┆ 2022-03-04 ┆ 3     │
│ 456 ┆ 2022-05-11 ┆ 4     │
│ 456 ┆ 2022-05-12 ┆ 4     │
│ 456 ┆ 2022-05-13 ┆ 4     │
│ 456 ┆ 2022-05-14 ┆ 4     │
│ 456 ┆ 2022-05-15 ┆ 4     │
│ 456 ┆ 2022-05-16 ┆ 4     │
└─────┴────────────┴───────┘
like image 136
aSel80 Avatar answered Dec 16 '25 13:12

aSel80


I struggled today with the same problem and I thought I share my solution.

As cbilot already mentions pl.date_range doesn't take expressions as low and high value. So I worked around by using apply.

Data:

import polars as pl
from datetime import date

df = pl.DataFrame(
    {
        "id": ["123", "abc", "456"],
        "start": [date(2022, 1, 1), date(2022, 3, 4), date(2022, 5, 11)],
        "end": [date(2022, 1, 4), date(2022, 3, 4), date(2022, 5, 16)],
        "value": [10, 3, 4],
    }
)

Solution:

(
    df.with_columns(
        pl.struct("start", "end")
          .map_elements(lambda x: pl.date_range(x["start"], x["end"], "1d", eager=True))
          .alias("date"))
      .explode("date")
      .select("id", "date", "value")
)
shape: (11, 3)
┌─────┬────────────┬───────┐
│ id  ┆ date       ┆ value │
│ --- ┆ ---        ┆ ---   │
│ str ┆ date       ┆ i64   │
╞═════╪════════════╪═══════╡
│ 123 ┆ 2022-01-01 ┆ 10    │
│ 123 ┆ 2022-01-02 ┆ 10    │
│ 123 ┆ 2022-01-03 ┆ 10    │
│ 123 ┆ 2022-01-04 ┆ 10    │
│ abc ┆ 2022-03-04 ┆ 3     │
│ 456 ┆ 2022-05-11 ┆ 4     │
│ 456 ┆ 2022-05-12 ┆ 4     │
│ 456 ┆ 2022-05-13 ┆ 4     │
│ 456 ┆ 2022-05-14 ┆ 4     │
│ 456 ┆ 2022-05-15 ┆ 4     │
│ 456 ┆ 2022-05-16 ┆ 4     │
└─────┴────────────┴───────┘
like image 23
alexp Avatar answered Dec 16 '25 13:12

alexp



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!