
How to do a horizontal forward_fill in Polars?

I am wondering if there's a way to forward fill across columns (horizontally) in Polars.

import polars as pl

df = pl.DataFrame(
    {
        "id": ["NY", "TK", "FD"],
        "eat2000": [1, 6, 3],
        "eat2001": [-2, None, 4],
        "eat2002": [None, None, None],
        "eat2003": [-9, 3, 8],
        "eat2004": [None, None, 8],
    }
)
df
shape: (3, 6)
┌─────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ id  ┆ eat2000 ┆ eat2001 ┆ eat2002 ┆ eat2003 ┆ eat2004 │
│ --- ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
│ str ┆ i64     ┆ i64     ┆ f64     ┆ i64     ┆ i64     │
╞═════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ NY  ┆ 1       ┆ -2      ┆ null    ┆ -9      ┆ null    │
│ TK  ┆ 6       ┆ null    ┆ null    ┆ 3       ┆ null    │
│ FD  ┆ 3       ┆ 4       ┆ null    ┆ 8       ┆ 8       │
└─────┴─────────┴─────────┴─────────┴─────────┴─────────┘

I would like to do the equivalent of .ffill(axis=1) in pandas.

pl.from_pandas(df.to_pandas().ffill(axis=1))
shape: (3, 6)
┌─────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ id  ┆ eat2000 ┆ eat2001 ┆ eat2002 ┆ eat2003 ┆ eat2004 │
│ --- ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
│ str ┆ i64     ┆ f64     ┆ f64     ┆ i64     ┆ f64     │
╞═════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ NY  ┆ 1       ┆ -2.0    ┆ -2.0    ┆ -9      ┆ -9.0    │
│ TK  ┆ 6       ┆ 6.0     ┆ 6.0     ┆ 3       ┆ 3.0     │
│ FD  ┆ 3       ┆ 4.0     ┆ 4.0     ┆ 8       ┆ 8.0     │
└─────┴─────────┴─────────┴─────────┴─────────┴─────────┘
asked Oct 22 '25 20:10 by codedancer


1 Answer

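You can use the coalesce expression to fill values horizontally. For each row, pl.coalesce returns the first non-null value among the columns it is given, and the result takes the name of the first column listed. Listing the columns from newest to oldest therefore fills each column with the nearest non-null value to its left. If you place the coalesce expressions in a single with_columns context, they will be run in parallel.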

(
    df
    .with_columns(pl.col("^eat.*$").cast(pl.Int64))
    .with_columns(
        pl.coalesce("eat2004", "eat2003", "eat2002", "eat2001", "eat2000"),
        pl.coalesce("eat2003", "eat2002", "eat2001", "eat2000"),
        pl.coalesce("eat2002", "eat2001", "eat2000"),
        pl.coalesce("eat2001", "eat2000"),
    )
)
shape: (3, 6)
┌─────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ id  ┆ eat2000 ┆ eat2001 ┆ eat2002 ┆ eat2003 ┆ eat2004 │
│ --- ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
│ str ┆ i64     ┆ i64     ┆ i64     ┆ i64     ┆ i64     │
╞═════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ NY  ┆ 1       ┆ -2      ┆ -2      ┆ -9      ┆ -9      │
│ TK  ┆ 6       ┆ 6       ┆ 6       ┆ 3       ┆ 3       │
│ FD  ┆ 3       ┆ 4       ┆ 4       ┆ 8       ┆ 8       │
└─────┴─────────┴─────────┴─────────┴─────────┴─────────┘
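
To make the reversed column order less mysterious, here is a minimal pure-Python sketch of what each coalesce call computes for a single row. The helper names coalesce and ffill_row are made up for illustration and are not part of Polars; the sketch only mirrors the semantics, not how Polars evaluates the expressions.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


def ffill_row(row):
    """Forward fill one row: position i looks at row[i], row[i-1], ..., row[0]."""
    return [coalesce(*reversed(row[: i + 1])) for i in range(len(row))]


# The eat2000..eat2004 values from the question, keyed by id.
rows = {
    "NY": [1, -2, None, -9, None],
    "TK": [6, None, None, 3, None],
    "FD": [3, 4, None, 8, 8],
}

filled = {key: ffill_row(values) for key, values in rows.items()}
for key, values in filled.items():
    print(key, values)
```

Each position only ever looks to its left, which is why the first column (eat2000) needs no coalesce expression of its own.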

A couple of notes:

I first cast the eatXXXX columns to the same type. (In the DataFrame constructor, eat2002 is of type Float64 because of the way Polars initializes an all-null column that is not supplied with an explicit datatype).

I've written out the list of coalesce Expressions for demonstration, but the list of expressions can be generated with a Python list comprehension.

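# Collect the eat columns from newest to oldest.
eat_cols = [col_nm for col_nm in reversed(df.columns)
            if col_nm.startswith('eat')]
(
    df
    .with_columns(pl.col("^eat.*$").cast(pl.Int64))
    .with_columns(
        # eat_cols[idx:] starts at the column being filled and continues
        # with every column to its left; the oldest column is left as-is.
        pl.coalesce(eat_cols[idx:])
        for idx in range(len(eat_cols) - 1)
    )
)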
