How to use Polars.filter() with null data [duplicate]

Question

I have a dataframe and need to keep all rows where received is not equal to "qty".

import polars as pl

df = pl.DataFrame({
    'doc_n': ['1111', '2222', '3333'],
    'received': ['qty', '6.0', None],
})

However, after applying the following filter

df = df.filter(pl.col('received') != 'qty')

only the row ['2222', '6.0'] remains. In particular, Polars filtered out the row with the null value as well.

How can I apply the filter while keeping null values? The expected outcome has 2 rows (['2222', '6.0'] and ['3333', None]).

Abdul Niyas P M · Accepted Answer

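To understand this behavior, see this related GitHub issue and the comment. In short, null values propagate through comparison operators: comparing null with anything yields null rather than true or false, and filter keeps only the rows where the predicate is true.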

>>> import polars as pl
>>> pl.Series([1, 2, None]) != pl.Series([3, 4, 5])
shape: (3,)
Series: '' [bool]
[
    true
    true
    null <-- HERE
]

To get the expected outcome, you can use ne_missing, which treats null as an ordinary value in the comparison instead of propagating it:

>>> df.filter(pl.col('received').ne_missing('qty'))
shape: (2, 2)
┌───────┬──────────┐
│ doc_n ┆ received │
│ ---   ┆ ---      │
│ str   ┆ str      │
╞═══════╪══════════╡
│ 2222  ┆ 6.0      │
│ 3333  ┆ null     │
└───────┴──────────┘

Tags: python, dataframe, python-polars

Masik
