import polars as pl
df = pl.DataFrame({
    "tags": ["a", "a", "a", "b", "c", "c", "c", "c", "d"]
})
This is how to compute the most frequent element of the column using the .mode expression:
df.select([
    pl.col("tags").mode().alias("mode"),
])
How can I also display the frequency/count of that mode?
Use the value_counts expression. It returns a Struct column whose first field is the unique value and whose second field is the count of that value.
df.select([
    pl.col("tags").value_counts()
])
shape: (4, 1)
┌───────────┐
│ tags      │
│ ---       │
│ struct[2] │
╞═══════════╡
│ {"c",4}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"a",3}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"b",1}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"d",1}   │
└───────────┘
Or, if you want that result as a DataFrame:
(
    df.select([
        pl.col("tags").value_counts()
    ]).to_series().struct.to_frame()
)
shape: (4, 2)
┌──────┬────────┐
│ tags ┆ counts │
│ ---  ┆ ---    │
│ str  ┆ u32    │
╞══════╪════════╡
│ c    ┆ 4      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ a    ┆ 3      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ d    ┆ 1      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ b    ┆ 1      │
└──────┴────────┘
Edit: this can be even simpler by calling value_counts directly on the Series:
df["tags"].value_counts()