
How to find the frequency of the most frequent value (mode) of a series in polars?

import polars as pl

df = pl.DataFrame({
    "tags": ["a", "a", "a", "b", "c", "c", "c", "c", "d"] 
})

This is how to compute the most frequent element of the column using the .mode expression:

df.select([
    pl.col("tags").mode().alias("mode"),
])

How can I also display the frequency (count) of that mode?


1 Answer

There is a value_counts expression. It returns a Struct column in which the first field is the unique value and the second field is the count of that value.

df.select([
    pl.col("tags").value_counts()
])
shape: (4, 1)
┌───────────┐
│ tags      │
│ ---       │
│ struct[2] │
╞═══════════╡
│ {"c",4}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"a",3}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"b",1}   │
├╌╌╌╌╌╌╌╌╌╌╌┤
│ {"d",1}   │
└───────────┘

Or if you want to have that result as a DataFrame:

(df.select([
    pl.col("tags").value_counts()
]).to_series().struct.to_frame())
shape: (4, 2)
┌──────┬────────┐
│ tags ┆ counts │
│ ---  ┆ ---    │
│ str  ┆ u32    │
╞══════╪════════╡
│ c    ┆ 4      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ a    ┆ 3      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ d    ┆ 1      │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ b    ┆ 1      │
└──────┴────────┘

Edit: this can be even simpler. Calling value_counts directly on the Series returns a DataFrame:

df["tags"].value_counts()
ritchie46 answered Oct 24 '25 20:10
