
Python Polars find the length of a string in a dataframe

I am trying to count the number of letters in each string of a Polars column. I could probably use an apply method and call len() on each Name, but is there a Polars-specific method?

import polars as pl

df = pl.DataFrame({
    "start_date": ["2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05"],
    "Name": ["John", "Joe", "James", "Jörg"]
})

In pandas I can use .str.len():

>>> df.to_pandas()["Name"].str.len()
0    4
1    3
2    5
3    4
Name: Name, dtype: int64

But that does not exist in Polars:

df.with_columns(pl.col("Name").str.len())
# AttributeError: 'ExprStringNameSpace' object has no attribute 'len'
John Smith asked Dec 13 '25 13:12



1 Answer

You can use either of these, depending on what you want to count:

  • .str.len_bytes(), which counts the number of bytes in the UTF-8 encoded string
  • .str.len_chars(), which counts the number of characters

df.with_columns(
    pl.col("Name").str.len_bytes().alias("bytes"),
    pl.col("Name").str.len_chars().alias("chars")
)
shape: (4, 4)
┌────────────┬───────┬───────┬───────┐
│ start_date ┆ Name  ┆ bytes ┆ chars │
│ ---        ┆ ---   ┆ ---   ┆ ---   │
│ str        ┆ str   ┆ u32   ┆ u32   │
╞════════════╪═══════╪═══════╪═══════╡
│ 2020-01-02 ┆ John  ┆ 4     ┆ 4     │
│ 2020-01-03 ┆ Joe   ┆ 3     ┆ 3     │
│ 2020-01-04 ┆ James ┆ 5     ┆ 5     │
│ 2020-01-05 ┆ Jörg  ┆ 5     ┆ 4     │
└────────────┴───────┴───────┴───────┘
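
The two columns differ only for "Jörg", because "ö" takes two bytes in UTF-8 but is a single character. As a rough illustration of that distinction (plain Python, not Polars; the variable names are just for this sketch), the same counts can be reproduced with the standard library:

```python
# Plain-Python sketch of bytes vs. characters; this is not Polars code.
# "\u00f6" is the precomposed "ö", written as an escape to avoid ambiguity.
names = ["John", "Joe", "James", "J\u00f6rg"]

# Characters: len() on a str counts Unicode code points.
char_counts = [len(name) for name in names]

# Bytes: encode to UTF-8 first, then count the resulting bytes.
byte_counts = [len(name.encode("utf-8")) for name in names]

for name, n_bytes, n_chars in zip(names, byte_counts, char_counts):
    print(f"{name}: bytes={n_bytes}, chars={n_chars}")
```

If you want what pandas' .str.len() returns, .str.len_chars() is the one to reach for; .str.len_bytes() is only equivalent when the strings are pure ASCII.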
glebcom answered Dec 16 '25 05:12
