How to select the longest string from a list of strings in polars?

Question

How do I select the longest string from a list of strings in polars?

Example and expected output:

import polars as pl

df = pl.DataFrame({
    "values": [
        ["the", "quickest", "brown", "fox"],
        ["jumps", "over", "the", "lazy", "dog"],
        []
    ]
})

┌──────────────────────────────┬────────────────┐
│ values                       ┆ longest_string │
│ ---                          ┆ ---            │
│ list[str]                    ┆ str            │
╞══════════════════════════════╪════════════════╡
│ ["the", "quickest", … "fox"] ┆ quickest       │
│ ["jumps", "over", … "dog"]   ┆ jumps          │
│ []                           ┆ null           │
└──────────────────────────────┴────────────────┘

My use case is to select the longest overlapping match.

Edit: elaborating on the longest overlapping match, this is the output for the example provided by polars:

┌────────────┬───────────┬─────────────────────────────────┐
│ values     ┆ matches   ┆ matches_overlapping             │
│ ---        ┆ ---       ┆ ---                             │
│ str        ┆ list[str] ┆ list[str]                       │
╞════════════╪═══════════╪═════════════════════════════════╡
│ discontent ┆ ["disco"] ┆ ["disco", "onte", "discontent"] │
└────────────┴───────────┴─────────────────────────────────┘

I want a way to select the longest match in matches_overlapping.

juanpa.arrivillaga · Accepted Answer

You can do something like:

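df.with_columns(
    # Per row, pick the element at the index computed below.
    pl.col('values').list.get(
        pl.col('values')
        # Character length of every string in the list.
        .list.eval(pl.element().str.len_chars())
        # Index of the largest length.
        .list.arg_max()
    )
    .alias('longest_string')
)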

This expression:

pl.col('values')
.list.eval(pl.element().str.len_chars())
.list.arg_max()

first uses .list.eval to apply str.len_chars to every string in each list, which gives a list of lengths per row. It then calls .list.arg_max on that list, which returns the index of the maximum element, in this case the index of the longest string.
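To make those two steps concrete, here is a rough plain-Python equivalent for a single list. It is only an illustration of the logic, not how polars implements it; `longest_index` is a name I made up for this sketch, and the tie-breaking (first longest wins) is a property of this sketch rather than something I am claiming about polars.

```python
from typing import Optional


def longest_index(strings: list[str]) -> Optional[int]:
    """Index of the longest string, or None when the list is empty."""
    # Step 1: one length per string (the role of .list.eval(... .str.len_chars())).
    lengths = [len(s) for s in strings]
    if not lengths:
        return None
    # Step 2: position of the largest length (the role of .list.arg_max()).
    # max() returns the first maximal index, so ties go to the earliest string.
    return max(range(len(lengths)), key=lengths.__getitem__)


print(longest_index(["the", "quickest", "brown", "fox"]))
print(longest_index(["jumps", "over", "the", "lazy", "dog"]))
print(longest_index([]))
```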

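That index is then passed to list.get, which retrieves the string at that position in each row. An empty list has no maximum, so that row should come out as null, as in the expected output.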

Tags:

python

python-polars

conjuncts
