
Polars: How to reorder columns in a specific order?

I cannot find how to reorder columns in a polars dataframe in the polars DataFrame docs.

rchitect-of-info asked Sep 06 '25 03:09


1 Answer

Using the select method is the recommended way to reorder columns in polars: list the columns in the order you want them.

Example:

Input:

df
┌─────┬───────┬─────┐
│Col1 ┆ Col2  ┆Col3 │
│ --- ┆ ---   ┆ --- │
│ str ┆ str   ┆ str │
╞═════╪═══════╪═════╡
│ a   ┆ x     ┆ p   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ b   ┆ y     ┆ q   │
└─────┴───────┴─────┘

Output:

df.select(['Col3', 'Col2', 'Col1'])
or
df.select([pl.col('Col3'), pl.col('Col2'), pl.col('Col1')])

┌─────┬───────┬─────┐
│Col3 ┆ Col2  ┆Col1 │
│ --- ┆ ---   ┆ --- │
│ str ┆ str   ┆ str │
╞═════╪═══════╪═════╡
│ p   ┆ x     ┆ a   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ q   ┆ y     ┆ b   │
└─────┴───────┴─────┘

Note: While df[['Col3', 'Col2', 'Col1']] gives the same result (as of version 0.14), the polars documentation recommends using the select method instead:

We strongly recommend selecting data with expressions for almost all use cases. Square bracket indexing is perhaps useful when doing exploratory data analysis in a terminal or notebook when you just want a quick look at a subset of data.

For all other use cases we recommend using expressions because:

  1. expressions can be parallelized
  2. the expression approach can be used in lazy and eager mode while the indexing approach can only be used in eager mode
  3. in lazy mode the query optimizer can optimize expressions
NFern answered Sep 07 '25 19:09