
Sort dataframe by column value (r)

I am new to R and currently trying to wrap my head around dataframes. I want to sort a dataframe by the values of one column and then return the top rows of the sorted result.

As of now I only seem to get one row back. I used the "iris" dataframe.

sort_head <- function(df, var.name, n){
  df1 <- df[rev(order(var.name)), ]
  sorted <- head(df1, n)
  return(sorted)
}

sort_head(df = iris, var.name = "Petal.Length", n = 10)
# My output
> sort_head(df = iris, var.name = "Petal.Length", n = 5)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa

My ordering seems to reduce the dataframe to a single row, whereas all the guides I've found (e.g. here) simply reorder the dataframe by the column. What am I missing?

asked Mar 25 '26 11:03 by OLGJ


2 Answers

Use arrange() and slice():

library(tidyverse)
iris %>%
  arrange(desc(Petal.Length)) %>%  # arrange in descending order
  slice(1:10)                      # return rows 1 through 10

This returns:

   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1           7.7         2.6          6.9         2.3 virginica
2           7.7         3.8          6.7         2.2 virginica
3           7.7         2.8          6.7         2.0 virginica
4           7.6         3.0          6.6         2.1 virginica
5           7.9         3.8          6.4         2.0 virginica
6           7.3         2.9          6.3         1.8 virginica
7           7.2         3.6          6.1         2.5 virginica
8           7.4         2.8          6.1         1.9 virginica
9           7.7         3.0          6.1         2.3 virginica
10          6.3         3.3          6.0         2.5 virginica

If you liked this approach, I highly recommend walking through the examples of Chapter 5 of r4ds which covers { dplyr }, part of { tidyverse }. This is bound to save you countless hours in the future when scratching your head about data.frame transformations. =)
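If the column name has to arrive as a string, as it does in the question's function, the same pipeline can be wrapped in a function. This is only a sketch, assuming dplyr 1.0.0 or later is installed. The name `sort_head` and its arguments are taken from the question, and `.data[[var.name]]` is dplyr's way of looking up a column from a character name:

```r
library(dplyr)

# Sketch: the question's sort_head(), rewritten with dplyr verbs.
sort_head <- function(df, var.name, n) {
  df %>%
    arrange(desc(.data[[var.name]])) %>%  # sort descending by the named column
    slice_head(n = n)                     # keep the first n rows
}

top10 <- sort_head(df = iris, var.name = "Petal.Length", n = 10)
top10
```

Note that arrange() does not keep the original row names, so the rows come back numbered from 1, as in the output above.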

answered Mar 27 '26 23:03 by Rich Pauloo


Since R 4.4.0, base R provides the sort_by() function, which sorts a dataframe without any extra packages. head() then selects the first 10 rows. Here is some reproducible code:

iris |>
  sort_by(~ list(-Petal.Length)) |>
  head(10)
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 119          7.7         2.6          6.9         2.3 virginica
#> 118          7.7         3.8          6.7         2.2 virginica
#> 123          7.7         2.8          6.7         2.0 virginica
#> 106          7.6         3.0          6.6         2.1 virginica
#> 132          7.9         3.8          6.4         2.0 virginica
#> 108          7.3         2.9          6.3         1.8 virginica
#> 110          7.2         3.6          6.1         2.5 virginica
#> 131          7.4         2.8          6.1         1.9 virginica
#> 136          7.7         3.0          6.1         2.3 virginica
#> 101          6.3         3.3          6.0         2.5 virginica

Created on 2024-05-02 with reprex v2.1.0
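On R versions before 4.4.0, the question's own function can be made to work with order(). The likely reason it returns a single row is that order(var.name) orders the one-element character vector "Petal.Length" rather than the column, so the index is always 1. A minimal base R sketch of a fix, keeping the question's `sort_head` name and arguments:

```r
# Sketch: corrected version of the question's function, base R only.
sort_head <- function(df, var.name, n) {
  # df[[var.name]] extracts the column values; order(var.name) would only
  # order the string itself and always return the index 1.
  idx <- order(df[[var.name]], decreasing = TRUE)
  head(df[idx, ], n)
}

top10 <- sort_head(df = iris, var.name = "Petal.Length", n = 10)
top10
```

Using decreasing = TRUE instead of rev(order(...)) avoids reversing the relative order of tied rows.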

answered Mar 28 '26 00:03 by Quinten


