library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.4.4
#> Warning: package 'forcats' was built under R version 3.4.4
example <- tibble(
  num1 = sample(1:100, 10),
  categ1 = as.factor(c(sample(letters, 10))),
  num2 = sample(1:100, 10),
  categ2 = as.factor(c(sample(letters, 10)))
)
head(example)
#> # A tibble: 6 x 4
#> num1 categ1 num2 categ2
#> <int> <fct> <int> <fct>
#> 1 4 c 5 l
#> 2 86 u 64 b
#> 3 38 z 18 r
#> 4 95 e 44 j
#> 5 77 w 35 u
#> 6 84 y 14 i
Created on 2018-06-19 by the reprex package (v0.2.0).
The above example shows a basic data frame with integer and factor columns. In a small example like this, it is easy to use select(example, categ1, categ2, num1, num2) from dplyr to manually pick the order in which the columns appear.
But suppose you have many columns with a mixture of data types, and you want all of the factors selected first, followed by everything else (or any other order based on data type).
Manually typing out each column name or using select() helpers like contains() quickly becomes tedious with a large number of columns. I prefer a tidyverse solution, but would also be interested in how this could be accomplished in base R.
Example data with columns of 3 classes
library(tidyverse)
example <- tibble(
  num1 = as.character(sample(1:100, 10)),
  categ1 = as.factor(c(sample(letters, 10))),
  num2 = sample(1:100, 10),
  categ2 = as.factor(c(sample(letters, 10)))
)
Let's say you want to arrange the columns in this order
my.order <- c('factor', 'integer', 'character')
i.e. factors, then integers, then characters
You can do
example %>%
  select(sapply(., class) %>% .[order(match(., my.order))] %>% names)
# # A tibble: 10 x 4
# categ1 categ2 num2 num1
# <fct> <fct> <int> <chr>
# 1 y e 94 46
# 2 t b 52 31
# 3 w c 32 57
# 4 k i 27 89
# 5 n d 76 14
# 6 x g 67 40
# 7 c v 16 20
# 8 e z 6 95
# 9 i t 70 13
# 10 g w 57 42
As a function (same output)
order_cols <- function(df, col.order) {
  df %>%
    select(sapply(., class) %>% .[order(match(., col.order))] %>% names)
}
example %>%
  order_cols(c('factor', 'integer', 'character'))
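For the base R part of the question, here is a minimal sketch of the same match()/order() idea without dplyr. It assumes every column's first class is enough to sort on; the names col_class and reordered are only illustrative. Using vapply() with class(x)[1] instead of sapply(., class) guards against columns that carry more than one class (ordered factors or date-times, for example), which would otherwise make sapply() return a list.

```r
# Base R sketch: reorder columns by class, no packages needed.
example <- data.frame(
  num1   = as.character(sample(1:100, 10)),
  categ1 = factor(sample(letters, 10)),
  num2   = sample(1:100, 10),
  categ2 = factor(sample(letters, 10)),
  stringsAsFactors = FALSE
)
my.order <- c("factor", "integer", "character")

# First class of each column, always returned as a named character vector
col_class <- vapply(example, function(x) class(x)[1], character(1))

# match() gives each column its rank in my.order; order() sorts by that rank.
# Ties keep their original relative order, and classes missing from my.order
# (NA from match) end up last.
reordered <- example[order(match(col_class, my.order))]
names(reordered)
```

Because the result is plain column indexing, the same line should also work on a tibble.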
Posting a tidyverse approach that I figured out.
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.4.4
#> Warning: package 'forcats' was built under R version 3.4.4
example <- tibble(
  num1 = sample(1:100, 10),
  categ1 = as.factor(c(sample(letters, 10))),
  num2 = sample(1:100, 10),
  categ2 = as.factor(c(sample(letters, 10)))
)
head(example)
#> # A tibble: 6 x 4
#> num1 categ1 num2 categ2
#> <int> <fct> <int> <fct>
#> 1 33 h 94 s
#> 2 78 x 6 k
#> 3 82 s 84 i
#> 4 11 k 20 o
#> 5 51 v 11 q
#> 6 5 w 51 b
# Use select_if() to pick columns by data type and pull their names into the outer select()
# intersect() is only needed if you previously dropped
# some columns and you do not want those factors (in this case) to creep back in
# with the select_if() call
example_arranged <- example %>%
  select(intersect(names(select_if(., is.factor)), names(.)), everything())
head(example_arranged)
#> # A tibble: 6 x 4
#> categ1 categ2 num1 num2
#> <fct> <fct> <int> <int>
#> 1 h s 33 94
#> 2 x k 78 6
#> 3 s i 82 84
#> 4 k o 11 20
#> 5 v q 51 11
#> 6 w b 5 51
Created on 2018-06-19 by the reprex package (v0.2.0).
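A note for anyone on a newer dplyr: since version 1.0.0, select_if() is superseded and a predicate can be passed straight to select() through where(). The sketch below assumes dplyr >= 1.0.0 is installed; example_relocated is just an illustrative name. It should give the same column order as the select_if() version above.

```r
library(dplyr)

example <- tibble(
  num1   = sample(1:100, 10),
  categ1 = factor(sample(letters, 10)),
  num2   = sample(1:100, 10),
  categ2 = factor(sample(letters, 10))
)

# where() keeps the columns for which the predicate returns TRUE;
# everything() then appends the remaining columns in their original order
example_arranged <- example %>%
  select(where(is.factor), everything())

# relocate() moves the matched columns to the front by default
example_relocated <- example %>%
  relocate(where(is.factor))

names(example_arranged)
```

relocate() is probably the closer fit when the goal is only to move columns, since it never drops anything.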