Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert semi-long data into wide data

Tags:

r

tidyr

reshape2

I'm very sure there should be a simple alternative but I'm not able to figure it out. Currently using a for loop which is not optimal. My dataframe is like this:

NAME <- c("ABC", "ABC", "ABC", "DEF", "GHI", "GHI", "JKL", "JKL", "JKL", "MNO")
YEAR <- c(2012, 2013, 2014, 2012, 2012, 2013, 2012, 2014, 2016, 2013)
MARKS <- c(45, 75, 95, 91, 75, 76, 85, 88, 89, 77)
MAXIMUM <- c(95, NA, NA, 91, 76, NA, 89, NA, NA, 77)

DF <- data.frame(
  NAME,
  YEAR,
  MARKS,
  MAXIMUM
)

> DF
   NAME YEAR MARKS MAXIMUM
1   ABC 2012    45      95
2   ABC 2013    75      NA
3   ABC 2014    95      NA
4   DEF 2012    91      91
5   GHI 2012    75      76
6   GHI 2013    76      NA
7   JKL 2012    85      89
8   JKL 2014    88      NA
9   JKL 2016    89      NA
10  MNO 2013    77      77

I want to have only one name per row and each year-wise details (YEAR, MARKS and MAXIMUM columns) should be spread as individual headers. I have tried to use tidyr::pivot_wider function but was not successful.

I have given the sample output here:

Required output enter image description here

like image 884
Frodo Avatar asked Sep 02 '25 03:09

Frodo


2 Answers

Perhaps you could enumerate by NAME first based on row_number(). Then, use pivot_wider:

library(tidyverse)

DF %>%
  group_by(NAME) %>%
  mutate(n = row_number()) %>%
  pivot_wider(NAME, names_from = n, values_from = c(YEAR, MARKS, MAXIMUM))

Output

  NAME  YEAR_1 YEAR_2 YEAR_3 MARKS_1 MARKS_2 MARKS_3 MAXIMUM_1 MAXIMUM_2 MAXIMUM_3
  <chr>  <dbl>  <dbl>  <dbl>   <dbl>   <dbl>   <dbl>     <dbl>     <dbl>     <dbl>
1 ABC     2012   2013   2014      45      75      95        95        NA        NA
2 DEF     2012     NA     NA      91      NA      NA        91        NA        NA
3 GHI     2012   2013     NA      75      76      NA        76        NA        NA
4 JKL     2012   2014   2016      85      88      89        89        NA        NA
5 MNO     2013     NA     NA      77      NA      NA        77        NA        NA

Or, as mentioned by @RobertoT, you could make YEAR a factor and then line up your YEAR values. Using complete you can fill in NA for missing YEAR. The final select will order your columns.

DF$YEAR_FAC = factor(DF$YEAR)

DF %>%
  group_by(NAME) %>%
  complete(YEAR_FAC, fill = list(YEAR = NA)) %>%
  mutate(n = row_number()) %>%
  pivot_wider(NAME, names_from = n, values_from = c(YEAR, MARKS, MAXIMUM)) %>%
  select(NAME, ends_with(as.character(1:nlevels(DF$YEAR_FAC))))

Output

  NAME  YEAR_1 MARKS_1 MAXIMUM_1 YEAR_2 MARKS_2 MAXIMUM_2 YEAR_3 MARKS_3 MAXIMUM_3 YEAR_4 MARKS_4 MAXIMUM_4
  <chr>  <dbl>   <dbl>     <dbl>  <dbl>   <dbl>     <dbl>  <dbl>   <dbl>     <dbl>  <dbl>   <dbl>     <dbl>
1 ABC     2012      45        95   2013      75        NA   2014      95        NA     NA      NA        NA
2 DEF     2012      91        91     NA      NA        NA     NA      NA        NA     NA      NA        NA
3 GHI     2012      75        76   2013      76        NA     NA      NA        NA     NA      NA        NA
4 JKL     2012      85        89     NA      NA        NA   2014      88        NA   2016      89        NA
5 MNO       NA      NA        NA   2013      77        77     NA      NA        NA     NA      NA        NA
like image 200
Ben Avatar answered Sep 05 '25 00:09

Ben


In addition to @Ben+1 solution we could a code that I recently learned to order the columns Combining two dataframes with alternating column position

DF %>%
  group_by(NAME) %>%
  mutate(n = row_number()) %>%
  pivot_wider(NAME, names_from = n, values_from = c(YEAR, MARKS, MAXIMUM)) %>% 
  select(-NAME) %>% 
  dplyr::select(all_of(c(matrix(names(.), ncol = 3, byrow = TRUE))))    
  NAME  YEAR_3 MARKS_3 MAXIMUM_3 YEAR_1 MARKS_1 MAXIMUM_1 YEAR_2 MARKS_2 MAXIMUM_2
  <chr>  <dbl>   <dbl>     <dbl>  <dbl>   <dbl>     <dbl>  <dbl>   <dbl>     <dbl>
1 ABC     2014      95        NA   2012      45        95   2013      75        NA
2 DEF       NA      NA        NA   2012      91        91     NA      NA        NA
3 GHI       NA      NA        NA   2012      75        76   2013      76        NA
4 JKL     2016      89        NA   2012      85        89   2014      88        NA
5 MNO       NA      NA        NA   2013      77        77     NA      NA        NA
like image 41
TarJae Avatar answered Sep 04 '25 23:09

TarJae