Say I have data on people who choose between several options. I have one row per person, and I want to have one row per person and choice option. So, if I have 10 people who have 3 choices, right now I have 10 rows, and I want to have 30.
All of the other variables should be copied to each of the new rows.  So, for example, if I have a variable for gender, that should be constant within ID.  (I am setting my data up this way to analyze with mnlogit.)
This seems like the situation that two tidyr functions, complete and fill, were designed for.  To use a simple example:
library(lubridate)
library(tidyr)
dat <- data.frame(
    id = 1:3,
    choice = 5:7,
    c = c(9, NA, 11),
    d = ymd(NA, "2015-09-30", "2015-09-29")
    )
dat %>% 
  complete(id, choice) %>%
  fill(everything())
# Source: local data frame [9 x 4]
# 
#      id choice     c          d
#   (int)  (int) (dbl)     (time)
# 1     1      5     9       <NA>
# 2     1      6     9       <NA>
# 3     1      7     9       <NA>
# 4     2      5     9       <NA>
# 5     2      6     9 2015-09-30
# 6     2      7     9 2015-09-30
# 7     3      5     9 2015-09-30
# 8     3      6     9 2015-09-30
# 9     3      7    11 2015-09-29
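But this has some problems: fill carries values down across ID boundaries. The value of c from ID 1 replaced the (correct) NA values for ID 2, and d is off as well: the first row for ID 2 is still NA, and ID 2's date spills into the first two rows for ID 3.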
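To see where the carried-forward values come from, it may help to look at what complete() returns before fill() runs. The sketch below rebuilds the example data (using base as.Date() instead of lubridate::ymd(), purely to keep the snippet to one package; that swap is my assumption, not part of the original code) and counts the non-missing values after complete() alone.

```r
library(tidyr)

# Same toy data as above; as.Date() stands in for lubridate::ymd()
dat <- data.frame(
  id = 1:3,
  choice = 5:7,
  c = c(9, NA, 11),
  d = as.Date(c(NA, "2015-09-30", "2015-09-29"))
)

# complete() on id and choice creates all 3 x 3 combinations.
# Only the three original (id, choice) pairs keep their c and d values;
# the six newly created rows are NA in both columns.
completed <- complete(dat, id, choice)

nrow(completed)             # number of rows after expansion
sum(!is.na(completed$c))    # non-missing c values left
sum(!is.na(completed$d))    # non-missing d values left
```

Because the new rows are NA everywhere except id and choice, a plain fill() has no way to tell a "real" NA from a newly created one, and it fills downward regardless of which id a row belongs to.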
I could try a workaround, like replacing all of the missing values with 999, running complete and fill, and then replacing 999 with NA.  (I think I would have to convert the date variables to character variables and then convert them back again if I go this route.)  But maybe someone on here knows of a tidy way to do this with tidyr?
Edit: the desired output here is:
# Source: local data frame [9 x 4]
# 
#     id     c          d choice
#  (int) (dbl)     (time)  (int)
# 1     1     9       <NA>      5
# 2     1     9       <NA>      6
# 3     1     9       <NA>      7
# 4     2    NA 2015-09-30      5
# 5     2    NA 2015-09-30      6
# 6     2    NA 2015-09-30      7
# 7     3    11 2015-09-29      5
# 8     3    11 2015-09-29      6
# 9     3    11 2015-09-29      7
As an update to @jeremycg's answer: from tidyr 0.5.1 (or maybe even version 0.4.0) onwards, c() no longer works. Use nesting() instead:
dat %>% 
 complete(nesting(id, c, d), choice) 
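For anyone who wants to run this end to end, here is a minimal self-contained sketch of the nesting() call on the example data from the question. It assumes a reasonably recent tidyr, and it uses base as.Date() in place of lubridate::ymd() (my substitution, to avoid a second dependency). The result name `out` is just an illustrative choice.

```r
library(tidyr)

# Example data from the question; as.Date() stands in for lubridate::ymd()
dat <- data.frame(
  id = 1:3,
  choice = 5:7,
  c = c(9, NA, 11),
  d = as.Date(c(NA, "2015-09-30", "2015-09-29"))
)

# nesting(id, c, d) keeps only the (id, c, d) combinations that occur in
# the data, so c and d stay attached to their id; each of those
# combinations is then crossed with every value of choice.
out <- complete(dat, nesting(id, c, d), choice)

# Sort by id and choice for display; the sort is only for readability
out <- out[order(out$id, out$choice), ]
out
```

Since c and d travel with id inside nesting(), no fill() step is needed, and the missing c for ID 2 and the missing d for ID 1 stay missing.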
Note: I tried to edit @jeremycg's answer instead, since it was correct at the time it was written (so a new answer is not really necessary), but unfortunately the edit was rejected.