
Adding a new year variable to a data frame (with all other variables being duplicated)

Tags:

r

I have a data frame containing shapefile data that I want to merge with another data set that contains years. I want to add a year variable to the former, with all other variables repeated unchanged for each year. I'm not sure how to do this.

As an example, say I have the following data set:

a <- data.frame(code = c("aaa", "bbb", "ccc"),
                item = c("apples", "bananas", "carrots"),
                id = c(1, 2, 3))

giving the following:

  code    item id
1  aaa  apples  1
2  bbb bananas  2
3  ccc carrots  3

I would like to add a new variable called year, covering n years, so that every existing row is repeated once per year with its other variables unchanged. For example, adding the years 1990 to 1992 to the object above should give this:

  code    item id year
1  aaa  apples  1 1990
2  aaa  apples  1 1991
3  aaa  apples  1 1992
4  bbb bananas  2 1990
5  bbb bananas  2 1991
6  bbb bananas  2 1992
7  ccc carrots  3 1990
8  ccc carrots  3 1991
9  ccc carrots  3 1992

Is there a way to do this for an existing data frame? For this example I built the result by hand with this code:

b <- data.frame(code = rep(c("aaa" , "bbb", "ccc") , each = 3) ,
                item = rep(c("apples" , "bananas" , "carrots") , each = 3) ,
                id = rep(c(1,2,3) , each = 3) ,
                year = rep(c(1990:1992) , times = 3))

but retyping every column like this is impractical (or at least extremely inefficient) when the data frame already exists or is very large. Is there a better way of doing this?

Adrian asked Aug 31 '25 22:08


1 Answer

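Base R: calling merge() with by = NULL returns the Cartesian product (cross join) of the two data frames, so every row of a is paired with every year: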

b <- data.frame(year = 1990:1992)
merge(a, b, by = NULL)
#   code    item id year
# 1  aaa  apples  1 1990
# 2  bbb bananas  2 1990
# 3  ccc carrots  3 1990
# 4  aaa  apples  1 1991
# 5  bbb bananas  2 1991
# 6  ccc carrots  3 1991
# 7  aaa  apples  1 1992
# 8  bbb bananas  2 1992
# 9  ccc carrots  3 1992
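
Note that the cross join above groups rows by year, whereas the question's desired output groups them by code. Below is a minimal sketch of two ways to get that ordering, assuming the same toy data frame a; the names years, m, idx, and r are illustrative and not from the original posts. The first sorts the merge result; the second skips merge and repeats each row of a by index.

```r
# Toy data from the question
a <- data.frame(code = c("aaa", "bbb", "ccc"),
                item = c("apples", "bananas", "carrots"),
                id = c(1, 2, 3))
years <- 1990:1992  # hypothetical name for the years to add

# Option 1: cross join, then sort so each id's years sit together
m <- merge(a, data.frame(year = years), by = NULL)
m <- m[order(m$id, m$year), ]
rownames(m) <- NULL  # reset row names after reordering

# Option 2: repeat each row of `a` once per year, then attach the years
idx <- rep(seq_len(nrow(a)), each = length(years))
r <- a[idx, ]
r$year <- rep(years, times = nrow(a))
rownames(r) <- NULL  # drop the "1.1", "1.2", ... names created by indexing
```

Both m and r should hold the same nine rows in the order shown in the question. Option 2 avoids the sort, which may matter for a very large data frame, though that is a guess rather than something measured here.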

Data

a <- structure(list(code = c("aaa", "bbb", "ccc"),
                    item = c("apples", "bananas", "carrots"),
                    id = c(1, 2, 3)),
               class = "data.frame", row.names = c(NA, -3L))
r2evans answered Sep 03 '25 14:09