I have the following data frames a, b, and c:
Year<-rep(c("2002","2003"),1)
Crop<-c("TTT","RRR")
a<-data.frame(Year,Crop)
Year<-rep(c("2002","2003"),2)
ProductB<-c("A","A","B","B")
b<-data.frame(Year,ProductB)
Year<-rep(c("2002","2003"),3)
Location<-c("XX","XX","YY","YY","ZZ","ZZ")
c<-data.frame(Year,Location)
and I want to combine them into one. When I use the merge function, I get the Cartesian product of the rows within each Year, which is not what I want.
d<-merge(a,b,by="Year")
e<-merge(d,c,by="Year")
I would like the resulting data frame to look like this:
Year Crop ProductB Location
2002  TTT        A       XX
2002   NA        B       YY
2002   NA       NA       ZZ
2003  RRR        A       XX
2003   NA        B       YY
2003   NA       NA       ZZ
Is this possible? Thanks for your help
Here's one way using data.table.
require(data.table) ## 1.9.2
# (1)
setDT(a)[, GRP := 1:.N, by=Year]
setDT(b)[, GRP := 1:.N, by=Year]
setDT(c)[, GRP := 1:.N, by=Year]
# (2)
merge(a, merge(b, c, by=c("Year", "GRP"),
all=TRUE), by=c("Year", "GRP"), all=TRUE)
#    Year GRP Crop ProductB Location
# 1: 2002   1  TTT        A       XX
# 2: 2002   2   NA        B       YY
# 3: 2002   3   NA       NA       ZZ
# 4: 2003   1  RRR        A       XX
# 5: 2003   2   NA        B       YY
# 6: 2003   3   NA       NA       ZZ
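- (1) - setDT converts the data.frame to a data.table by reference, and then we create a new column GRP that numbers the rows within each Year. With this, each combination of Year, GRP is unique within a table.
- (2) - We merge on the two columns Year, GRP. Because all=TRUE, rows without a match in the other table are kept and their missing columns are filled with NA.
.N is a built-in variable that holds the number of rows in the current group.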
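For comparison, the same idea can be sketched in base R without data.table. This is only an illustration under a few assumptions: add_grp is a hypothetical helper name (not part of any package), the per-Year counter is built with ave() instead of .N, and the three tables are joined with Reduce() over merge().

```r
# Rebuild the example data (same values as in the question).
a <- data.frame(Year = c("2002", "2003"), Crop = c("TTT", "RRR"))
b <- data.frame(Year = rep(c("2002", "2003"), 2),
                ProductB = c("A", "A", "B", "B"))
c <- data.frame(Year = rep(c("2002", "2003"), 3),
                Location = c("XX", "XX", "YY", "YY", "ZZ", "ZZ"))

# add_grp is a hypothetical helper: it numbers the rows within each Year,
# playing the role of GRP := 1:.N in the data.table version.
add_grp <- function(df) {
  df$GRP <- ave(seq_along(df$Year), df$Year, FUN = seq_along)
  df
}

# Full outer join of all three tables on the (Year, GRP) key.
res <- Reduce(
  function(x, y) merge(x, y, by = c("Year", "GRP"), all = TRUE),
  lapply(list(a, b, c), add_grp)
)

# Order by the key and reset row names so the layout is predictable.
res <- res[order(res$Year, res$GRP), ]
rownames(res) <- NULL
print(res)
```

Dropping the helper column afterwards with res$GRP <- NULL should leave the four columns asked for in the question.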