I've this data set
id <- c(0,0,1,1,2,2,3,3,4,4)
gender <- c("m","m","f","f","f","f","m","m","m","m")
x1 <-c(1,1,1,1,2,2,3,3,10,10)
x2 <- c(3,7,5,6,9,15,10,15,12,20)
alldata <- data.frame(id,gender,x1,x2)
which looks like:
id gender x1 x2
0 m 1 3
0 m 1 7
1 f 1 5
1 f 1 6
2 f 2 9
2 f 2 15
3 m 3 10
3 m 3 15
4 m 10 12
4 m 10 20
Notice that for each unique id x1 are similar, but x2 are different. I need to sort data by id and x2 (from smallest to largest) and then for each unique id I need to set x1(for the second record) = x2 (for the first record).
The data would look like:
id gender x1 x2
0 m 1 3
0 m 3 7
1 f 1 5
1 f 5 6
2 f 2 9
2 f 9 15
3 m 3 10
3 m 10 15
4 m 10 12
4 m 12 20
I found this easier using data.table
> library(data.table)
> dt = data.table(alldata)
> setkey(dt, id, x2) #sort the data
This next line says: within each ID for x1, take the first value of x1, then every remaining value take from x2 as needed.
> dt[,x1 := c(x1[1], x2)[1:.N],keyby=id]
> dt
id gender x1 x2
1: 0 m 1 3
2: 0 m 3 7
3: 1 f 1 5
4: 1 f 5 6
5: 2 f 2 9
6: 2 f 9 15
7: 3 m 3 10
8: 3 m 10 15
9: 4 m 10 12
10: 4 m 12 20
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With