This is my df:
group value
1 10
1 20
1 25
2 5
2 10
2 15
I want to compute, within each group, the difference between each value and a reference value, which is the value in the first row of that group. More precisely:
group value diff
1 10 NA # because this is the reference for group 1
1 20 10 # value[2] - value[1]
1 25 15 # value[3] - value[1]
2 5 NA # because this is the reference for group 2
2 10 5 # value[5] - value[4]
2 15 10 # value[6] - value[4]
I found good answers for computing differences from the previous row (e.g., `lag()` in dplyr, `shift()` in data.table). However, I need a fixed reference point within each group, and I couldn't make that work.
I think you can also use this:
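library(dplyr)

df %>%
  group_by(group) %>%
  # subtract the group's first value from every value in that group
  mutate(diff = value - value[1],
         # the reference row itself gets NA instead of 0
         diff = replace(diff, row_number() == 1, NA))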
# A tibble: 6 x 3
# Groups: group [2]
group value diff
<int> <int> <int>
1 1 10 NA
2 1 20 10
3 1 25 15
4 2 5 NA
5 2 10 5
6 2 15 10
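If you would rather avoid a package dependency, the same idea can be sketched in base R with `ave()`, which applies a function within each group and returns a vector aligned with the original rows. This is only a sketch: the `data.frame()` call below is my reconstruction of the example data from the question and assumes integer columns.

```r
# Rebuild the example data from the question (assumed to be integer columns)
df <- data.frame(group = c(1L, 1L, 1L, 2L, 2L, 2L),
                 value = c(10L, 20L, 25L, 5L, 10L, 15L))

# For each group: NA for the reference (first) row,
# then every later value minus the group's first value
df$diff <- ave(df$value, df$group,
               FUN = function(v) c(NA, v[-1] - v[1]))

print(df)
```

Like the dplyr version, this relies on the rows already being in the intended order within each group, since "first" simply means the first row encountered for that group.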