I have data where the 'Law' variable indicates changes in legislation, in different places ('Place'):
Person Place Year Law
1 A 1990 0
2 A 1991 1
3 A 1992 1
4 B 1990 0
5 B 1991 0
6 B 1992 1
7 B 1993 1
8 B 1993 1
9 B 1993 1
10 B 1992 1
Basically the law was implemented in place A in 1991 and remained in force for all subsequent time periods. It was implemented in place B in 1992 and remained in force, & so on.
I would like to create a new variable that takes on a value of 0 for the year the law was implemented, 1 for 1 year after, 2 for 2 years after, -1 for the year before, -2 for 2 years before, and so on.
I need the final dataframe to look like:
Person Place Year Law timeline
1 A 1990 0 -1
2 A 1991 1 0
3 A 1992 1 1
4 B 1990 0 -2
5 B 1991 0 -1
6 B 1992 1 0
7 B 1993 1 1
8 B 1993 1 2
9 B 1993 1 2
10 B 1992 1 1
I have tried:
library(dplyr)
df %>%
group_by(Place) %>%
arrange(Year) %>%
mutate(timeline = rank(Law))
but it's not working like I need. What am I doing wrong? Can I do this in dplyr or do I need to create a complex for loop?
You can subtract the row_numer by the index where the Law is implemented:
df %>%
arrange(Year) %>%
group_by(Place) %>%
mutate(timeline = row_number() - which(diff(Law) == 1) - 1) %>%
arrange(Place)
# A tibble: 7 x 5
# Groups: Place [2]
# Person Place Year Law timeline
# <int> <fct> <int> <int> <dbl>
#1 1 A 1990 0 -1.
#2 2 A 1991 1 0.
#3 3 A 1992 1 1.
#4 4 B 1990 0 -2.
#5 5 B 1991 0 -1.
#6 6 B 1992 1 0.
#7 7 B 1993 1 1.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With