I would like to count, for each row, how many times its value has occurred so far (including the current row), using library(dplyr).
Example data:
dates <- as.Date(as.character(c("2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17",
"2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17",
"2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17",
"2011-01-17",
"2011-01-17",
"2011-01-18",
"2011-01-18")))
ID <- c("1","2","3","3","1","5","6","5","7","8","1","2","11","2","12","5","5","1","4")
# put together
data <- data.frame(dates,ID)
data
dates ID
1 2011-01-13 1
2 2011-01-14 2
3 2011-01-15 3
4 2011-01-16 3
5 2011-01-17 1
6 2011-01-13 5
7 2011-01-14 6
8 2011-01-15 5
9 2011-01-16 7
10 2011-01-17 8
11 2011-01-13 1
12 2011-01-14 2
13 2011-01-15 11
14 2011-01-16 2
15 2011-01-17 12
16 2011-01-17 5
17 2011-01-17 5
18 2011-01-18 1
19 2011-01-18 4
I would like to construct a dataset which looks like:
dates ID prev_occurrence
1 2011-01-13 1 1
2 2011-01-14 2 1
3 2011-01-15 3 1
4 2011-01-16 3 2
5 2011-01-17 1 2
6 2011-01-13 5 1
7 2011-01-14 6 1
8 2011-01-15 5 2
9 2011-01-16 7 1
10 2011-01-17 8 1
11 2011-01-13 1 3
12 2011-01-14 2 2
13 2011-01-15 11 1
14 2011-01-16 2 3
15 2011-01-17 12 1
16 2011-01-17 5 3
17 2011-01-17 5 4
18 2011-01-18 1 4
19 2011-01-18 4 1
where the count for an ID goes up by 1 each time that ID has already occurred in an earlier row.
So far I have tried to solve this using duplicated(). However, the output doesn't look very promising:
library(dplyr)
data_dups <- data %>%
  group_by(dates) %>%
  mutate(dups = duplicated(ID)) %>%
  filter(dups) %>%  # dups is already logical, so no comparison with 'TRUE' is needed
  summarise(occurence = n())
dates occurence
<date> <int>
1 2011-01-13 1
2 2011-01-14 1
3 2011-01-17 1
In dplyr you can group by ID and use row_number() to get a running count of occurrences within each group.
library(tidyverse)
data %>%
  arrange(dates) %>%
  group_by(ID) %>%
  mutate(occurrence = row_number())
# A tibble: 19 x 3
# Groups: ID [10]
# dates ID occurrence
# <date> <fctr> <int>
# 1 2011-01-13 1 1
# 2 2011-01-14 2 1
# 3 2011-01-15 3 1
# 4 2011-01-16 3 2
# 5 2011-01-17 1 2
# 6 2011-01-13 5 1
# 7 2011-01-14 6 1
# 8 2011-01-15 5 2
# 9 2011-01-16 7 1
# 10 2011-01-17 8 1
# 11 2011-01-13 1 3
# 12 2011-01-14 2 2
# 13 2011-01-15 11 1
# 14 2011-01-16 2 3
# 15 2011-01-17 12 1
# 16 2011-01-17 5 3
# 17 2011-01-17 5 4
# 18 2011-01-18 1 4
# 19 2011-01-18 4 1
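Note that row_number() counts in whatever order the rows are currently in. arrange(dates) sorts the rows chronologically first, so the counts follow date order and the result comes back sorted by date. The output above keeps the input row order, which is what you get if arrange(dates) is omitted.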
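To make the effect of row order concrete, here is a minimal sketch that rebuilds the example data and computes the running count in two ways: in the original row order, and in date order while keeping the original row layout. The names by_row, by_date, base_count and the helper column row_id are only illustrative and do not come from the question; the base R line with ave() is an alternative shown for comparison, not part of the dplyr answer.

```r
library(dplyr)

# Example data from the question
dates <- as.Date(c("2011-01-13", "2011-01-14", "2011-01-15", "2011-01-16", "2011-01-17",
                   "2011-01-13", "2011-01-14", "2011-01-15", "2011-01-16", "2011-01-17",
                   "2011-01-13", "2011-01-14", "2011-01-15", "2011-01-16", "2011-01-17",
                   "2011-01-17", "2011-01-17", "2011-01-18", "2011-01-18"))
ID <- c("1", "2", "3", "3", "1", "5", "6", "5", "7", "8",
        "1", "2", "11", "2", "12", "5", "5", "1", "4")
data <- data.frame(dates, ID)

# Running count per ID in the original row order
# (this reproduces the prev_occurrence column the question asks for)
by_row <- data %>%
  group_by(ID) %>%
  mutate(occurrence = row_number()) %>%
  ungroup()

# Running count per ID in date order, then restored to the original row order.
# row_id is a hypothetical helper column used only to undo the sort.
by_date <- data %>%
  mutate(row_id = row_number()) %>%
  arrange(dates, row_id) %>%
  group_by(ID) %>%
  mutate(occurrence = row_number()) %>%
  ungroup() %>%
  arrange(row_id) %>%
  select(-row_id)

# Base R equivalent of by_row: number the rows within each ID group
base_count <- ave(seq_along(data$ID), data$ID, FUN = seq_along)
```

The two dplyr results differ only for ID 1 here, because row 11 (2011-01-13) is dated earlier than row 5 (2011-01-17) but appears later in the data. Which variant is right depends on whether "previous" means earlier in the table or earlier in time.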