| id | msgid | source | value |
|----|-------|--------|-------|
| 1 | 1 | B | 0 |
| 1 | 2 | A | 1 |
| 1 | 3 | B | 0 |
| 2 | 1 | B | 0 |
| 2 | 2 | A | 0 |
| 2 | 3 | A | 1 |
| 2 | 4 | B | 0 |
In the table above, I want to create the column value from the other columns. id identifies a conversation and msgid identifies the message within each conversation.
For each conversation, I want to identify the row of the last message that came from source = A.
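To make the example reproducible, the table can be rebuilt as a data frame. This is only a sketch: the name `dat` is taken from the code in this post, and the integer and character column types are an assumption based on the printed output further down.

```r
# Rebuild the example table; `value` is the expected result, kept for comparison.
dat <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
  msgid  = c(1L, 2L, 3L, 1L, 2L, 3L, 4L),
  source = c("B", "A", "B", "B", "A", "A", "B"),
  value  = c(0L, 1L, 0L, 0L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)
```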
I made an attempt, but it only identifies the last row within each conversation, regardless of source:
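# Attempt: takes the last message per conversation, regardless of source
last_values <- dat %>%
  group_by(id) %>%
  slice(which.max(msgid)) %>%
  ungroup() %>%
  mutate(value = cumsum(msgid))  # cumulative msgid gives the row index of each last row
dat$final_val <- 0
dat[last_values$value, 5] <- 1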
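We can create the column (named value1 here so it can be compared with the expected value) by flagging, within each id, the rows where source is "A" and no later row in that id is also "A":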
library(dplyr)

dat %>%
  group_by(id) %>%
  # TRUE only for the last "A" row of each conversation
  mutate(value1 = as.integer(source == "A" & !duplicated(source == "A", fromLast = TRUE)))
# A tibble: 7 x 5
# Groups:   id [2]
#      id msgid source value value1
#   <int> <int> <chr>  <int>  <int>
# 1     1     1 B          0      0
# 2     1     2 A          1      1
# 3     1     3 B          0      0
# 4     2     1 B          0      0
# 5     2     2 A          0      0
# 6     2     3 A          1      1
# 7     2     4 B          0      0
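To see why this works, it may help to run the same expression on one conversation by hand, and then to pull out the row numbers the question asks for. The following is a base R sketch rather than part of the answer above: the helper name `flag_last_a` is hypothetical, and `ave()` stands in for `group_by()` so that no package is needed.

```r
# Source column of conversation id = 2 from the example table.
src  <- c("B", "A", "A", "B")
is_a <- src == "A"
# With fromLast = TRUE the scan starts at the end, so only the last
# occurrence of each distinct value (TRUE and FALSE) is not a duplicate.
last_of_each <- !duplicated(is_a, fromLast = TRUE)
# Keeping only the rows that are also "A" leaves the last "A" row.
as.integer(is_a & last_of_each)

# Same idea per conversation in base R (hypothetical helper name).
flag_last_a <- function(x) as.integer(x == 1L & !duplicated(x, fromLast = TRUE))

dat <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
  msgid  = c(1L, 2L, 3L, 1L, 2L, 3L, 4L),
  source = c("B", "A", "B", "B", "A", "A", "B"),
  stringsAsFactors = FALSE
)
flag <- ave(as.integer(dat$source == "A"), dat$id, FUN = flag_last_a)

# Row numbers of the last "A" message in each conversation.
last_a_rows <- which(flag == 1)
```

For the example data, `last_a_rows` contains 2 and 6, which matches the rows flagged in value1. A conversation with no "A" message simply contributes no row.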