I'm stuck on a problem (there may already be a solution on Stack Overflow, but I can't work out how to fix this).
I have a data frame like this:
ID Visits Time X Y Z
1 2 2016-05-15 06:38:40 1 1 0
1 4 2016-05-15 07:38:40 0 0 1
1 2 2016-05-15 08:38:40 0 1 0
2 3 2016-05-15 09:38:40 1 0 2
3 2 2016-05-15 10:38:40 0 1 0
3 1 2016-05-15 11:38:40 1 0 1
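I want to make a new data frame with one row per ID, where Visits, X, Y and Z are summed within each ID and Time is the first Time for that ID.
So the result should be this: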
ID Visits Time X Y Z
1 8 2016-05-15 06:38:40 1 2 1
2 3 2016-05-15 09:38:40 1 0 2
3 3 2016-05-15 10:38:40 1 1 1
I tried this:
data %>% group_by(ID) %>% summarise_at(vars(-Time), funs(sum(., na.rm = TRUE)))
But this is where I'm stuck: the variable Time is dropped from the result, and I can't add it back afterwards because it no longer has the same length as the summarised data.
We can do this with data.table, taking the first 'Time' per 'ID' and summing the remaining columns:
library(data.table)
setDT(data)[, c(list(Time = Time[1]), lapply(.SD, sum, na.rm = TRUE)),
            by = ID, .SDcols = setdiff(names(data), c("ID", "Time"))]
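Or with dplyr: after grouping by 'ID', add 'Time' to the grouping variables by taking the first 'Time' in each group, and then sum the remaining columns with summarise_all. (In dplyr 1.0.0 and later, the add argument of group_by is spelled .add.)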
data %>%
  group_by(ID) %>%
  group_by(Time = first(Time), add = TRUE) %>%
  summarise_all(sum, na.rm = TRUE)
# A tibble: 3 x 6
# Groups: ID [?]
# ID Time Visits X Y Z
# <int> <chr> <int> <int> <int> <int>
#1 1 2016-05-15 06:38:40 8 1 2 1
#2 2 2016-05-15 09:38:40 3 1 0 2
#3 3 2016-05-15 10:38:40 3 1 1 1
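For newer dplyr versions (1.0.0 and later), where summarise_all is superseded by across(), the same idea can be sketched as below. This is only a minimal sketch, not part of the original answer: it rebuilds the question's sample data by hand, assumes Time is stored as character, and names the summed columns explicitly. The object name result is just an illustrative choice.

```r
library(dplyr)

# Sample data rebuilt from the question (Time kept as character here,
# which is an assumption about how it is stored)
data <- data.frame(
  ID     = c(1L, 1L, 1L, 2L, 3L, 3L),
  Visits = c(2L, 4L, 2L, 3L, 2L, 1L),
  Time   = c("2016-05-15 06:38:40", "2016-05-15 07:38:40",
             "2016-05-15 08:38:40", "2016-05-15 09:38:40",
             "2016-05-15 10:38:40", "2016-05-15 11:38:40"),
  X      = c(1L, 0L, 0L, 1L, 0L, 1L),
  Y      = c(1L, 0L, 1L, 0L, 1L, 0L),
  Z      = c(0L, 1L, 0L, 2L, 0L, 1L),
  stringsAsFactors = FALSE
)

result <- data %>%
  group_by(ID) %>%
  summarise(
    # keep the first Time within each ID
    Time = first(Time),
    # sum the count columns within each ID
    across(c(Visits, X, Y, Z), ~ sum(.x, na.rm = TRUE)),
    .groups = "drop"
  )

print(result)
```

Because Time is summarised in the same call as the sums, it never has to be added back afterwards, which avoids the length mismatch described in the question.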