I am trying to group_by a variable and then operate on each row within each group. I got lost choosing between ifelse and case_when; there is something basic about the difference between the two that I am failing to understand. I assumed both would give the same output, but that is not the case here: case_when gave the expected output and ifelse did not. I am trying to understand why.
Here is the example data frame:
structure(list(Pos = c(73L, 146L, 146L, 150L, 150L, 151L, 151L,
152L, 182L, 182L), Percentage = c(81.2, 13.5, 86.4, 66.1, 33.9,
48.1, 51.9, 86.1, 48, 52)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame")) -> foo
I am grouping by Pos, and within each group I want to round the Percentage values if they sum to 100. The following uses ifelse:
library(tidyverse)
foo %>%
  group_by(Pos) %>%
  mutate(sumn = n()) %>%
  mutate(Val = ifelse(sumn == 1, 100,
                      ifelse(sum(Percentage) == 100, unlist(map(Percentage, round)), 0)
                      # case_when(sum(Percentage) == 100 ~ unlist(map(Percentage, round)),
                      #           TRUE ~ 0
                      # )
  ))
The output is:
# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    66
 6   151       48.1     2    48
 7   151       51.9     2    48
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    48
This is not what I want. I want the following, which I get using case_when:
foo %>%
  group_by(Pos) %>%
  mutate(sumn = n()) %>%
  mutate(Val = ifelse(sumn == 1, 100,
                      # ifelse(sum(Percentage) == 100, unlist(map(Percentage, round)), 0)
                      case_when(sum(Percentage) == 100 ~ unlist(map(Percentage, round)),
                                TRUE ~ 0
                      )
  ))
# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    34
 6   151       48.1     2    48
 7   151       51.9     2    52
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    52
What is ifelse doing differently?
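According to ?ifelse, the return value is:
A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no.
Here the test, sum(Percentage) == 100, has length 1 within each group, so the inner ifelse returns only the first rounded value, which the outer ifelse then recycles across the rows of the group.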
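The length rule from ?ifelse can be seen in isolation with base R. This is a minimal sketch; the vector x is an illustrative stand-in for one group's Percentage values (the Pos 182 rows), not part of the original code.

```r
# Stand-in for one group's Percentage values (illustrative).
x <- c(48, 52)

# The test has length 1, so the result has length 1:
# only the first element of round(x) is returned.
short <- ifelse(sum(x) == 100, round(x), 0)
short
# 48

# Repeating the test to the length of x gives one result per element.
full <- ifelse(rep(sum(x) == 100, length(x)), round(x), 0)
full
# 48 52
```

Inside mutate, the length-1 result is then recycled over the group, which is why both rows of a group end up with the same value.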
If we replicate the test so that its length matches the group size, it works:
foo %>%
  group_by(Pos) %>%
  mutate(sumn = n()) %>%
  mutate(Val = ifelse(sumn == 1, 100,
                      ifelse(rep(sum(Percentage) == 100, n()),
                             unlist(map(Percentage, round)), 0)
  ))
# A tibble: 10 x 4
# Groups:   Pos [6]
     Pos Percentage  sumn   Val
   <int>      <dbl> <int> <dbl>
 1    73       81.2     1   100
 2   146       13.5     2     0
 3   146       86.4     2     0
 4   150       66.1     2    66
 5   150       33.9     2    34
 6   151       48.1     2    48
 7   151       51.9     2    52
 8   152       86.1     1   100
 9   182       48       2    48
10   182       52       2    52
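Because both conditions are a single TRUE or FALSE per group, a plain if / else inside mutate is one possible alternative that avoids the length issue altogether. The sketch below is an assumption about what might suit this case, not the only way to do it; demo is a hypothetical subset of foo, used so the example runs on its own.

```r
library(dplyr)

# Hypothetical subset of `foo` from the question.
demo <- tibble(
  Pos = c(146L, 146L, 152L, 182L, 182L),
  Percentage = c(13.5, 86.4, 86.1, 48, 52)
)

res <- demo %>%
  group_by(Pos) %>%
  # Each branch is evaluated once per group; a length-1 result
  # (100 or 0) is recycled to the group size by mutate.
  mutate(Val = if (n() == 1) {
    100
  } else if (sum(Percentage) == 100) {
    round(Percentage)
  } else {
    0
  }) %>%
  ungroup()

res$Val
# 0 0 100 48 52
```

round is already vectorised, so round(Percentage) can stand in for unlist(map(Percentage, round)). If exact comparison of a floating-point sum with == proves fragile on other data, dplyr::near() may be worth considering.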