How to mutate a new variable in R dplyr that is based on criteria from 2 columns?

Tags: r

I have a dataset which looks like this:

Recipient  ID
(chr)       (chr)  
Smith       C
Wells       S
Wells       S
Jones       S
Jones       N
Wu          C
Wu          N
Wu          S

I want to mutate a new variable that is either "Unique" or "Multiple", depending on whether Recipient appears once (Unique), Recipient appears more than once but has the same ID for each occurrence (Unique), or Recipient appears more than once AND has more than one distinct ID (Multiple). I've tried to use:

df %>%
 group_by(Recipient, ID) %>%
 mutate(Freq = case_when(
                str_count(Recipient) == 1 & str_count(ID) == 1 ~ "Unique",
                str_count(Recipient) > 2 & str_count(ID) == 1 ~ "Unique",
                str_count(Recipient) > 2 & str_count(ID) > 1 ~ "Multiple"))

When I did this, all the values were multiple:

Recipient  ID     Freq
(chr)      (chr)  (chr)
Smith       C     Multiple (should be Unique)
Wells       S     Multiple (should be Unique)
Wells       S     Multiple (should be Unique)
Jones       S     Multiple
Jones       N     Multiple
Wu          C     Multiple
Wu          N     Multiple
Wu          S     Multiple

I've tried multiple times, but can't crack it. Can anyone help to solve this, or recommend an easier way to code this? Thanks!

lucy_Eh asked Jan 27 '26 09:01


1 Answer

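A possible solution with n_distinct(): group by Recipient only (grouping by ID as well leaves exactly one distinct ID in every group), then count the distinct IDs within each group: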

library(dplyr)

df %>%
  group_by(Recipient) %>%
  mutate(Freq = ifelse(n_distinct(ID) == 1, "unique", "multiple")) %>%
  ungroup()

# A tibble: 8 x 3
  Recipient ID    Freq
  <chr>     <chr> <chr>
1 Smith     C     unique
2 Wells     S     unique
3 Wells     S     unique
4 Jones     S     multiple
5 Jones     N     multiple
6 Wu        C     multiple
7 Wu        N     multiple
8 Wu        S     multiple
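
The question asked for the capitalised labels "Unique" and "Multiple", while the output above uses lowercase ones. It may also help to know that str_count() counts pattern matches inside each string rather than how many rows share a value, which is likely why the original attempt did not behave as expected. Below is a minimal sketch of a variation, assuming the same sample data. The helper columns n_rows and n_ids are names introduced here for illustration only, so the per-group counts can be inspected next to the label.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  Recipient = c("Smith", "Wells", "Wells", "Jones", "Jones", "Wu", "Wu", "Wu"),
  ID        = c("C", "S", "S", "S", "N", "C", "N", "S")
)

result <- df %>%
  group_by(Recipient) %>%          # group by Recipient only, not by ID
  mutate(
    n_rows = n(),                  # illustrative helper: rows per Recipient
    n_ids  = n_distinct(ID),       # illustrative helper: distinct IDs per Recipient
    Freq   = if_else(n_ids == 1, "Unique", "Multiple")
  ) %>%
  ungroup()

print(result)
```

Only n_ids is needed for the label. A Recipient that appears once and one that repeats with the same ID both have a single distinct ID, so the first two cases in the question collapse into one condition. The helper columns can be dropped afterwards with select(-n_rows, -n_ids).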

Data
df <- structure(list(Recipient = c("Smith", "Wells", "Wells", "Jones", 
"Jones", "Wu", "Wu", "Wu"), ID = c("C", "S", "S", "S", "N", "C",
"N", "S")), class = "data.frame", row.names = c(NA, -8L))
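
If dplyr is not an option, the same per-Recipient logic can probably be expressed in base R as well. The following is only a sketch under that assumption, using ave() to compute the number of distinct IDs for each Recipient. The name n_ids is a hypothetical helper, not something from the original answer.

```r
# Sample data, same structure as above
df <- data.frame(
  Recipient = c("Smith", "Wells", "Wells", "Jones", "Jones", "Wu", "Wu", "Wu"),
  ID        = c("C", "S", "S", "S", "N", "C", "N", "S")
)

# ave() applies FUN within each Recipient group and returns a vector aligned
# with the original rows. Row indices are passed instead of ID itself so the
# result stays integer rather than being coerced to character.
n_ids <- ave(
  seq_along(df$ID),
  df$Recipient,
  FUN = function(i) length(unique(df$ID[i]))
)

df$Freq <- ifelse(n_ids == 1, "Unique", "Multiple")

print(df)
```

This keeps the original row order and adds no packages, at the cost of being a little less readable than the group_by() and mutate() pipeline.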
Darren Tsai answered Jan 28 '26 23:01


