
Using mutate, case_when, across, and string detect to evaluate patterns in multiple columns to create a new column

I would like to create a new column based on the evaluation of multiple columns. I am searching for ICD10 codes and would like to create a flag when the appropriate pattern is present in one or more columns. Specifically, I would like to search for any ICD codes T36-T50 where the 5th character is 2 (e.g. T3602).

Here is a reprex.

library(dplyr)    # mutate(), case_when(), across(), if_any(), %>%
library(stringr)  # str_detect(), regex()

df <- data.frame(dx1 = c('K5039','T215','C219861','T36002'),
                 dx2 = c('T38022','X72001','X55124','T36022'),
                 dx3 = c('X80011','X790122','X55124','T36022'),
                 pcode = c('R41','R44','R98','R99'),
                 ecode = c('X79','X81012','X44015','X83012'),
                 stringsAsFactors = FALSE)

df_new <- df %>% 
  mutate(injury = case_when(
    across(c(dx1:dx3, ecode),
           ~str_detect(.,regex('(T36|T37|T38|T39|T40|T41|T42|T43|T44|
                                  T45|T46|T47|T48|T49|T50)[:graph:]{1}2.*'))        ~ 1,
    TRUE                                                   ~ 0)))
    

I am getting this error:

Error in mutate():
ℹ In argument: injury = case_when(...).
Caused by error in across():
! Can't convert .fns, a two-sided formula, to a function.

I can get it to partially work using if_any, as shown below:

df_new <- df %>% 
  mutate(injury = case_when(
    if_any(c(dx1:dx3, ecode),
           ~str_detect(.,regex('X72|X73|X74|X75')))        ~ 1,
    TRUE                                                   ~ 0))

but when I add the code to select the 5th character I get a similar error:

Error in mutate():
ℹ In argument: injury = case_when(...).
Caused by error in if_any():
! Can't convert .fns, a two-sided formula, to a function.
Run rlang::last_trace() to see where the error occurred.

This is what I would like my output to look like.

table of correctly evaluated ICD10 codes

Jenny Mercado asked Nov 19 '25 15:11


1 Answer

I think it's just a syntax error:

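df_new <- df %>% 
  mutate(injury = case_when(
    # Keep the pattern on one line: a line break inside the quotes puts the
    # newline and indentation into the T45 alternative, so T45 would not match.
    if_any(c(dx1:dx3, ecode),
           ~str_detect(., regex('(T36|T37|T38|T39|T40|T41|T42|T43|T44|T45|T46|T47|T48|T49|T50)[:graph:]{1}2.*'))) ~ 1,
    TRUE ~ 0))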

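works for me. Note the 3 closing brackets after "...{1}2.*'" instead of 2. With only 2, the "~ 1" ends up inside the if_any (or across) call, so the function you pass becomes "~str_detect(...) ~ 1", which is a two-sided formula. That is exactly what the error message complains about.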
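To see why the bracket count matters, here is a small base R sketch. It only builds the two formulas and never evaluates str_detect; the names one_sided and two_sided are made up for illustration.

```r
# Base R only: the str_detect() call is never run, it just sits inside the formula.
one_sided <- ~ str_detect(x, "T36")        # the shape if_any()/across() can turn into a function
two_sided <- ~ str_detect(x, "T36") ~ 1    # the shape the misplaced bracket produces

length(one_sided)  # 2: the tilde and a right-hand side
length(two_sided)  # 3: the tilde, a left-hand side, and a right-hand side
```

A formula with a left-hand side belongs to case_when, not to if_any, so the "~ 1" has to come after the bracket that closes if_any.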

I'll add that you don't want across for this: across "typically returns a tibble with one column for each column in .cols", so you would get 4 columns when you really want 1. Inside mutate, across alters the existing columns rather than adding a new one.
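A rough sketch of the difference, using the data from the question. The shorter pattern is my own assumption, not something from the question: it anchors the code at the start of the string and uses "." for the 4th character, which is looser than [:graph:].

```r
library(dplyr)
library(stringr)

df <- data.frame(dx1 = c('K5039','T215','C219861','T36002'),
                 dx2 = c('T38022','X72001','X55124','T36022'),
                 dx3 = c('X80011','X790122','X55124','T36022'),
                 pcode = c('R41','R44','R98','R99'),
                 ecode = c('X79','X81012','X44015','X83012'),
                 stringsAsFactors = FALSE)

# Assumed compact form of T36-T50: T, then 36-39, 40-49 or 50,
# then any 4th character, then a 2 as the 5th character.
pattern <- "^T(3[6-9]|4[0-9]|50).2"

# across(): one logical column per selected column, overwriting the originals.
flags <- df %>%
  mutate(across(c(dx1:dx3, ecode), ~ str_detect(.x, pattern)))

# if_any(): a single logical vector, TRUE when any selected column matches.
df_new <- df %>%
  mutate(injury = as.integer(if_any(c(dx1:dx3, ecode), ~ str_detect(.x, pattern))))

df_new$injury  # 1 0 0 1
```

Since if_any already returns TRUE/FALSE per row, as.integer is enough to get a 1/0 flag here; case_when does the same job and is handy if you later need more than two outcomes.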

Sarah answered Nov 21 '25 12:11


