
Dplyr - Select if a column exists and summarize if it does exist

Tags: r, dplyr

I can check whether a specific column exists using contains() in dplyr, but I struggle with the summarise step when the column does not exist.

Here is my code snippet:

  df <- Prod %>%
      group_by(Entity) %>%
      select(Entity, `Cum.Oil`, `Cum.Gas`, contains("EUR")) %>%
      summarise(Oil = mean(`Cum.Oil`), Gas = mean(`Cum.Gas`), EUR = mean(EUR))

How can I skip the EUR expression in summarise() if the EUR column does not exist?

asked Oct 22 '25 06:10 by CodeMaster


2 Answers

Something like this should work:

library(dplyr)

df <- Prod %>%
      group_by(Entity) %>%
      # any_of() silently skips names that are missing from the data
      summarise(across(any_of(c('Cum.Oil', 'Cum.Gas', 'EUR')), ~mean(.x),
                .names = '{sub("Cum.", "", .col, fixed = TRUE)}'))

Can't test without some reprex, though.

Tip: You can also use any_of in select statements:

df <- Prod %>%
      group_by(Entity) %>%
      select(any_of(c('Entity', 'Cum.Oil', 'Cum.Gas', 'EUR')))
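
Since the answer could not be tested without a reprex, here is a minimal sketch of the across() plus any_of() approach run against made-up data. The data frames prod_with_eur and prod_without_eur and the helper summarise_present() are hypothetical names introduced only for illustration; they are not from the question. Renaming is done in a separate rename() step here instead of via .names, which is a swapped-in variation on the answer's code.

```r
library(dplyr)

# Hypothetical sample data that mirrors the question's column names
prod_with_eur <- data.frame(
  Entity  = c("a", "b", "a"),
  Cum.Oil = c(1, 2, 3),
  Cum.Gas = c(2, 4, 6),
  EUR     = c(7, 9, 9)
)
prod_without_eur <- select(prod_with_eur, -EUR)

# Hypothetical helper: average whichever of the wanted columns are present
summarise_present <- function(data) {
  data %>%
    group_by(Entity) %>%
    summarise(
      across(any_of(c("Cum.Oil", "Cum.Gas", "EUR")), ~ mean(.x)),
      .groups = "drop"
    ) %>%
    # any_of() with a named vector renames only the columns that exist
    rename(any_of(c(Oil = "Cum.Oil", Gas = "Cum.Gas")))
}

res_with    <- summarise_present(prod_with_eur)
res_without <- summarise_present(prod_without_eur)

names(res_with)     # includes EUR
names(res_without)  # no EUR column, and no error
```

The same call works on both data frames because any_of() ignores names that are absent, so no explicit existence check is needed.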
answered Oct 23 '25 19:10 by Juan C


If it is more convenient, you can also use an if/else construction inside a dplyr pipe, though I consider @Juan C's answer the more elegant, more R-like solution:

library(dplyr)

Prod <- data.frame(Entity = c("a", "b", "a"), `Cum.Oil` = 1:3, `Cum.Gas` = c(2, 4, 6),
                   EUR = c(7, 9, 9))

Prod %>%
  group_by(Entity) %>% {
    # The condition is a single TRUE/FALSE, so plain if/else is enough
    if ("EUR" %in% names(.)) {
      summarise(., Oil = mean(`Cum.Oil`),
                Gas = mean(`Cum.Gas`),
                EUR = mean(EUR))
    } else {
      summarise(., Oil = mean(`Cum.Oil`),
                Gas = mean(`Cum.Gas`))
    }
  }
answered Oct 23 '25 19:10 by asd-tm


