
Find row-wise minimum positive non-zero number in data.frame using dplyr

Tags:

r

dplyr

tidyverse

Given a numeric data frame

library(dplyr)

A <- c(1.1, 3.0, 2.0, 4.0, 0.0, 1.3)
B <- c(0.2, 1.0, 2.4, 1.1, 1.3, 0.0)
C <- c(5.2, 1.3, 3.7, 1.7, 1.3, 1.0)

data <- data.frame(A, B, C) %>% as_tibble()

how can I create another column containing the row-wise minimum positive non-zero number (using dplyr if possible) to obtain the following data frame?

## A tibble: 6 x 4
#      A     B     C posmin
#  <dbl> <dbl> <dbl>  <dbl>
#1   1.1   0.2   5.2    0.2
#2   3     1     1.3    1  
#3   2     2.4   3.7    2  
#4   4     1.1   1.7    1.1
#5   0     1.3   1.3    1.3  
#6   1.3   0     1      1  

What's concise and almost does the job is

data %>% mutate(posmin = pmin(A, B, C))

which has two issues, however:

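  • My real data frame has more columns (A to Z) and I can't call pmin(A:Z)
  • pmin computes the plain row-wise minimum, so it returns 0 whenever a row contains a zero, whereas I need the smallest value strictly greater than 0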

Is there something like pminpos, and if there isn't, how can I create it so that it can be called just like pmin in the code above? And how do I specify many contiguous columns without passing a comma-separated list of their names?

Thank you very much.

edit: I failed to stress that I'm looking for non-zero positive numbers, i.e. numbers strictly greater than 0. Hence the sought-after values for rows #5 and #6 are not zero.

Slavistan asked Dec 04 '25 13:12



1 Answer

One option would be to replace the zeros with NA, convert the column names to symbols, and splice them into the pmin call with !!!

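library(dplyr)
data %>% 
   # turn zeros into NA so that pmin(..., na.rm = TRUE) skips them
   mutate_all(~ replace(., . == 0, NA)) %>% 
   # splice every column into pmin() as a symbol
   transmute(posmin = pmin(!!! rlang::syms(names(.)), na.rm = TRUE)) %>%
   bind_cols(data, .)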

# A tibble: 6 x 4
#      A     B     C posmin
#  <dbl> <dbl> <dbl>  <dbl>
#1   1.1   0.2   5.2    0.2
#2   3     1     1.3    1  
#3   2     2.4   3.7    2  
#4   4     1.1   1.7    1.1
#5   0     1.3   1.3    1.3  
#6   1.3   0     1      1 
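
For the second part of the question (selecting a contiguous range such as A:Z without listing every name), one possible sketch uses across(), which accepts tidyselect ranges. This assumes dplyr 1.0 or later; the sample data is rebuilt here only so the snippet runs on its own, and `result` is just an illustrative name.

```r
library(dplyr)

# Same sample data as in the question
data <- tibble(
  A = c(1.1, 3.0, 2.0, 4.0, 0.0, 1.3),
  B = c(0.2, 1.0, 2.4, 1.1, 1.3, 0.0),
  C = c(5.2, 1.3, 3.7, 1.7, 1.3, 1.0)
)

result <- data %>%
  mutate(
    # across(A:C, ...) returns the selected columns with zeros turned into NA;
    # c(..., na.rm = TRUE) builds the argument list that do.call passes to pmin
    posmin = do.call(pmin, c(across(A:C, ~ na_if(.x, 0)), na.rm = TRUE))
  )

result
```

With the real data, A:C would become A:Z. Like the code above, this only filters exact zeros; negative values would need a different condition.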

Or use map/reduce

library(purrr)
map(data, na_if, 0) %>%              # replace zeros with NA in each column
    reduce(pmin, na.rm = TRUE) %>%   # fold the columns together with pmin
    bind_cols(data, posmin = .)

Or without using any external packages, we can call pmin within do.call in a single line

data$posmin <- do.call(pmin, c(NA^ (data == 0) * data, na.rm = TRUE))
data$posmin
#[1] 0.2 1.0 2.0 1.1 1.3 1.0
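
The NA^(data == 0) factor may look cryptic. It relies on R evaluating NA^0 as 1 while NA^1 stays NA, so multiplying by it blanks out exactly the zero entries. A small illustration on a plain vector (the names x, mask and masked are made up for this example):

```r
x <- c(1.1, 0, 2.4)

# x == 0 is FALSE, TRUE, FALSE, which is coerced to the exponents 0, 1, 0
mask <- NA^(x == 0)   # NA^0 = 1, NA^1 = NA

# multiplying keeps non-zero values and turns zeros into NA
masked <- mask * x    # 1.1, NA, 2.4

print(masked)
```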

Or, based on @Moody_Mudskipper's comments, instead of setting the values that are 0 to NA, replace all non-positive values with a large value (Inf) and then use pmin

data$posmin <- do.call(pmin, '[<-'(data, data <=0, value=Inf))
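
As far as I know there is no built-in pminpos, but the ideas above can be wrapped into a small helper that is called just like pmin. The following is only a sketch in base R: pminpos is a hypothetical name taken from the question, and returning NA for a row without any positive value is an assumption about the desired behaviour.

```r
# Hypothetical helper, not part of base R or dplyr:
# parallel minimum over strictly positive values only.
pminpos <- function(...) {
  # mask non-positive entries as NA in every input vector
  cols <- lapply(list(...), function(x) replace(x, !is.na(x) & x <= 0, NA))
  # pmin with na.rm = TRUE skips the masked entries;
  # a position with no positive value at all yields NA
  do.call(pmin, c(unname(cols), list(na.rm = TRUE)))
}

A <- c(1.1, 3.0, 2.0, 4.0, 0.0, 1.3)
B <- c(0.2, 1.0, 2.4, 1.1, 1.3, 0.0)
C <- c(5.2, 1.3, 3.7, 1.7, 1.3, 1.0)

posmin <- pminpos(A, B, C)
print(posmin)
#[1] 0.2 1.0 2.0 1.1 1.3 1.0
```

Inside a pipeline this could be used as mutate(posmin = pminpos(A, B, C)), and for many columns as do.call(pminpos, data) on a data frame that contains only the columns of interest.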
akrun answered Dec 07 '25 07:12



