I would like to redefine the mean function (to use it in a tabular() table) so that it omits all NA, NaN and Inf observations for a certain variable. I don't want to delete the whole row (observation); the mean function should simply calculate the mean over all values that are not NA, NaN or Inf.
Mean.new <- function(x) base::mean(x, na.rm=TRUE)
As far as I know, na.rm=TRUE in the standard mean() only removes NAs, not NaN and Inf.
So how do I extend the code above to check with is.finite() (which would exclude all NA, NaN and Inf)?
Thank you and best,
cork
With is.finite:
mean_new <- function(x) {mean(x[is.finite(x)])}
mean_new(c(NA,Inf,NaN,1,2))
[1] 1.5
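One detail worth checking alongside this: is.na(NaN) is TRUE, so na.rm = TRUE already drops NaN as well as NA, and only the infinite values are left to distort the result. The following is a small sketch comparing the two approaches on the same vector; the variable names na_rm_mean and finite_mean are only illustrative.

```r
x <- c(NA, Inf, NaN, 1, 2)

# is.na() flags both NA and NaN, which is what na.rm = TRUE relies on
is.na(x)

# is.finite() is FALSE for NA, NaN, Inf and -Inf
is.finite(x)

# na.rm = TRUE leaves Inf in place, so the mean is still infinite
na_rm_mean <- mean(x, na.rm = TRUE)

# subsetting with is.finite() keeps only 1 and 2
finite_mean <- mean(x[is.finite(x)])

print(na_rm_mean)
print(finite_mean)
```

So the is.finite() filter is mainly needed because of Inf and -Inf, not because of NaN.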
Base R defines a default method for the generic mean, so here is a way that works by defining a method for objects of class "numeric".
The example data is taken from Waldi's answer. Unlike in his answer, I negate is.infinite instead of using is.finite, because is.finite also returns FALSE for missing values (NA and NaN): filtering with it would always remove them and make the argument na.rm irrelevant. From the documentation ?is.finite:
Description
is.finite and is.infinite return a vector of the same length as x, indicating which elements are finite (not infinite and not missing) or infinite.
In this description, the "not missing" part applies to the finite elements only. The expected behavior of is.infinite is to return TRUE for -Inf and Inf, but not for NA or NaN.
The code then becomes
mean.numeric <- function(x, trim = 0, na.rm = FALSE, ...) {
  # drop -Inf/Inf only; NA and NaN are left for na.rm to handle
  x <- x[!is.infinite(x)]
  mean.default(x, trim = trim, na.rm = na.rm, ...)
}
y <- c(NA,Inf,NaN,1,2)
is.finite(y)
#[1] FALSE FALSE FALSE TRUE TRUE
!is.infinite(y)
#[1] TRUE FALSE TRUE TRUE TRUE
mean(y)
#[1] NA
mean(y, na.rm = TRUE)
#[1] 1.5
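If changing what mean() does for every numeric vector in the session is more than you want, the same logic can live in an ordinary wrapper function. This is only a sketch; the name mean_finite is made up for illustration and is not part of base R or of any package.

```r
# Hypothetical helper: drop -Inf/Inf, then defer to mean() and its na.rm
mean_finite <- function(x, na.rm = FALSE, ...) {
  x <- x[!is.infinite(x)]
  mean(x, na.rm = na.rm, ...)
}

y <- c(NA, Inf, NaN, 1, 2)

# missing values are kept by default, so the result is missing
res_default <- mean_finite(y)

# with na.rm = TRUE only 1 and 2 remain
res_na_rm <- mean_finite(y, na.rm = TRUE)

print(is.na(res_default))
print(res_na_rm)
```

A function like this can be passed wherever a summary function is expected, and na.rm keeps its usual meaning.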