I would like to add a new calculated column "new" that holds the mean of "snakes" per area. I tried to use the ave function, but it didn't work with mean. When I ran the same line with sum, it worked. What is the reason, and are there any other ways to get the desired result?
Here is my toy data frame:
df <- read.table(text = "snakes birds wolfs area
3 9 7 a
3 8 4 b
1 2 8 c
1 2 3 a
1 8 3 a
6 1 2 a
6 7 1 b
6 1 5 c ",header = TRUE)
Here is the working line of code:
df$sum <- ave(df$snakes, df$area, FUN=sum)
df
  snakes birds wolfs area sum
1      3     9     7    a  11
2      3     8     4    b   9
3      1     2     8    c   7
4      1     2     3    a  11
5      1     8     3    a  11
6      6     1     2    a  11
7      6     7     1    b   9
8      6     1     5    c   7
And here is the error that I get when I replace the sum function with the mean function:
df$avg <- ave(df$snakes, df$area, FUN=mean)
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'FUN' of mode 'function' was not found
The ave call with FUN=mean works as expected in R 3.2.2, R 3.1.0 (based on @Pascal's comment), and RStudio version 0.99.467, so we are not sure of the real reason behind the error. As far as mean is concerned, we don't need to specify it explicitly, because it is the default FUN in ave:
ave(df$snakes, df$area)
#[1] 2.75 4.50 3.50 2.75 2.75 2.75 4.50 3.50
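On the error itself: one way to reproduce that exact message is to have a non-function object named mean in the workspace, which would also explain why sum still worked. This is only a guess about the original session, not a confirmed diagnosis. The sketch below assumes such an object exists (the value 2.75 is made up for illustration) and shows two ways around it.

```r
snakes <- c(3, 3, 1, 1, 1, 6, 6, 6)
area   <- c("a", "b", "c", "a", "a", "a", "b", "c")

# Assumption: something like this was run earlier in the session,
# leaving a numeric object called `mean` in the workspace.
mean <- 2.75

# FUN = mean now passes the number, not the function, so ave() fails.
res <- tryCatch(ave(snakes, area, FUN = mean),
                error = function(e) conditionMessage(e))
print(res)

# Fix 1: remove the shadowing object, so `mean` resolves to base::mean again.
rm(mean)
avg <- ave(snakes, area, FUN = mean)

# Fix 2: refer to the function explicitly, whatever else is called `mean`.
avg2 <- ave(snakes, area, FUN = base::mean)
```

If exists("mean", mode = "numeric") returns TRUE in the failing session, this is likely what happened.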
A base R alternative is split/unsplit: we split 'snakes' by the 'area' column, get the mean of each list element, replicate it to the length of that element, and unsplit by 'area'.
unsplit(lapply(split(df$snakes, df$area),
               function(x) rep(mean(x), length(x))), df$area)
#[1] 2.75 4.50 3.50 2.75 2.75 2.75 4.50 3.50
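Another base R option, sketched here as an assumption rather than as part of the original answer, is to compute one mean per area with tapply and then look up each row's area in that result. The names grp_means and via_lookup are made up for illustration.

```r
df <- data.frame(snakes = c(3, 3, 1, 1, 1, 6, 6, 6),
                 area   = c("a", "b", "c", "a", "a", "a", "b", "c"))

# One mean per area, named by area level.
grp_means <- tapply(df$snakes, df$area, mean)

# Index the per-area means by each row's area; as.vector() drops the names.
via_lookup <- as.vector(grp_means[as.character(df$area)])

# Should agree with ave(), which uses mean by default.
all.equal(via_lookup, ave(df$snakes, df$area))
```

The as.character() call keeps the lookup by name whether 'area' is stored as a factor or as a character column.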
If installing other packages is an option, we can use either dplyr or data.table.
Using dplyr, we group by 'area', and create the 'avg' column with mutate.
library(dplyr)
df %>%
  group_by(area) %>%
  mutate(avg = mean(snakes))
Using data.table, we convert the 'data.frame' to a 'data.table' (setDT(df)) and, grouped by 'area', assign (:=) the mean of 'snakes' to the 'avg' column.
library(data.table)
setDT(df)[, avg := mean(snakes), by = area]