Setting a maximum limit for values in a data frame in R

Tags: r, max, limit, mean

I have a data frame in R with two columns: the first is a list of species names (species), the second is the number of occurrence records I have for that species (number). The number column varies widely: most values are <100, but a few are very high (>100,000), and there are many rows (~4000). Here is a simplified example:

    x <- data.frame(species = c("a","b","c","d","e","f","g","h","i","j"), number = c(53,17,67,989,135,67,13,786,100400,28))

Basically what I want to do is reduce the maximum number of records (the value in the number column) until the mean of all the values in this column stabilises.

To do this, I need to set a maximum limit for values in the number column so that any value > this limit is reduced to this maximum limit, and record the mean. I want to repeat this multiple times, each time reducing the maximum limit by 100.

I've not been able to find any similar questions online and am not really sure where to start with this! Any help, even just a point in the right direction, would be much appreciated! Cheers

kim1801 asked Jan 01 '26 12:01


1 Answer

You should use the pmin function, which takes the element-wise minimum, so any value above the limit is replaced by the limit:

    pmin(x$number, 1e3)
    # to test multiple limits:
    mns <- sapply(c(1e6, 1e4, 1e2), function(u) mean(pmin(x$number, u)))
droopy answered Jan 03 '26 03:01
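
To cover the stepped limits described in the question (lowering the cap by 100 each time and recording the mean), the pmin idea could be extended roughly as follows. This is only a sketch and not part of the answer above: the names limits, capped_means and result are made up for illustration, and the lower bound of 100 is an arbitrary assumption.

```r
# Example data from the question
x <- data.frame(
  species = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"),
  number  = c(53, 17, 67, 989, 135, 67, 13, 786, 100400, 28)
)

# Candidate limits: start at the largest observed value and step down by 100.
# The floor of 100 is an assumption; seq() will fail if max(x$number) < 100.
limits <- seq(max(x$number), 100, by = -100)

# Mean of the capped column for each limit (the original column is not modified)
capped_means <- vapply(
  limits,
  function(u) mean(pmin(x$number, u)),
  numeric(1)
)

# One row per limit, so the means can be inspected or plotted
result <- data.frame(limit = limits, mean = capped_means)

# Change in the mean between consecutive limits
step_change <- diff(result$mean)

head(result)
```

Plotting result$mean against result$limit, or looking at step_change, may help in judging where the mean levels off. What counts as "stabilised" is left open here, since the question does not define a threshold.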


