I want to find the nth smallest number in every column of a data.frame.
In the example below I try to get the second smallest value using dplyr's nth function. Can someone help with coding the function?
library(vegan)    # decostand() comes from vegan
library(vegclust) # provides the wetland data set
library(dplyr)
data(wetland)
dfnorm = decostand(wetland, "normalize")
dfchord = dist(dfnorm, method = "euclidean")
dfchord = data.frame(as.matrix(dfchord))
number_function = function(x) nth(x, 2) # can change 2 to any number
answer_vector = apply(dfchord, 2, number_function) # here, 2 means apply over columns
The actual answer would be something like this:
ans = c(0.5689322,0.579568297,0.315017693,0.315017693,0.632246369, 0.868563003, 0.704638684, 0.35827587, 0.725220337, 0.516397779) # length of 1:38
Just a warning: if you don't specify an order for dplyr's nth() (its order_by argument), it does no sorting at all and simply returns the element at position n. For example:
> sapply(mtcars, dplyr::nth, 2)
    mpg     cyl    disp      hp    drat      wt    qsec      vs      am    gear    carb
 21.000   6.000 160.000 110.000   3.900   2.875  17.020   0.000   1.000   4.000   4.000
which is actually just the second row of the data:
> mtcars[2,]
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
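To make dplyr's nth() return the nth smallest value, one option is to pass the vector itself as order_by. A minimal sketch, assuming dplyr is installed; the name second_smallest is only illustrative:

```r
library(dplyr)

# Passing the column as its own order_by makes nth() pick from the
# sorted values instead of from the original row order.
second_smallest <- sapply(mtcars, function(col) nth(col, 2, order_by = col))
second_smallest
```

Changing the 2 to another position gives the corresponding order statistic for each column.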
The nth function in Rfast does sort by default:
> sapply(mtcars, Rfast::nth, 2)
    mpg     cyl    disp      hp    drat      wt    qsec      vs      am    gear    carb
 10.400   4.000  75.700  62.000   2.760   1.615  14.600   0.000   0.000   3.000   1.000
If performance matters to you, the Rfast version was written to scale well by using a partial sort, which isn't true for solutions based on a full sort, order or rank (including dplyr::nth).
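Base R's sort() also accepts a partial argument, which only guarantees that the element at the requested index ends up in its sorted position. A minimal sketch of using it without extra packages; nth_smallest is a hypothetical helper name, and I have not benchmarked it against Rfast::nth:

```r
# Hypothetical helper: nth smallest value via a partial sort.
# sort(x, partial = n) places the nth order statistic at index n
# without fully sorting the rest of the vector.
nth_smallest <- function(x, n) sort(x, partial = n)[n]

partial_result <- sapply(mtcars, nth_smallest, n = 2)
partial_result
```

Note that n must not exceed the length of the vector, and that sort() drops NA values by default before sorting.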
Here is my example:
num_func <- function(x, n) nth(sort(x), n) # sort first, then take the nth element
sapply(dfchord, num_func, n = 2) # edited (thanks to @thelatemail's comment)
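Since dfchord is a full distance matrix, every column contains a 0 for a site's distance to itself, so the second smallest value is the distance to the nearest other site. A toy check in base R, using three made-up points on a line (pts, toy_dist and nearest are illustrative names, not part of the question's data):

```r
# Three made-up points at positions 0, 1 and 3 on a line.
pts <- data.frame(x = c(0, 1, 3))

# Full symmetric distance matrix as a data.frame, as in the question.
toy_dist <- data.frame(as.matrix(dist(pts, method = "euclidean")))

# Second smallest value per column: the smallest is the 0 on the diagonal.
nearest <- sapply(toy_dist, function(col) sort(col)[2])
nearest # values are 1, 1, 2
```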