I'm summarizing a data frame in dplyr with the summarize_all() function. If I do the following:
summarize_all(mydf, list(mean="mean", median="median", sd="sd"))
I get a tibble with 3 variables for each of my original measures, all suffixed by the type (mean, median, sd). Great! But when I try to capture the within-vector n's to calculate the standard deviations myself and to make sure missing cells aren't counted...
summarize_all(mydf, list(mean="mean", median="median", sd="sd", n="n"))
...I get an error:
Error in (function () : unused argument (var_a)
This is not an issue with my var_a vector. If I remove it, I get the same error for var_b, etc. The summarize_all function is producing odd results whenever I request n or n(), or if I use .funs() and list the descriptives I want to compute instead.
What's going on?
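The error occurs because n() takes no arguments, unlike mean() and median(). summarize_all() calls each function in the list with the column as its argument, and n() rejects that argument, hence "unused argument (var_a)". Use length() instead to get the number of elements in each column (note that length() counts NA values too):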
summarize_all(mydf, list(mean="mean", median="median", sd="sd", n="length"))
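If the count should exclude missing cells, as the question asks, one option is to count the non-NA values with a lambda instead. The following is a minimal sketch rather than a definitive solution: the data frame is a made-up stand-in for mydf (only the names var_a and var_b come from the question), and adding na.rm = TRUE to the other summaries is my assumption about what is wanted.

```r
library(dplyr)

# Hypothetical stand-in for mydf: two numeric columns, each with one NA.
mydf <- tibble(
  var_a = c(1, 2, NA, 4),
  var_b = c(10, NA, 30, 40)
)

result <- summarize_all(
  mydf,
  list(
    mean   = ~ mean(.x, na.rm = TRUE),
    median = ~ median(.x, na.rm = TRUE),
    sd     = ~ sd(.x, na.rm = TRUE),
    n      = ~ sum(!is.na(.x)),  # number of non-missing values
    n_all  = length              # number of elements, NA included
  )
)

result
```

With this toy data, var_a_n and var_b_n are both 3, while var_a_n_all and var_b_n_all are both 4. In dplyr 1.0.0 and later, summarize_all() is superseded by across(), so the same list of functions can be passed as summarize(mydf, across(everything(), list(...))).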