I would like to use stargazer to produce summary statistics for each category of a grouping variable. I could do it in separate tables, but I'd like it all in one – if that is not unreasonably challenging for this package.
For example,
library(stargazer)
stargazer(ToothGrowth, type = "text")
#>
#> =========================================
#> Statistic N Mean St. Dev. Min Max
#> -----------------------------------------
#> len 60 18.813 7.649 4.200 33.900
#> dose 60 1.167 0.629 0.500 2.000
#> -----------------------------------------
provides summary statistics for the continuous variables in ToothGrowth. I would like to split that summary by the categorical variable supp, also in ToothGrowth.
Two suggestions for the desired outcome (the formula syntax is hypothetical):
stargazer(ToothGrowth ~ supp, type = "text")
#>
#> ==================================================
#> Statistic N Mean St. Dev. Min Max
#> --------------------------------------------------
#> OJ len 30 20.663 6.606 8.200 30.900
#> dose 30 1.167 0.634 0.500 2.000
#> VC len 30 16.963 8.266 4.200 33.900
#> dose 30 1.167 0.634 0.500 2.000
#> --------------------------------------------------
#>
stargazer(ToothGrowth ~ supp, type = "text")
#>
#> ==================================================
#> Statistic N Mean St. Dev. Min Max
#> --------------------------------------------------
#> len
#> _by OJ 30 20.663 6.606 8.200 30.900
#> _by VC 30 16.963 8.266 4.200 33.900
#> _tot 60 18.813 7.649 4.200 33.900
#>
#> dose
#> _by OJ 30 1.167 0.634 0.500 2.000
#> _by VC 30 1.167 0.634 0.500 2.000
#> _tot 60 1.167 0.629 0.500 2.000
#> --------------------------------------------------
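library(stargazer)
library(dplyr)
library(tidyr)

ToothGrowth %>%
  group_by(supp) %>%
  mutate(id = 1:n()) %>%                    # row id within each supp group, so spread() has unique keys
  ungroup() %>%
  gather(temp, val, len, dose) %>%          # long format: one row per (id, supp, variable)
  unite(temp1, supp, temp, sep = '_') %>%   # combined names such as OJ_len and VC_dose
  spread(temp1, val) %>%                    # back to wide: one column per group/variable pair
  select(-id) %>%                           # drop the helper id so it is not summarised
  as.data.frame() %>%                       # hand stargazer() a plain data frame rather than a tibble
  stargazer(type = 'text')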
=========================================
Statistic N Mean St. Dev. Min Max
-----------------------------------------
OJ_dose 30 1.167 0.634 0.500 2.000
OJ_len 30 20.663 6.606 8.200 30.900
VC_dose 30 1.167 0.634 0.500 2.000
VC_len 30 16.963 8.266 4.200 33.900
-----------------------------------------
This addresses the problem the OP raised in a comment on the original answer: "What I really want is a single table with summary statistics separated by a categorical variable instead of creating separate tables." The easiest way I found to do that with stargazer was to build a new data frame with one column per group and variable, using a gather(), unite(), spread() strategy. The only trick is that spread() needs each row to be uniquely identified, so the code creates a row id within each group and drops that column before calling stargazer().
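gather() and spread() are superseded in current tidyr, so here is a rough sketch of the same reshaping with pivot_wider(), assuming tidyr 1.0.0 or later. The object name wide is just an illustrative choice. The idea is unchanged: a within-group row id keeps rows distinct, and names_glue builds column names such as OJ_len.

```r
library(stargazer)
library(dplyr)
library(tidyr)

# `wide` is a hypothetical name for the reshaped data frame
wide <- ToothGrowth %>%
  group_by(supp) %>%
  mutate(id = row_number()) %>%        # unique row id within each supp group
  ungroup() %>%
  pivot_wider(
    id_cols     = id,                  # one output row per within-group id
    names_from  = supp,                # OJ / VC become part of the column names
    values_from = c(len, dose),        # variables to spread across groups
    names_glue  = "{supp}_{.value}"    # e.g. OJ_len, VC_dose
  ) %>%
  select(-id) %>%                      # drop the helper id before summarising
  as.data.frame()                      # plain data frame for stargazer()

stargazer(wide, type = "text")
```

The column order in the resulting table may differ from the spread() version, since pivot_wider() does not sort the new column names by default.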