
Display the total number of bin elements in a stacked histogram with ggplot2

I'd like to show data values on a stacked histogram in ggplot2. After many attempts, the only way I found to show the total count for each bin is the following code:

library(ggplot2)

set.seed(1234)

df <- data.frame(
  sex=factor(rep(c("F", "M"), each=200)),
  weight=round(c(rnorm(200, mean=55, sd=5), rnorm(200, mean=65, sd=5)))
)

p<-ggplot(df, aes(x=weight, fill=sex, color=sex))
p<-p + geom_histogram(position="stack", alpha=0.5, binwidth=5)

tbl <- (ggplot_build(p)$data[[1]])[, c("x", "count")]
agg <- aggregate(tbl["count"], by=tbl["x"], FUN=sum)

for(i in 1:length(agg$x))
  if(agg$count[i])
    p <- p + geom_text(x=agg$x[i], y=agg$count[i] + 1.5, label=agg$count[i], colour="black" )

which generates the following plot:

[Image: the stacked histogram with the total count printed above each bin]

Is there a better (and more efficient) way to get the same result using ggplot2? Thanks a lot in advance

asked Jan 23 '26 12:01 by Anton


1 Answer

You can use stat_bin to count up the values and add text labels.

p <- ggplot(df, aes(x=weight)) +
  geom_histogram(aes(fill=sex, color=sex), 
                 position="stack", alpha=0.5, binwidth=5) +
  stat_bin(aes(y=..count.. + 2, label=..count..), geom="text", binwidth=5)

I moved the fill and color aesthetics to geom_histogram so that they apply only to that layer and not globally to the whole plot, because we want stat_bin to generate an overall count for each bin rather than separate counts for each level of sex. ..count.. is a variable computed by stat_bin that stores the bin counts.

[Image: the same stacked histogram, with bin totals added by stat_bin]
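
A note for newer ggplot2 versions: from 3.4.0 on, the ..count.. notation is deprecated in favour of after_stat(count), and after_stat() itself should be available from 3.3.0. Below is a sketch of the same plot in that notation, assuming a recent ggplot2. The data frame from the question is rebuilt here only so that the block runs on its own.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(1234)
df <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = round(c(rnorm(200, mean = 55, sd = 5), rnorm(200, mean = 65, sd = 5)))
)

p <- ggplot(df, aes(x = weight)) +
  # fill/color are mapped in this layer only, so the text layer bins all rows together
  geom_histogram(aes(fill = sex, color = sex),
                 position = "stack", alpha = 0.5, binwidth = 5) +
  # after_stat(count) refers to the per-bin count computed by stat_bin
  stat_bin(aes(y = after_stat(count) + 2, label = after_stat(count)),
           geom = "text", binwidth = 5)

p
```

The offset of 2 is the same hand-picked value as above, so it may need adjusting for data with much larger or smaller counts.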

In this case, it was straightforward to add the counts directly. However, in more complicated situations, you might sometimes want to summarise the data outside of ggplot and then feed the summary data to ggplot. Here's how you would do that in this case:

library(dplyr)

counts = df %>% group_by(weight = cut(weight, seq(30,100,5), right=FALSE)) %>%
  summarise(n = n())

countsByGroup = df %>% group_by(sex, weight = cut(weight, seq(30,100,5), right=FALSE)) %>%
  summarise(n = n())

# fill and color are mapped only in geom_bar, because `counts` has no sex column
ggplot(countsByGroup, aes(x=weight, y=n)) +
  geom_bar(aes(fill=sex, color=sex), stat="identity", alpha=0.5, width=1) +
  geom_text(data=counts, aes(label=n, y=n+2), colour="black")

Or, you can just create countsByGroup and then create the equivalent of counts on the fly inside ggplot:

# summarise() returns one row per bin, so each total is drawn only once
ggplot(countsByGroup, aes(x=weight, y=n)) +
  geom_bar(aes(fill=sex, color=sex), stat="identity", alpha=0.5, width=1) +
  geom_text(data=countsByGroup %>% group_by(weight) %>% summarise(n=sum(n)), 
            aes(label=n, y=n+2), colour="black")
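
One thing to watch when summarising outside ggplot: the cut() breaks used above, seq(30,100,5) with right=FALSE, put the bin edges at multiples of 5, which need not be where geom_histogram(binwidth=5) places them, so the pre-computed totals can differ from those in the histogram. If the two have to agree, one option is to give both the same explicit breaks. The sketch below assumes edges at 27.5, 32.5, ..., 102.5 (an arbitrary choice that is wide enough for this simulated data) and compares the counts from the two routes.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(1234)
df <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = round(c(rnorm(200, mean = 55, sd = 5), rnorm(200, mean = 65, sd = 5)))
)

# Assumed bin edges, shared by both approaches
breaks <- seq(27.5, 102.5, by = 5)

# Totals per bin computed outside ggplot2
manual <- as.vector(table(cut(df$weight, breaks)))

# Histogram drawn on the same edges; `breaks` takes precedence over binwidth
p <- ggplot(df, aes(x = weight)) +
  geom_histogram(breaks = breaks)
built <- layer_data(p, 1)$count

# Compare the non-empty bins, from left to right
same <- identical(as.numeric(manual[manual > 0]), as.numeric(built[built > 0]))
same
```

Because the weights are whole numbers and the edges sit on half-integers, no value falls exactly on an edge, so the open or closed side of the intervals does not matter here.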

answered Jan 25 '26 05:01 by eipi10


