I want to plot a (faceted) stacked bar plot in which each bar on the x-axis is scaled to 100%, with the frequency labels displayed inside the bars.
After quite some work and reading many different questions on Stack Overflow, I found a way to do this with ggplot2. However, I don't let ggplot2 do the work directly: I aggregate my data manually with a table call, in a complicated way, and compute the percent values by hand with temporary variables (see the source code comment "manually aggregate data").
How can I produce the same plot in a nicer way, without the manual and complicated data aggregation?
library(ggplot2)
library(scales)
library(gridExtra)
library(plyr)
##
## Random Data
##
fact1 <- factor(floor(runif(1000, 1, 6)),
                labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)),
                labels = c("g1", "g2", "g3", "g4", "g5"))
##
## STACKED BAR PLOT that scales x-axis to 100%
##
## manually aggregate data
##
mytable <- as.data.frame(table(fact1, fact2))
colnames(mytable) <- c("caseStudyID", "Group", "Freq")
mytable$total <- sapply(mytable$caseStudyID,
                        function(caseID) sum(subset(mytable, caseStudyID == caseID)$Freq))
mytable$percent <- round((mytable$Freq/mytable$total)*100,2)
mytable2 <- ddply(mytable, .(caseStudyID), transform, pos = cumsum(percent) - 0.5*percent)
## all case studies in one plot (SCALED TO 100%)
p1 <- ggplot(mytable2, aes(x = caseStudyID, y = percent, fill = Group)) +
  geom_bar(stat = "identity") +
  theme(legend.key.size = unit(0.4, "cm")) +
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  ## the ifelse guards against printing labels with "0" within a bar
  geom_text(aes(label = ifelse(Freq > 0, Freq, NA), y = pos), size = 3)
print(p1)

After you make the data:
fact1 <- factor(floor(runif(1000, 1, 6)),
                labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)),
                labels = c("g1", "g2", "g3", "g4", "g5"))
dat = data.frame(caseStudyID=fact1, Group=fact2)
You can automate making an unlabeled graph of the kind that you want with position_fill:
ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")
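
Since the question asks for an axis in percent, here is a minimal sketch that formats the 0 to 1 proportions produced by position="fill" as percentages. It assumes the scales package is installed (the question already loads it). The set.seed call, the axis title, and the object name p_pct are additions for illustration only.

```r
library(ggplot2)

set.seed(42)  # illustrative seed, not in the original, so the plot is reproducible

# Same random data as above
fact1 <- factor(floor(runif(1000, 1, 6)), labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)), labels = c("g1", "g2", "g3", "g4", "g5"))
dat <- data.frame(caseStudyID = fact1, Group = fact2)

# position = "fill" rescales every bar to the range 0..1;
# scales::percent only changes how the axis ticks are printed
p_pct <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  ylab("percent")

print(p_pct)
```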

I don't know if there's a way to generate the text labels automatically. The positions and counts from the stacked graph are accessible with ggplot_build, if you want to use what ggplot calculates instead of doing it separately.
p = ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")
ggplot_build(p)$data[[1]]
That will return a data frame with (among other things) count, x, y, ymin, and ymax variables, which can be used to create positioned labels.
If you want the labels vertically centered in each category, first make a column with values halfway between ymin and ymax.
freq = ggplot_build(p)$data[[1]]
freq$y_pos = (freq$ymin + freq$ymax) / 2
Then add the labels to the graph with annotate.
p + annotate(x=freq$x, y=freq$y_pos, label=freq$count, geom="text", size=3)
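
As a possible alternative to the ggplot_build route, the following sketch lets ggplot2 count and position the labels itself, in a single plot call. It assumes a reasonably recent ggplot2: after_stat() was added in version 3.3.0, and the vjust argument of position_fill() is needed to center the labels. The set.seed call and the name p_lab are illustrative additions, not part of the original answer.

```r
library(ggplot2)

set.seed(42)  # illustrative seed, not in the original

# Same random data as above
fact1 <- factor(floor(runif(1000, 1, 6)), labels = c("A", "B", "C", "D", "E"))
fact2 <- factor(floor(runif(1000, 1, 6)), labels = c("g1", "g2", "g3", "g4", "g5"))
dat <- data.frame(caseStudyID = fact1, Group = fact2)

p_lab <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  # stat = "count" computes the same counts geom_bar uses;
  # position_fill(vjust = 0.5) places each label in the middle of its segment
  geom_text(aes(label = after_stat(count)),
            stat = "count",
            position = position_fill(vjust = 0.5),
            size = 3)

print(p_lab)
```

Because the text layer inherits the fill mapping, its groups match the bar segments, so no separate aggregation or position column is required.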
