I want to plot a very simple boxplot like this in R:
[image: desired graph]
It is a log-link Gamma generalized linear model of a continuous dependent variable (jh_conc, a hormone concentration) for a categorical grouping variable (group: type of bee).
The script I have so far is:
> jh=read.csv("data_jh_titer.csv",header=T)
> jh
group jh_conc
1 Queens 6.38542714
2 Queens 11.22512563
3 Queens 7.74472362
4 Queens 11.56834171
5 Queens 3.74020100
6 Virgin Queens 0.06080402
7 Virgin Queens 0.12663317
8 Virgin Queens 0.08090452
9 Virgin Queens 0.04422111
10 Virgin Queens 0.14673367
11 Workers 0.03417085
12 Workers 0.02449749
13 Workers 0.02927136
14 Workers 0.01648241
15 Workers 0.02150754
library(ggplot2)
fit1 <- glm(jh_conc ~ group, family = Gamma(link = log), data = jh)
# boxplot of the raw data, drawn on a log-transformed y axis
ggplot(jh, aes(group, jh_conc)) +
  geom_boxplot(aes(fill = group)) +
  coord_trans(y = "log")
The resulting plot looks like this:
My question is: which (geom) extensions can I use to split the y-axis and rescale the two parts differently? Also, how do I add the black circles (averages, calculated on the log scale and then back-transformed to the original scale) and the horizontal lines showing significance levels from post hoc tests performed on log-transformed data (**: p < 0.01, ***: p < 0.001)?
You can't create a broken numeric axis in ggplot2 by design, mainly because it visually distorts the data and the differences being represented and is considered misleading.
You can, however, use scale_y_log10() + annotation_logticks() to condense data spanning a wide range of values or to better show heteroskedastic data. You can also use annotate() to build the stars and bars that represent your p-values.
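If you would rather derive the star labels than type them by hand, something like the following might work. This is only a sketch: the question does not say which post hoc test was used, so Tukey's HSD on the log-transformed response is an assumption here, and the data frame is rebuilt inline from the values printed in the question so that the snippet runs on its own.

```r
# Data copied from the question; group is made a factor because TukeyHSD() needs one
jh <- data.frame(
  group = factor(rep(c("Queens", "Virgin Queens", "Workers"), each = 5)),
  jh_conc = c(6.38542714, 11.22512563, 7.74472362, 11.56834171, 3.74020100,
              0.06080402, 0.12663317, 0.08090452, 0.04422111, 0.14673367,
              0.03417085, 0.02449749, 0.02927136, 0.01648241, 0.02150754)
)

# Assumed post hoc test: Tukey HSD on the log-transformed concentrations
tuk <- TukeyHSD(aov(log(jh_conc) ~ group, data = jh))
p_adj <- tuk$group[, "p adj"]

# Map adjusted p-values to the usual star labels
stars <- as.character(cut(p_adj,
                          breaks = c(0, 0.001, 0.01, 0.05, 1),
                          labels = c("***", "**", "*", "ns"),
                          include.lowest = TRUE))

data.frame(comparison = names(p_adj), p_adj = p_adj, stars = stars)
```

The stars column can then be passed as the label argument of the annotate("text", ...) calls, one per comparison.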
Also, you can easily grab information from a model through its named components; here we care about the coefficients, fit2$coef:
library(ggplot2)
# make a zero intercept version for easy plotting
fit2 <- glm(jh_conc ~ 0 + group, family = Gamma(link = log), data = jh)
# extract relevant group means and use exp() to scale back
means <- data.frame(group = sub("^group", "", names(coef(fit2))), means = exp(coef(fit2)))
# plot the raw data; the model only supplies the points in `means`
ggplot(jh, aes(group, jh_conc)) +
  geom_boxplot(aes(fill = group)) +
  # plot the circles from the model extraction (means)
  geom_point(data = means, aes(y = means), size = 4, shape = 21, color = "black", fill = NA) +
  # use this instead of coord_trans
  scale_y_log10() + annotation_logticks(sides = "l") +
  # use annotate "segment" to draw the horizontal lines
  annotate("segment", x = 1, xend = 2, y = 15, yend = 15) +
  # use annotate "text" to add your p-value stars
  annotate("text", x = 1.5, y = 15.5, label = "**", size = 4) +
  annotate("segment", x = 1, xend = 3, y = 20, yend = 20) +
  annotate("text", x = 2, y = 20.5, label = "***", size = 4) +
  annotate("segment", x = 2, xend = 3, y = .2, yend = .2) +
  annotate("text", x = 2.5, y = .25, label = "**", size = 4)
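One thing worth checking before settling on the circles: for a Gamma GLM whose only predictor is the grouping factor, exp() of the coefficients reproduces the arithmetic group means, whereas averages "calculated on a log scale and then back-transformed", as described in the question, are geometric means. The sketch below compares the three; the data frame is rebuilt from the values printed in the question, and the variable names model_means, arith_means and geo_means are only illustrative.

```r
# Data copied from the question
jh <- data.frame(
  group = factor(rep(c("Queens", "Virgin Queens", "Workers"), each = 5)),
  jh_conc = c(6.38542714, 11.22512563, 7.74472362, 11.56834171, 3.74020100,
              0.06080402, 0.12663317, 0.08090452, 0.04422111, 0.14673367,
              0.03417085, 0.02449749, 0.02927136, 0.01648241, 0.02150754)
)

fit2 <- glm(jh_conc ~ 0 + group, family = Gamma(link = log), data = jh)

# Back-transformed model coefficients
model_means <- exp(coef(fit2))
# Plain arithmetic means per group
arith_means <- tapply(jh$jh_conc, jh$group, mean)
# Means taken on the log scale, then back-transformed (geometric means)
geo_means <- exp(tapply(log(jh$jh_conc), jh$group, mean))

round(cbind(model = unname(model_means),
            arithmetic = arith_means,
            geometric = geo_means), 4)
```

If the geometric means are what you want to show, pass geo_means to geom_point() instead of the exponentiated coefficients.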