
How to create a customized "geom_box" in ggplot2?

Tags: r, ggplot2, boxplot

I'd like to have something similar to geom_boxplot, but one that draws only a box and lets me set the functions used for the lower and upper edges of the box, for example to show plus or minus 1 SD of the data around the mean. I am not sure whether stat_boxplot can be used for this purpose or whether some other function would fit better.

This can be (almost) done manually on data using stat="identity" and pre-computation, for example:

library(ggplot2)

y <- rnorm(100)
y1 <- mean(y) - sd(y)  # lower edge: mean - 1 SD
y2 <- mean(y) + sd(y)  # upper edge: mean + 1 SD
df1 <- data.frame(y)
# Pre-computed boxplot statistics; the whiskers coincide with the box edges
df2 <- data.frame(
  x = 1,
  y0 = y1,
  y25 = y1,
  y50 = y1, # this is a problem...
  y75 = y2,
  y100 = y2
)

ggplot(df1, aes(x = 1, y = y)) +
  geom_boxplot() +
  geom_boxplot(
    data = df2,
    mapping = aes(x = 1, y = 1, ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
    stat = "identity", alpha = 0.1, fill = "red"
  )


This example has several problems:

  1. The box is not the same width as the boxplot.
  2. The lower edge of the box has a thicker line than the upper edge (since I needed to say where "middle" should be).
  3. The computation needs to happen manually for each data set (this could be wrapped in a function, but due to the other problems, I didn't get to that yet).

In short, I'd like something like geom_box but couldn't find it with a Google search, and I'd be happy for some directions on how to proceed with writing such a customized geom function (I guess this is a start, but some more help would be welcome).

Tal Galili asked Oct 22 '25 09:10


1 Answer

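library(ggplot2)

# Summary function for stat_summary(): returns the five boxplot statistics,
# with the box spanning mean +/- 1 SD and the whiskers collapsed onto its edges
newbox <- function(values) {
  values <- na.omit(values)
  data.frame(
    ymin = mean(values) - sd(values),
    lower = mean(values) - sd(values),
    middle = mean(values),
    upper = mean(values) + sd(values),
    ymax = mean(values) + sd(values),
    width = 0.75
  )
}

# fatten = NA suppresses the middle line
ggplot(iris, aes(Species, Sepal.Length)) +
  stat_summary(fun.data = newbox, geom = "boxplot", fatten = NA)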


Like that?
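
Since the question asks for a box whose lower and upper edges can be set by arbitrary functions, the same idea could be generalized roughly as follows. This is only a sketch: make_box, sd_box and range_box are hypothetical names introduced here for illustration, not part of ggplot2. The middle line is left visible in this version and marks whatever middle_fun returns.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns a fun.data-style summary
# function whose box edges are computed by the supplied functions.
make_box <- function(lower_fun, upper_fun, middle_fun = mean, width = 0.75) {
  function(values) {
    values <- values[!is.na(values)]
    lo <- lower_fun(values)
    hi <- upper_fun(values)
    data.frame(
      ymin = lo, lower = lo,          # whiskers collapsed onto the box edges
      middle = middle_fun(values),
      upper = hi, ymax = hi,
      width = width
    )
  }
}

# Box spanning mean +/- 1 SD, with the middle line at the mean
sd_box <- make_box(function(v) mean(v) - sd(v), function(v) mean(v) + sd(v))

# Box spanning the full range, with the middle line at the median
range_box <- make_box(min, max, middle_fun = median)

p <- ggplot(iris, aes(Species, Sepal.Length)) +
  stat_summary(fun.data = sd_box, geom = "boxplot")
print(p)
```

Swapping sd_box for range_box (or any other pair of edge functions) changes the box without touching the plotting code.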

If you want it with no fill, you could use built-in functions like:

# mean_sdl() requires the Hmisc package; mult = 1 gives mean +/- 1 SD
ggplot(iris, aes(Species, Sepal.Length)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1),
               geom = "crossbar", fatten = NA)
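
If Hmisc is not installed, a similar crossbar can probably be had from the fun, fun.min and fun.max arguments of stat_summary (assuming ggplot2 3.3.0 or later, where these arguments are available). A minimal sketch, which leaves the middle line visible at the mean:

```r
library(ggplot2)

# Mean +/- 1 SD per group using only base R summary functions
# (no Hmisc dependency); the crossbar's middle line marks the mean.
p <- ggplot(iris, aes(Species, Sepal.Length)) +
  stat_summary(
    fun = mean,
    fun.min = function(v) mean(v) - sd(v),
    fun.max = function(v) mean(v) + sd(v),
    geom = "crossbar"
  )
print(p)
```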

Brian answered Oct 24 '25 23:10