Can I fix overlapping dashed lines in a histogram in ggplot2?

I am trying to plot a histogram of two overlapping distributions in ggplot2. Unfortunately, the graphic needs to be in black and white. I tried representing the two categories with different shades of grey, with transparency, but the result is not as clear as I would like. I tried adding outlines to the bars with different linetypes, but this produced some strange results.

library(ggplot2)
set.seed(65)
a <- rnorm(100, mean = 1, sd = 1)
b <- rnorm(100, mean = 3, sd = 1)
dat <- data.frame(category = rep(c('A', 'B'), each = 100),
                  values = c(a, b))

ggplot(data = dat, aes(x = values, linetype = category, fill = category)) +
  geom_histogram(colour = 'black', position = 'identity', alpha = 0.4, binwidth = 1) +
  scale_fill_grey()

[Image: histogram produced by the code above]

Notice that one of the outlines that should appear dotted is in fact solid (at x = 4). I think this is because it is actually two lines drawn on top of each other: one from the 3-4 bar and one from the 4-5 bar. The dots are out of phase, so together they produce a solid line. The effect is rather ugly and inconsistent.

  1. Is there any way of fixing this overlap?
  2. Can anyone suggest a more effective way of clarifying the difference between the two categories, without resorting to colour?

Many thanks.

asked Oct 16 '25 17:10 by user2390246

1 Answer

One possibility would be to use a 'hollow histogram', in which only the outline of each distribution is drawn as a step line:

# assign your original plot object to a variable 
p1 <- ggplot(data = dat, aes(x = values, linetype = category, fill = category)) +
  geom_histogram(colour = 'black', position = 'identity', alpha = 0.4, binwidth = 0.4) +
  scale_fill_grey()
# p1

# extract relevant variables from the plot object to a new data frame
# your grouping variable 'category' is named 'group' in the plot object
df <- ggplot_build(p1)$data[[1]][ , c("xmin", "y", "group")]

# plot using geom_step
ggplot(data = df, aes(x = xmin, y = y, linetype = factor(group))) +
  geom_step()
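
As a possible shortcut (my own sketch, not part of the approach above), a similar outline can be drawn without ggplot_build() by letting geom_step() do the binning itself through stat = "bin". This assumes ggplot2 3.3.0 or later, where direction = "mid" is available; because stat_bin places x at the bin centre, "mid" is what puts the steps on the bin edges. pad = TRUE adds an empty bin at each end so the line returns to zero. The object name p_step is only for illustration.

```r
library(ggplot2)

# same data as in the question
set.seed(65)
dat <- data.frame(category = rep(c("A", "B"), each = 100),
                  values = c(rnorm(100, mean = 1, sd = 1),
                             rnorm(100, mean = 3, sd = 1)))

# bin inside the layer and draw the counts as steps;
# x is the bin centre, so direction = "mid" steps at the bin edges
p_step <- ggplot(dat, aes(x = values, linetype = category)) +
  geom_step(stat = "bin", binwidth = 0.4, position = "identity",
            direction = "mid", pad = TRUE) +
  theme_bw()
p_step
```

Since no bars are drawn, there are no shared vertical edges for the dashes to overlap on.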


If you want to vary both linetype and fill, you need to plot a histogram first (which can be filled), with its outline colour set to transparent, and then add the geom_step on top. Use theme_bw to avoid 'grey elements on grey background'.

p1 <- ggplot() +
  geom_histogram(data = dat, aes(x = values, fill = category),
                 colour = "transparent", position = 'identity', alpha = 0.4, binwidth = 0.4) +
  scale_fill_grey()

df <- ggplot_build(p1)$data[[1]][ , c("xmin", "y", "group")]
df$category <- factor(df$group, labels = c("A", "B"))

p1 +
  geom_step(data = df, aes(x = xmin, y = y, linetype = category)) +
  theme_bw()
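
One detail of stepping through xmin is that the line stops at the left edge of the last bin, so the top of that bar and the two outer vertical edges are not drawn. If that matters, a minimal sketch along the following lines could close the outline. close_outline() is a hypothetical helper written for this example, not a ggplot2 function, and layer_data(p1, 1) is used here as an equivalent of ggplot_build(p1)$data[[1]].

```r
library(ggplot2)

# same data as in the question
set.seed(65)
dat <- data.frame(category = rep(c("A", "B"), each = 100),
                  values = c(rnorm(100, mean = 1, sd = 1),
                             rnorm(100, mean = 3, sd = 1)))

# filled histogram without bar outlines, as above
p1 <- ggplot() +
  geom_histogram(data = dat, aes(x = values, fill = category),
                 colour = "transparent", position = "identity",
                 alpha = 0.4, binwidth = 0.4) +
  scale_fill_grey()

# bin edges and counts computed by the histogram layer
bins <- layer_data(p1, 1)[, c("xmin", "xmax", "y", "group")]

# hypothetical helper: start at zero on the left edge of the first bin,
# extend the last count to the right edge of the last bin, then drop to zero
close_outline <- function(d) {
  d <- d[order(d$xmin), ]
  n <- nrow(d)
  data.frame(x = c(d$xmin[1], d$xmin, d$xmax[n], d$xmax[n]),
             y = c(0, d$y, d$y[n], 0),
             group = d$group[1])
}

outline <- do.call(rbind, lapply(split(bins, bins$group), close_outline))
outline$category <- factor(outline$group, labels = c("A", "B"))

p1 +
  geom_step(data = outline, aes(x = x, y = y, linetype = category)) +
  theme_bw()
```

Each group gains three extra points, so the outline begins and ends on the x axis.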


answered Oct 18 '25 10:10 by Henrik
