
r - ggplot2 - Add differences to grouped bar charts

I am plotting the following data with ggplot2:

library(ggplot2)

DF <- structure(list(Type = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 
2L), .Label = c("Observed", "Simulated"), class = "factor"), 
    variable = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("EM to V6", 
    "V6 to R0", "R0 to R4", "R4 to R9"), class = "factor"), value = c(28, 
    30, 29, 35, 32, 34, 26, 29)), row.names = c(NA, -8L), .Names = c("Type", 
"variable", "value"), class = "data.frame")

ggplot(DF, aes(variable, value)) +
      geom_bar(aes(fill = Type), position = "dodge", stat="identity", width=.5) +
      geom_text(aes(label=value, group=Type), position=position_dodge(width=0.5), vjust=-0.5) +
      theme_bw(base_size = 18) +
      ylab('Duration (days)') + xlab('Growth stages')

Is there a graphical way to add the difference between the two bars in each group to the chart?

This is the data frame with the differences to be added:

DF2 <- data.frame(variable=c("EM to V6", "V6 to R0", "R0 to R4", "R4 to R9"), value=c(2,6,2,3))
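
The differences can also be derived from DF instead of being typed by hand. The following is a minimal base R sketch, assuming the same values as DF above (rebuilt here in compact form so the block runs on its own); `stages` and `DF2_check` are names introduced only for this illustration.

```r
# Rebuild the question's data in compact form (same values as DF above).
stages <- c("EM to V6", "V6 to R0", "R0 to R4", "R4 to R9")
DF <- data.frame(
  Type     = factor(rep(c("Observed", "Simulated"), times = 4)),
  variable = factor(rep(stages, each = 2), levels = stages),
  value    = c(28, 30, 29, 35, 32, 34, 26, 29)
)

# Split by type, then align the simulated rows to the observed rows by stage
# before subtracting, so the result does not depend on row order.
obs <- DF[DF$Type == "Observed", ]
sim <- DF[DF$Type == "Simulated", ]
DF2_check <- data.frame(
  variable = obs$variable,
  value    = sim$value[match(obs$variable, sim$variable)] - obs$value
)

print(DF2_check)
```

The resulting `value` column should agree with the hand-written DF2 (Simulated minus Observed for each growth stage).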

The final chart would look somewhat like the variance chart at the source below (notice the coloured bars):

source: https://www.excelcampus.com/charts/variance-clustered-column-bar-chart/

Is it possible to do this using ggplot2?

thiagoveloso asked Dec 01 '25 12:12


1 Answer

As rawr suggested, you can add a layer of bars behind the current ones with a slightly smaller width:

library(tidyverse)
diff_df = DF %>%
    group_by(variable) %>%
    spread(Type, value) %>%
    mutate(diff = Simulated - Observed)

ggplot(DF, aes(variable, value)) +
    geom_bar(aes(y = Simulated), data = diff_df, stat = "identity", fill = "grey80", width = 0.4) +
    geom_bar(aes(fill = Type), position = "dodge", stat="identity", width=.5) +
    geom_text(aes(label=value, group=Type), position=position_dodge(width=0.5), vjust=-0.5) +
    geom_text(aes(label = diff, y = Simulated), vjust=-0.5, data = diff_df, hjust = 2, colour = scales::muted("red")) +
    theme_bw(base_size = 18) +
    ylab('Duration (days)') + xlab('Growth stages')

Updated code to deal with Observed sometimes being higher than Simulated:

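library(tidyverse)
# Reshape to one row per growth stage, with Observed and Simulated as columns
diff_df = DF %>%
    group_by(variable) %>%
    spread(Type, value) %>%
    mutate(diff = Simulated - Observed,
           # row-wise maximum: height of the grey background bar
           max_y = pmax(Simulated, Observed),
           sim_higher = Simulated > Observed)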

ggplot(DF, aes(variable, value)) +
    geom_bar(aes(y = max_y), data = diff_df, stat = "identity", fill = "grey80", width = 0.4) +
    geom_bar(aes(fill = Type), position = "dodge", stat="identity", width=.5) +
    geom_text(aes(label=value, group=Type), position=position_dodge(width=0.5), vjust=-0.5) +
    geom_text(aes(label = diff, y = max_y), vjust=-0.5, data = diff_df %>% filter(sim_higher), 
              hjust = 2, colour = scales::muted("red")) +
    geom_text(aes(label = diff, y = max_y), vjust=-0.5, data = diff_df %>% filter(!sim_higher), 
              hjust = -1, colour = scales::muted("red")) +
    theme_bw(base_size = 18) +
    ylab('Duration (days)') + xlab('Growth stages')
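
For newer tidyverse versions, the same idea can be sketched with `pivot_wider()` (which the tidyr documentation lists as the replacement for the superseded `spread()`) and `geom_col()` (shorthand for `geom_bar(stat = "identity")`). This is only a sketch under those assumptions, not a change to the approach above; the `label_hjust` column is a name made up for this example, and it folds the two difference-label layers into one by mapping `hjust` as an aesthetic.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same values as DF in the question, rebuilt compactly so the block is self-contained.
stages <- c("EM to V6", "V6 to R0", "R0 to R4", "R4 to R9")
DF <- data.frame(
  Type     = factor(rep(c("Observed", "Simulated"), times = 4)),
  variable = factor(rep(stages, each = 2), levels = stages),
  value    = c(28, 30, 29, 35, 32, 34, 26, 29)
)

# One row per growth stage with Observed and Simulated as columns.
diff_df <- DF %>%
  pivot_wider(names_from = Type, values_from = value) %>%
  mutate(
    diff        = Simulated - Observed,
    max_y       = pmax(Simulated, Observed),    # height of the grey background bar
    sim_higher  = Simulated > Observed,
    label_hjust = ifelse(sim_higher, 2, -1)     # hypothetical helper: push the label to the shorter bar's side
  )

p <- ggplot(DF, aes(variable, value)) +
  # Background bar, slightly narrower than the dodged pair, drawn first so it sits behind
  geom_col(aes(y = max_y), data = diff_df, fill = "grey80", width = 0.4) +
  geom_col(aes(fill = Type), position = "dodge", width = 0.5) +
  geom_text(aes(label = value, group = Type),
            position = position_dodge(width = 0.5), vjust = -0.5) +
  # Single label layer for the differences; hjust varies per row
  geom_text(aes(label = diff, y = max_y, hjust = label_hjust), data = diff_df,
            vjust = -0.5, colour = scales::muted("red")) +
  theme_bw(base_size = 18) +
  ylab("Duration (days)") + xlab("Growth stages")

print(p)
```

Using `pmax()` rather than `max()` keeps the row-wise maximum correct even without the `group_by(variable)` step.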

Marius answered Dec 03 '25 02:12


