
How to plot multiple curves with a multi factor table?

Tags: r, ggplot2

I have a table that looks like this:

Condition   downsampling    E. coli S. cerevisiae
 Treated    45000000    1   0.944968385
 Treated    40000000    1   0.932060195
 Treated    32000000    1   0.900323585
 Treated    16000000    0.99999549  0.73127366
 Treated    8000000 0.99993898  0.503170515
 Treated    4000000 0.99892133  0.30287704
 Treated    2000000 0.97810106  0.16861184
 Treated    1000000 0.86656028  0.089200035
Untreated   45000000    0.886457145 0.108071345
Untreated   40000000    0.85728706  0.09729946
Untreated   32000000    0.79344402  0.08072991
Untreated   16000000    0.553520285 0.04306675
Untreated   8000000 0.337149605 0.023035225
Untreated   4000000 0.18756713  0.0119472
Untreated   2000000 0.097686445 0.006072755
Untreated   1000000 0.05007619  0.0031243

How can I generate a graph in R with 4 curves, one for each combination of condition and organism? See the attached picture, generated in Excel, for what I'd like to obtain.

[Image: graph with 4 curves]

I tried something with ggplot along these lines, but it doesn't work:

test <- ggplot(dfuse, aes(x=Downsampling))+
  geom_line(data = dfuse[dfuse$Treatment=="Untreated",], aes(x=Downsampling, y=E.coli, color="E.coli untreated"), size=1.5) +
  geom_line(data = dfuse[dfuse$Treatment=="Untreated",], aes(x=Downsampling, y=S.cerevisiae, color="S.cerevisiae untreated"), size=1.5) +
  geom_line(data = dfuse[dfuse$Treatment=="ALU Plus 120K 1hr 1X 42C",], aes(x=Downsampling, y=E.coli, color="E.coli treated"), size=1.5) +
  geom_line(data = dfuse[dfuse$Treatment=="ALU Plus 120K 1hr 1X 42C",], aes(x=Downsampling, y=S.cerevisiae, color="S.cerevisiae treated"), size=1.5) +
  labs(x="Downsampling", y="Genome Coverage (%)") +
  scale_color_manual(values=c("E.coli untreated"="blue", "S.cerevisiae untreated"="red", "E.coli treated"="green", "S.cerevisiae treated"="purple")) +
  theme_minimal()
  
test
  
ggsave(test,filename = "test.png", height=8, width=10)
desplat yvain asked Oct 23 '25 15:10


1 Answer

library(dplyr)
library(tidyr)
library(ggplot2)

# Reshape to long format: one row per condition, depth, and organism
dfuse %>% 
  pivot_longer(-c(Condition, downsampling)) %>% 
  # Build labels such as "E.coli treated" to match the names in scale_color_manual()
  mutate(cb = paste(gsub("\\.\\.", "\\.", name), tolower(Condition))) %>% 
  ggplot(aes(x = downsampling/1e6, y = value, color = cb, group = cb)) + 
  # Lines are drawn in black; only the points carry the series color
  geom_line(color = "black") +
  geom_point() + 
  labs(title = "Microbial genome breadth of coverage",
       x="Downsampling (Millions)", y="Genome Coverage (%)") +
  scale_x_continuous(limits = c(0, 50)) +
  # Values are fractions (0 to 1), so format the axis as percentages
  scale_y_continuous(labels = scales::percent) +
  scale_color_manual(values=c("E.coli untreated"="blue", 
                              "S.cerevisiae untreated"="red", 
                              "E.coli treated"="green", 
                              "S.cerevisiae treated"="purple"),
                    name = "") +
  theme_classic() +
  theme(legend.position="bottom",
        plot.title = element_text(hjust = 0.5))
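
To see what the reshaping step in the pipeline above hands to ggplot, here is a minimal sketch on two rows copied from the question's table. It assumes the column names `E..coli` and `S..cerevisiae`, which is what `read.table()` gives for the quoted headers in the Data section; the object names `wide` and `long` are only for illustration.

```r
library(dplyr)
library(tidyr)

# Two rows copied from the question's table (assumed column names, see above)
wide <- data.frame(
  Condition     = c("Treated", "Untreated"),
  downsampling  = c(45000000, 45000000),
  E..coli       = c(1, 0.886457145),
  S..cerevisiae = c(0.944968385, 0.108071345)
)

long <- wide %>%
  # Every column except Condition and downsampling goes into name/value pairs
  pivot_longer(-c(Condition, downsampling)) %>%
  # Collapse the double dot and append the lower-cased condition
  mutate(cb = paste(gsub("\\.\\.", "\\.", name), tolower(Condition)))

long$cb
# "E.coli treated" "S.cerevisiae treated" "E.coli untreated" "S.cerevisiae untreated"
```

Each distinct value of `cb` becomes one curve, and because these labels are spelled exactly like the names passed to `scale_color_manual()`, every series picks up its intended color.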

Data:

read.table(text = "Condition   downsampling    'E. coli' 'S. cerevisiae'
 Treated    45000000    1   0.944968385
 Treated    40000000    1   0.932060195
 Treated    32000000    1   0.900323585
 Treated    16000000    0.99999549  0.73127366
 Treated    8000000 0.99993898  0.503170515
 Treated    4000000 0.99892133  0.30287704
 Treated    2000000 0.97810106  0.16861184
 Treated    1000000 0.86656028  0.089200035
Untreated   45000000    0.886457145 0.108071345
Untreated   40000000    0.85728706  0.09729946
Untreated   32000000    0.79344402  0.08072991
Untreated   16000000    0.553520285 0.04306675
Untreated   8000000 0.337149605 0.023035225
Untreated   4000000 0.18756713  0.0119472
Untreated   2000000 0.097686445 0.006072755
Untreated   1000000 0.05007619  0.0031243", 
header = TRUE, stringsAsFactors = TRUE) -> dfuse
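
A quick check of the column names can help explain both the `gsub()` call in the answer and why the attempt in the question may fail. This is a sketch on two rows copied from the data above; `txt` and `dfcheck` are hypothetical names, and if your own `dfuse` was imported differently its names may differ.

```r
# Two rows copied from the data above, with the same quoted headers
txt <- "Condition downsampling 'E. coli' 'S. cerevisiae'
Treated 45000000 1 0.944968385
Untreated 45000000 0.886457145 0.108071345"

# Default check.names = TRUE turns the space into a dot
dfcheck <- read.table(text = txt, header = TRUE)
names(dfcheck)
# "Condition" "downsampling" "E..coli" "S..cerevisiae"

# With check.names = FALSE the headers are kept verbatim
names(read.table(text = txt, header = TRUE, check.names = FALSE))
# "Condition" "downsampling" "E. coli" "S. cerevisiae"

# Names are case-sensitive: the columns used in the question's attempt are absent here
c("Downsampling", "Treatment", "E.coli") %in% names(dfcheck)
# FALSE FALSE FALSE
```

Running `names(dfuse)` and `unique(dfuse$Condition)` on the real data frame is a cheap way to confirm which spellings the plotting code has to use.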

Created on 2024-03-28 with reprex v2.0.2

M-- answered Oct 26 '25 04:10