
R: how to separate zeros in ggplot histogram

I am trying to show the distribution of data for three different methods of measuring forest fuels (FAP, One PIT (onetrans), and Two PIT (twotrans), shown in the facets below). The count on the y-axis is the number of sample points whose estimate falls in the binned value on the x-axis (Total.kg.m2), which is a continuous variable. I don't particularly care how big the binwidth on the x-axis is, but I want only values that are exactly zero to sit above the "0" label. My current graph [1] is misleading because no sample points estimate "0" for the FAP method. Below is some example data and my code. How can I do this more effectively? My dataframe is called "cwd", but I have included a subset at the bottom.

My current graph:

The code for my current graph:

method_names <- c(`FAP` = "FAP", `onetrans` = "PIT - One Transect ", `twotrans` ="PIT - Two Transects")

ggplot(sampleData, aes(Total.kg.m2)) +
  geom_histogram(bins=40, color = "black", fill = "white") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), axis.line = element_line(colour = "black"),
    legend.position = "none", axis.text = element_text(size = 10),
    axis.title = element_text(size = 12)) +
  scale_x_continuous(name=  expression("kg m"^"-2"), breaks =seq(0,16,1)) +
  scale_y_continuous(name = "Count", breaks = seq(0, 80,10), limits= c(0,70)) +
  facet_grid(.~method) +
  facet_wrap(~method, ncol =1, labeller = as_labeller(method_names)) +
  theme(strip.text.x = element_text(size =14),
        strip.background = element_rect(color = "black", fill = "gray"))

I don't think geom_bar gets me what I want. I tried changing the binwidth to 0.05 in geom_histogram, but then the bins are too small. Essentially, I think I'm trying to change my data from continuous numeric to factors, but I'm not sure how to make it work.

Here is some sample data:

sampleData
           Site Treatment Unit Plot Total.Tons.ac Total.kg.m2   method
130    Thinning        CO   10    7     0.4500000   0.1008000 twotrans
351 Shelterwood        CO   12    1     7.2211615   1.6175402 twotrans
88     Thinning        NB    3    7     1.1400000   0.2553600 twotrans
224 Shelterwood        NB    2    3     2.1136105   0.4734487 onetrans
54     Thinning        SB    9   11     1.8857743   0.4224134 onetrans
74     Thinning        SB    1    3     0.8500000   0.1904000 twotrans
328 Shelterwood        DB    7   11     0.8740906   0.1957963 twotrans
341 Shelterwood        CO   10    5     2.4210886   0.5423239 twotrans
266 Shelterwood        WB    9    7     1.0092961   0.2260823 onetrans
405 Shelterwood        WB    9    5     7.0029263   1.5686555      FAP
332 Shelterwood        NB    8    7     2.8059152   0.6285250 twotrans
126    Thinning        SB    9   11     1.4900000   0.3337600 twotrans
295 Shelterwood        NB    2    5     7.6567281   1.7151071 twotrans
406 Shelterwood        WB    9    7     3.0703135   0.6877502      FAP
179    Thinning        FB    6    9    13.2916773   2.9773357      FAP
185    Thinning        FB    7    9     5.3594318   1.2005127      FAP
39     Thinning        FB    7    5     0.0000000   0.0000000 onetrans
187    Thinning        NB    8    1     0.9477477   0.2122955      FAP
10     Thinning        FB    2    7     0.0000000   0.0000000 onetrans
102    Thinning        SB    5   11     0.0000000   0.0000000 twotrans

 dput(sampleData)
structure(list(Site = structure(c(2L, 1L, 2L, 1L, 2L, 2L, 1L, 
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = 
c("Shelterwood", 
"Thinning"), class = "factor"), Treatment = structure(c(1L, 1L, 
4L, 4L, 5L, 5L, 2L, 1L, 6L, 6L, 4L, 5L, 4L, 6L, 3L, 3L, 3L, 4L, 
3L, 5L), .Label = c("CO", "DB", "FB", "NB", "SB", "WB"), class = "factor"), 
Unit = c(10L, 12L, 3L, 2L, 9L, 1L, 7L, 10L, 9L, 9L, 8L, 9L, 
2L, 9L, 6L, 7L, 7L, 8L, 2L, 5L), Plot = c(7L, 1L, 7L, 3L, 
11L, 3L, 11L, 5L, 7L, 5L, 7L, 11L, 5L, 7L, 9L, 9L, 5L, 1L, 
7L, 11L), Total.Tons.ac = c(0.45, 7.221161504, 1.14, 2.113610483, 
1.885774282, 0.85, 0.874090569, 2.421088641, 1.009296069, 
7.002926269, 2.805915201, 1.49, 7.656728085, 3.07031351, 
13.29167729, 5.359431807, 0, 0.947747726, 0, 0), Total.kg.m2 = c(0.1008, 
1.617540177, 0.25536, 0.473448748, 0.422413439, 0.1904, 0.195796287, 
0.542323856, 0.22608232, 1.568655484, 0.628525005, 0.33376, 
1.715107091, 0.687750226, 2.977335712, 1.200512725, 0, 0.212295491, 
0, 0), method = structure(c(3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 
2L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 1L, 2L, 3L), .Label = c("FAP", 
"onetrans", "twotrans"), class = "factor")), .Names = c("Site", 
"Treatment", "Unit", "Plot", "Total.Tons.ac", "Total.kg.m2", 
"method"), row.names = c(130L, 351L, 88L, 224L, 54L, 74L, 328L, 
341L, 266L, 405L, 332L, 126L, 295L, 406L, 179L, 185L, 39L, 187L, 
10L, 102L), class = "data.frame")
KBo asked Nov 29 '25 04:11



1 Answer

Replacing the zeros with a placeholder value below the lowest break (here, -1) puts them in a bin of their own, so they are drawn as a separate bar:

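library(ggplot2)  # scale_x_binned() needs ggplot2 >= 3.3.0

set.seed(0)
# Example data: Poisson counts scaled by 10, so many values are exactly 0
df_with_zeros <- data.frame(x = rpois(100, lambda = 1) * 10)

# Zeros are binned together with positive values in the first bar
ggplot(df_with_zeros, aes(x = x)) +
  geom_bar() +
  scale_x_binned(breaks = seq(0, 40, 10))

# Recode exact zeros to -1 so they land in their own bin below the 0 break
df_without_zeros <- df_with_zeros
df_without_zeros$x[df_without_zeros$x == 0] <- -1
ggplot(df_without_zeros, aes(x = x)) +
  geom_bar() +
  scale_x_binned(breaks = seq(0, 40, 10))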
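
A related approach for a continuous column such as Total.kg.m2 is to bin it with cut() and give exact zeros a factor level of their own, which is roughly the "continuous to factor" idea from the question. The following is only a minimal sketch: cwd_demo is made-up stand-in data, bin_with_zero is a hypothetical helper name, the 0.5 bin width is arbitrary, and values are assumed to be non-negative.

```r
# Hypothetical stand-in data: a few exact zeros plus positive values.
cwd_demo <- data.frame(
  Total.kg.m2 = c(0, 0, 0, 0.1008, 0.1904, 0.4224, 0.6285, 1.2005, 1.6175, 2.9773),
  method = c("onetrans", "onetrans", "twotrans", "twotrans", "twotrans",
             "onetrans", "twotrans", "FAP", "twotrans", "FAP")
)

# Hypothetical helper: label exact zeros "0" and bin everything else with cut().
# Assumes x is non-negative and has no missing values.
bin_with_zero <- function(x, width = 0.5) {
  upper <- max(width, ceiling(max(x) / width) * width)
  breaks <- seq(0, upper, by = width)
  # cut() builds right-closed intervals (a, b], so 0 itself falls outside (0, width]
  binned <- cut(x, breaks = breaks)
  out <- as.character(binned)
  out[x == 0] <- "0"
  # Put the "0" level first so it is drawn at the left of the axis
  factor(out, levels = c("0", levels(binned)))
}

cwd_demo$bin <- bin_with_zero(cwd_demo$Total.kg.m2)

# Counts per method and bin; the FAP row has no entries in the "0" column
print(table(cwd_demo$method, cwd_demo$bin))

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(cwd_demo, aes(x = bin)) +
    geom_bar(color = "black", fill = "white") +
    scale_x_discrete(drop = FALSE) +  # keep empty bins on the axis
    facet_wrap(~method, ncol = 1) +
    labs(x = expression("kg m"^"-2"), y = "Count")
  print(p)
}
```

Because the x-axis is now discrete, the bars are evenly spaced regardless of bin width, and the axis labels are interval names rather than a numeric scale.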
Rob O'Shea answered Nov 30 '25 20:11
