Since I want to compare several distributions, I am creating histograms of the same variable for different years. However, the scale of the y-axis changes, because the highest frequency is different every year. I want to create histograms in which all y-axes display the same range, even if there are no frequencies for that point.
More precisely, in one year the peak of the distribution is 30 counts, in another year it is 35. On the graphs, 30 looks the same as 35 in the other one because the scale of the y-axis changes.
I have tried ylim=(35), but that only leads to the error "invalid value for ylim".
Thanks!
Type ?hist into your console to see the documentation. You'll see ylim is for "the range of ... y values". There is an example given showing how ylim is used, hist(x, freq = FALSE, ylim = c(0, 0.2)). There you can see that you need to give ylim a vector containing the lower limit and upper limit.
With a histogram you almost always want the lower limit to be zero (failure to do so is generally regarded as a statistical sin). So, as pointed out in the comments above, you could set ylim=c(0,35).
Here is a minimal example:
#Sets frequencies with which x and y data will appear
yfreq <- c(1:10, 10:1) #frequencies go up to 10 and down again
xfreq <- c(1:7, rep(7, times=6), 7:1) #frequencies go up to 7 and down again
xdata <- rep(1:length(xfreq), times=xfreq)
ydata <- rep(1:length(yfreq), times=yfreq)
par(mfrow=c(2,2))
hist(ydata, breaks=((0:max(ydata)+1)-0.5), ylim=c(0,10),
main="Hist of y with ylim set")
hist(xdata, breaks=((0:max(xdata)+1)-0.5), ylim=c(0,10),
main="Hist of x with ylim set")
hist(ydata, breaks=((0:max(ydata)+1)-0.5),
main="Hist of y without ylim set")
hist(xdata, breaks=((0:max(xdata)+1)-0.5),
main="Hist of x without ylim set")
So setting ylim appropriately makes the side-by-side comparison of histograms work better.
In practice it's convenient to do this automatically, by finding the highest peak across your datasets and using that in your ylim. How you do that depends on whether you are constructing a histogram of frequencies (which is what R does automatically if your breaks are equidistant, unless you specify otherwise) or of densities, but one way is to create (but not plot) histogram objects and extract either their counts or their density as appropriate.
#Make histogram object but don't draw it
yhist <- hist(ydata, breaks=((0:max(ydata)+1)-0.5), plot=FALSE)
xhist <- hist(xdata, breaks=((0:max(xdata)+1)-0.5), plot=FALSE)
#Find highest count, use it to set ylim of histograms of counts
highestCount <- max(xhist$counts, yhist$counts)
hist(ydata, breaks=((0:max(ydata)+1)-0.5), ylim=c(0,highestCount),
main="Hist of y with automatic ylim")
hist(xdata, breaks=((0:max(xdata)+1)-0.5), ylim=c(0,highestCount),
main="Hist of x with automatic ylim")
#Same but for densities
highestDensity <- max(xhist$density, yhist$density)
hist(ydata, breaks=((0:max(ydata)+1)-0.5),
freq=FALSE, ylim=c(0,highestDensity),
main="Hist of y with automatic ylim")
hist(xdata, breaks=((0:max(xdata)+1)-0.5),
freq=FALSE, ylim=c(0,highestDensity),
main="Hist of x with automatic ylim")
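Since the question is about several years rather than just two datasets, the same idea can be sketched with a list and a loop. This is only an illustration under assumptions: the years list, its simulated values, and the choice of 20 shared bins are made up here and are not from the question. Using the same breaks for every year is my own addition; it keeps the bar widths comparable as well as the heights.

```r
# Hypothetical data: one numeric vector per year (names and values are made up)
set.seed(1)
years <- list(
  "2010" = rnorm(200, mean = 50, sd = 10),
  "2011" = rnorm(200, mean = 55, sd = 6),
  "2012" = rnorm(200, mean = 45, sd = 14)
)

# Shared breaks covering the full range of all years (20 bins, an arbitrary choice)
allValues <- unlist(years)
sharedBreaks <- seq(floor(min(allValues)), ceiling(max(allValues)), length.out = 21)

# Make histogram objects but don't draw them
hists <- lapply(years, hist, breaks = sharedBreaks, plot = FALSE)

# Highest count across all years, used as the common upper limit
highestCount <- max(sapply(hists, function(h) max(h$counts)))

# Draw one panel per year, all with the same y-axis range
par(mfrow = c(1, length(years)))
for (yr in names(years)) {
  hist(years[[yr]], breaks = sharedBreaks, ylim = c(0, highestCount),
       main = paste("Year", yr), xlab = "value")
}
```

For histograms of densities, the same pattern should work by taking the maximum of h$density instead of h$counts and passing freq=FALSE to hist.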