I am having trouble figuring out why ggplot reorders my categorical variables:
library(ggplot2)
xaxis = c('80','90','100')
test = data.frame(x = xaxis, y = c(1,2,3))
ggplot(test, aes(x=x,y=y)) + geom_point()

I found online that it has something to do with the factor levels, and the following code fixes my problem:
xaxis = c('80','90','100')
xaxis = factor(xaxis,levels=xaxis)
test = data.frame(x = xaxis, y = c(1,2,3))
ggplot(test, aes(x=x,y=y)) + geom_point()

But if you go back to the original code:
class(xaxis)
[1] "character"
It's simply a character vector, and I don't see any innate ordering. Can someone explain what is happening here, please? Do I always have to change my x variable into a factor for ggplot to respect my sequence?
sort(xaxis)
[1] "100" "80" "90"
Character vectors are sorted character by character, i.e. sort() doesn't understand the numerical context of the data.
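To make the character-by-character comparison concrete, here is a small sketch using the same xaxis vector. The numeric conversion at the end is only an option if the labels really are numbers, which is an assumption about your data:

```r
xaxis = c('80', '90', '100')

# Strings are compared one character at a time, so "1" vs "8" decides the result
"100" < "80"             # TRUE

# Converting to numeric restores the numerical order
sort(as.numeric(xaxis))  # 80 90 100
```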
ggplot2 treats character variables as discrete, much like factors, and by default a factor's levels are sorted:
factor(xaxis)
[1] 80 90 100
Levels: 100 80 90
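So you don't strictly have to build the factor by hand every time. As a rough sketch of two ways to keep your sequence (assuming ggplot2 is installed, and that the order in which values first appear is the order you want): set the levels with unique(), or leave the column as character and pass the order to scale_x_discrete(limits = ...).

```r
library(ggplot2)

xaxis = c('80', '90', '100')
test = data.frame(x = xaxis, y = c(1, 2, 3))

# Option 1: make the levels follow the order of first appearance
test$x = factor(test$x, levels = unique(test$x))
levels(test$x)   # "80" "90" "100"
ggplot(test, aes(x = x, y = y)) + geom_point()

# Option 2: keep x as it is and set the axis order on the scale instead
test2 = data.frame(x = xaxis, y = c(1, 2, 3))
ggplot(test2, aes(x = x, y = y)) + geom_point() +
  scale_x_discrete(limits = xaxis)
```

Option 1 changes the data, so every later plot of test uses the same order. Option 2 only affects the one plot it is added to.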