I have a data frame with numeric "x" and "y" columns and a third column, "cluster", that holds hexadecimal color strings. Here is an example:
library(ggplot2)
library(scales)
colList = c(scales::hue_pal()(3),"#520090")
dat = data.frame(x=runif(100,0,1),y=runif(100,0,1),cluster=sample(1:4, 100, replace=TRUE))
dat$cluster = factor(dat$cluster)
levels(dat$cluster) = c(colList)
head(dat)
I am trying to create a scatterplot with the "x" and "y" columns mapped to the x and y axes, and with the points colored according to the hexadecimal value stored in the "cluster" column. I have tried the following:
ggplot(dat,aes(x,y)) +
geom_point(aes(colour = cluster), alpha=0.5)
However, this simply uses the four default colors from scales::hue_pal()(4), whereas I changed the last color to a dark purple with hexadecimal value #520090. I would also like to stop the hexadecimal values from appearing as the legend text. I tried, unsuccessfully, to hardcode "Cluster 1", "Cluster 2", ..., "Cluster 4" as the legend text:
ggplot(dat,aes(x,y)) +
geom_point(aes(colour = cluster), alpha=0.5) +
theme(legend.text = element_text("Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"))
Any advice is much appreciated!
To color the dots by cluster identity, the cluster names (i.e., your hex values) need to be mapped to a set of aesthetic values.
Since you want the hex values in the cluster column to represent actual colors, you can use the scale_color_manual function and pass the levels of the cluster column as the values argument. To change the legend labels, set the labels argument to the text you want.
ggplot(dat, aes(x,y)) + geom_point(aes(colour = cluster), alpha=0.5) +
scale_color_manual(values = levels(dat$cluster),
labels = c("Cluster1","Cluster2","Cluster3", "Cluster4"))
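As a variation on the above, here is a self-contained sketch that names the palette vector by factor level, so each color is matched to its level by name rather than by position. The seed, the explicit levels = 1:4, the legend title, and the variable names pal and p are my own assumptions for illustration, not part of the original question.

```r
library(ggplot2)
library(scales)

set.seed(1)  # assumption: fixed seed so the example is reproducible

colList <- c(scales::hue_pal()(3), "#520090")
dat <- data.frame(
  x = runif(100, 0, 1),
  y = runif(100, 0, 1),
  # levels = 1:4 guarantees four levels even if a value is never sampled
  cluster = factor(sample(1:4, 100, replace = TRUE), levels = 1:4)
)
levels(dat$cluster) <- colList

# Name each color by the level it belongs to, so the scale matches by name
pal <- setNames(levels(dat$cluster), levels(dat$cluster))

p <- ggplot(dat, aes(x, y)) +
  geom_point(aes(colour = cluster), alpha = 0.5) +
  scale_colour_manual(
    values = pal,
    labels = paste("Cluster", seq_along(pal)),  # legend text replaces the hex codes
    name   = "Cluster"                          # hypothetical legend title
  )
p
```

Naming the values vector keeps the color assignment correct even if the factor levels are later reordered; note that the labels are still positional, so they would need reordering too.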