I am plotting survival functions with the survival package. Everything works fine, but how do I know which curve is which? And how can I add it to a legend?
  library(survival)
  url <- "http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt"
  Rossi <- read.table(url, header=TRUE)[,c(1:10)]
  km <- survfit(Surv(week, arrest)~race, data=Rossi)
  plot(km, lty=c(1, 2))
How do I know which curve is which?
Using str(km) you can see which elements km contains.
km$strata shows that the two strata have 48 and 10 elements, respectively. This matches km$surv, where the values decline over the first 48 items and then start over and decline again over the last 10 items:
km$surv[1:48]
km$surv[49:58]
So, in addition to the hint on the order given by print(km), with this particular dataset we can also be sure that the first 48 elements belong to race=black.
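The stratum names are also stored on the fitted object itself, so the mapping can be read off without inspecting the survival values. A minimal sketch, assuming the `lung` data that ships with survival as a stand-in for Rossi (so it runs without the download); `km_demo` and `stratum` are illustrative names. With the km from the question, the same calls should report race=black and race=other.

```r
library(survival)

# Stand-in data (assumption): `lung` ships with survival.
# For the question's data, use Surv(week, arrest) ~ race and data = Rossi.
km_demo <- survfit(Surv(time, status) ~ sex, data = lung)

# Named vector: one entry per curve, value = number of time points in that curve.
km_demo$strata

# Expand to one label per element of km_demo$surv and km_demo$time.
stratum <- rep(names(km_demo$strata), times = km_demo$strata)

# Number of elements per stratum, in the order they are stored.
table(stratum)[names(km_demo$strata)]
```

The names in km_demo$strata appear in the same order as the blocks in km_demo$surv, so the first name belongs to the first block of values.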
And how can I add it to a legend?
Unlike other model output, km is not easily transformed into a data.frame. However, we can extract the elements ourselves, build a data.frame from them, and plot that.
First we create a factor that refers to the strata: 48 elements for race=black and 10 for race=other.
race <- as.factor(c(rep("black", 48), rep("other", 10)))
df <- data.frame(surv = km$surv, race = race, time = km$time)
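The counts 48 and 10 do not have to be typed by hand; they can be taken from the strata element of the fit, which keeps the code valid if the data change. A sketch under a stated assumption: `lung` from the survival package stands in for Rossi, and `km_demo` and `df_demo` are illustrative names.

```r
library(survival)

# Stand-in for the Rossi fit (assumption): swap in Surv(week, arrest) ~ race.
km_demo <- survfit(Surv(time, status) ~ sex, data = lung)

df_demo <- data.frame(
  time    = km_demo$time,
  surv    = km_demo$surv,
  # One label per row, repeated according to the size of each stratum.
  stratum = factor(rep(names(km_demo$strata), times = km_demo$strata),
                   levels = names(km_demo$strata))
)

head(df_demo)
```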
Next we can plot it as usual (in my case, using ggplot2).
library(ggplot2)
ggplot(data = df, aes(x = time, y = surv)) + 
    geom_point(aes(colour = race)) + 
    geom_line(aes(colour = race)) +
    theme_bw()
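If the base plot from the question is enough, a legend can also be added to it directly, because plot() draws the curves in the order of the strata. A sketch, again with `lung` as a stand-in for Rossi (assumption); the position "bottomleft" and the name `line_types` are arbitrary choices.

```r
library(survival)

# Stand-in for the Rossi fit (assumption): swap in Surv(week, arrest) ~ race.
km_demo <- survfit(Surv(time, status) ~ sex, data = lung)

line_types <- c(1, 2)
plot(km_demo, lty = line_types)

# Curves are drawn in stratum order, so the names line up with the line types.
legend("bottomleft", legend = names(km_demo$strata), lty = line_types, bty = "n")
```

For the ggplot2 route, geom_step() in place of geom_line() would keep the step shape of a Kaplan-Meier curve.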
