
How does ggplot2 find residuals and fitted values stored in lm objects?

I generally use broom::augment() to create .fitted and .resid columns, which can then be plotted. By accident I passed the non-augmented model object instead and still got a plot that, based on my understanding, should not work. Where is ggplot2 finding .resid and .fitted?

mod <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)
mod$`.resid`
#> NULL
mod$`.fitted`
#> NULL
library(ggplot2)
ggplot(mod, aes(x = .fitted, y = .resid)) + geom_point()

Created on 2024-02-01 with reprex v2.1.0

jtr13 asked Dec 30 '25 21:12



1 Answer

Whatever is passed to ggplot() as data is first passed to the function fortify() before any plotting is done. Take a look at the start of the default ggplot method below:

ggplot2:::ggplot.default
function (data = NULL, mapping = aes(), ..., environment = parent.frame()) 
{
    if (!missing(mapping) && !inherits(mapping, "uneval")) {
        cli::cli_abort(c("{.arg mapping} should be created with {.fn aes}.", 
            x = "You've supplied a {.cls {class(mapping)[1]}} object"))
    }
    data <- fortify(data, ...)
    ....

The function fortify() is generic, and it has a method for linear model objects, which is defined as follows:

ggplot2:::fortify.lm
function (model, data = model$model, ...) 
{
    infl <- stats::influence(model, do.coef = FALSE)
    data$.hat <- infl$hat
    data$.sigma <- infl$sigma
    data$.cooksd <- stats::cooks.distance(model, infl)
    data$.fitted <- stats::predict(model)
    data$.resid <- stats::resid(model)
    data$.stdresid <- stats::rstandard(model, infl)
    data
}
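
To check on your own installation that this is the method being picked up, here is a minimal sketch. It assumes the installed ggplot2 still provides the lm method (the ggplot2 documentation recommends broom over fortify(), so this may change in later releases). The name fortify_lm is just an illustrative variable, not part of ggplot2.

```r
library(ggplot2)

mod <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)

# An lm object has class "lm", so the S3 generic fortify() dispatches
# to the lm method shown above
class(mod)

# Fetch the registered (unexported) method without using :::
# (fortify_lm is a hypothetical name chosen for this example)
fortify_lm <- getS3method("fortify", "lm")

# The generic and the method called directly produce the same columns
identical(names(fortify(mod)), names(fortify_lm(mod)))
```

Note that the default for the data argument is model$model, the model frame stored inside the lm object, which is why no separate data frame has to be supplied.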

Notice how fortify() adds all of these variables to the model frame, e.g.:

head(fortify(lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)))
  Ozone Solar.R Wind Temp       .hat   .sigma      .cooksd   .fitted      .resid   .stdresid
1    41     190  7.4   67 0.04213526 21.26578 1.619281e-03 33.045483   7.9545175  0.38372526
2    36     118  8.0   72 0.02386496 21.28020 1.399323e-05 34.998710   1.0012902  0.04784798
3    12     149 12.6   74 0.01493584 21.24339 1.410343e-03 24.822814 -12.8228139 -0.60997176
4    18     313 11.5   62 0.07184509 21.28037 1.049577e-05 18.475226  -0.4752262 -0.02328889
7    23     299  8.6   65 0.06645355 21.26005 3.644684e-03 32.261431  -9.2614315 -0.45255232
8    19      99 13.8   59 0.04581198 21.12342 1.888167e-02 -6.949919  25.9499188  1.25423141

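This fortified data frame, not the lm object itself, is the dataset used for plotting, which is why .fitted and .resid are found even though mod$.fitted and mod$.resid are NULL.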
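
One way to see this from the plot side is to inspect the data stored in the plot object. This is a sketch under the assumption that the plot's data can be read with p$data, which has been the case in the ggplot2 versions I am aware of. The variable name p is only for illustration.

```r
library(ggplot2)

mod <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)
p <- ggplot(mod, aes(x = .fitted, y = .resid)) + geom_point()

# The plot stores the fortified data frame, not the lm object
# (assumes the data slot is accessible as p$data)
class(p$data)
names(p$data)

# The added columns agree with the usual extractor functions;
# unname() drops the row names carried by the vectors before comparing
all.equal(unname(p$data$.fitted), unname(fitted(mod)))
all.equal(unname(p$data$.resid), unname(resid(mod)))
```

The rows match the observations actually used in the fit, so rows with missing values in airquality (such as 5 and 6) do not appear.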

KU99 answered Jan 02 '26 11:01
