
Polynomial fitting with R using poly vs. I function

I'm trying to understand polynomial fitting with R. From my research on the internet, there seem to be two methods. Assuming I want to fit a cubic curve ax^3 + bx^2 + cx + d to some dataset, I can either use:

lm(y ~ poly(x, 3), data = dataset)

or

lm(y ~ x + I(x^2) + I(x^3), data = dataset)

However, when I tried them in R, I ended up with two different curves with completely different intercepts and coefficients. Is there something about polynomials I'm not getting right here?

James Ngo asked Sep 14 '25 05:09


1 Answer

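This comes down to what the two functions do. By default, poly generates orthogonal polynomials (its columns are orthonormal) rather than the raw powers x, x^2 and x^3. Compare the values of poly(dataset$x, 3) with those of I(dataset$x^3). Your coefficients will be different because the values that end up in the linear model are different: I passes the raw powers through unchanged, while poly replaces them with an orthogonalized basis. Calling poly(x, 3, raw = TRUE) returns the raw powers and gives the same coefficients as the I version.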
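To see the difference between the two sets of columns, here is a minimal sketch. The vector x is made up for illustration and is not the asker's data; the names P, R_raw and G are hypothetical.

```r
# Hypothetical predictor values, used only for illustration.
x <- seq(1, 10, length.out = 50)

# Default poly(): an orthogonal polynomial basis of degree 3.
P <- poly(x, 3)

# The raw powers that y ~ x + I(x^2) + I(x^3) puts into the model.
R_raw <- cbind(x = x, x2 = x^2, x3 = x^3)

# The columns of P are orthonormal, so their cross-product is
# (numerically) the 3 x 3 identity matrix.
G <- crossprod(unclass(P))
print(round(G, 10))

# The raw powers are strongly correlated with one another.
print(round(cor(R_raw), 3))

# Side by side, the numbers fed to lm() are clearly not the same.
print(head(cbind(poly3 = P[, 3], raw3 = x^3)))
```

Because the regressors differ, the coefficients attached to them differ as well, even though both bases describe the same family of cubic curves.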

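As 42 pointed out, your predicted values will be essentially the same, because both sets of columns describe the same cubic model. If a is your first linear model and b is your second, b$fitted.values - a$fitted.values should be very close to 0 at all points, with any difference coming from floating-point error.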
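A quick way to check this is sketched below. The data frame is simulated purely for illustration (the "true" coefficients and noise level are assumptions, not the asker's data); a and b follow the naming used above, and a_raw is a hypothetical third model added for comparison.

```r
# Simulated data, for illustration only.
set.seed(1)
dataset <- data.frame(x = seq(1, 10, length.out = 50))
dataset$y <- 2 * dataset$x^3 - dataset$x^2 + 3 * dataset$x + 5 +
  rnorm(50, sd = 10)

a <- lm(y ~ poly(x, 3), data = dataset)           # orthogonal basis
b <- lm(y ~ x + I(x^2) + I(x^3), data = dataset)  # raw powers

# The coefficients look nothing alike...
print(coef(a))
print(coef(b))

# ...but the fitted curves agree up to floating-point error.
max_diff <- max(abs(b$fitted.values - a$fitted.values))
print(max_diff)

# With raw = TRUE, poly() uses the raw powers, so the coefficients
# should match those of the I() model.
a_raw <- lm(y ~ poly(x, 3, raw = TRUE), data = dataset)
same_coef <- all.equal(unname(coef(a_raw)), unname(coef(b)))
print(same_coef)
```

If the two fitted curves look visibly different in a plot, the cause is more likely in how the predictions are generated or plotted than in the choice between poly and I.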

Daniel V answered Sep 15 '25 20:09
