I am struggling with "out-of-sample" prediction using loess. I get NA values for new x that are outside the original sample. Can I get these predictions?
x <- c(24,36,48,60,84,120,180)
y <- c(3.94,4.03,4.29,4.30,4.63,4.86,5.02)
lo <- loess(y~x)
x.all <- seq(3, 200, 3)
predict(object = lo, newdata = x.all)
I need to model the full yield curve, i.e. interest rates for different maturities.
loess must use the data originally used to fit the loess model to compute the predictions. If you fit the loess model using the data argument, then the data set given by data should not be changed between the fit and the prediction.
The name 'loess' is usually read as local regression (it is also expanded as locally estimated scatterplot smoothing). It fits a weighted least-squares regression around each point, so nearby data count most when estimating our Y variable. It is also known as a variable bandwidth smoother, in that it uses a 'nearest neighbors' method to decide how much data to smooth over.
A higher span smooths out the fit more, while a lower span follows local trends more closely but picks up statistical noise if there is too little data. I use a higher span for smaller sample sizes and a lower span for larger sample sizes.
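As a rough illustration of the effect of span, the sketch below fits the question's data twice, once with the default span (0.75) and once with span = 1. The value 1 and the object names lo.default and lo.smooth are only assumptions chosen for the comparison, not recommended settings.

```r
# Data from the question
x <- c(24, 36, 48, 60, 84, 120, 180)
y <- c(3.94, 4.03, 4.29, 4.30, 4.63, 4.86, 5.02)

# Default span (0.75) versus a larger, smoother span (assumed value: 1)
lo.default <- loess(y ~ x)
lo.smooth  <- loess(y ~ x, span = 1)

# Compare fitted values at the observed maturities side by side
cbind(x, y,
      default = fitted(lo.default),
      smooth  = fitted(lo.smooth))

# Residual standard error of each fit
c(default = lo.default$s, smooth = lo.smooth$s)
```

With only seven observations, the fits can differ noticeably, so it is worth plotting both before settling on a span.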
From the manual page of predict.loess:
When the fit was made using surface = "interpolate" (the default), predict.loess will not extrapolate – so points outside an axis-aligned hypercube enclosing the original data will have missing (NA) predictions and standard errors.
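To see this behaviour on the data from the question, the following sketch lists which new x values come back as NA with the default fit. The helper names outside and pred are introduced here only for illustration.

```r
# Data and prediction grid from the question
x <- c(24, 36, 48, 60, 84, 120, 180)
y <- c(3.94, 4.03, 4.29, 4.30, 4.63, 4.86, 5.02)
x.all <- seq(3, 200, 3)

# Default fit: surface = "interpolate"
lo <- loess(y ~ x)
pred <- predict(lo, newdata = data.frame(x = x.all))

# Flag new points that lie outside the range of the original x
outside <- x.all < min(x) | x.all > max(x)

# New x values for which the prediction is missing
x.all[is.na(pred)]

# Every point outside the original range should be NA
all(is.na(pred[outside]))
```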
If you change the surface parameter to "direct" you can extrapolate values.
For instance, this will work (on a side note: after plotting the prediction, my feeling is that you should increase the span parameter in the loess call a little bit):
lo <- loess(y~x, control=loess.control(surface="direct"))
predict(lo, newdata=x.all)
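Putting the two suggestions together, a minimal sketch might look like the one below. The value span = 1 is only an assumed starting point for "a little bit" larger than the default of 0.75, and the names lo.direct and pred are illustrative. Extrapolated rates far from the observed maturities should be treated with caution.

```r
# Data and prediction grid from the question
x <- c(24, 36, 48, 60, 84, 120, 180)
y <- c(3.94, 4.03, 4.29, 4.30, 4.63, 4.86, 5.02)
x.all <- seq(3, 200, 3)

# surface = "direct" evaluates the local fit at each new point,
# which allows prediction outside the range of the original x.
# span = 1 is an assumed value, larger than the default of 0.75.
lo.direct <- loess(y ~ x, span = 1,
                   control = loess.control(surface = "direct"))
pred <- predict(lo.direct, newdata = data.frame(x = x.all))

# Check that no NA values remain, including for x < 24 and x > 180
anyNA(pred)

# Inspect the first few extrapolated points
head(cbind(x = x.all, rate = pred))
```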