In the documentation of the function cv.glmnet(), it is given that:
lambda.1se :
largest value of lambda such that error is within 1 standard error of the minimum.
Which means that lambda.1se gives the lambda whose error (cvm) is one standard error away from the minimum error.
So, while trying to check this fact:
There is a data set Boston in the library MASS. I performed a cross validation, using the lasso:
x = model.matrix(crim~.-1,data=Boston)#-1 for removing the intercept column
y = Boston$crim
cv.lasso = cv.glmnet(x,y,type.measure = "mse",alpha=1)
And the value of cv.lasso$lambda.min comes out to be:
> cv.lasso$lambda.min
[1] 0.05630926
And, the value of cv.lasso$lambda.1se is:
> cv.lasso$lambda.1se
[1] 3.375651
Now, look at this:
> std(cv.lasso$cvm)
[1] 0.7177808
Here, std is a function that returns the standard error of the values passed to it [1].
And the minimum value of cvm can be found as:
> cv.lasso$cvm[cv.lasso$lambda==cv.lasso$lambda.min]
[1] 42.95009
So, we add the standard error to the value of cvm and we get:
> 42.95009+0.7177808
[1] 43.66787
Even though there is no lambda value whose cvm is exactly this, we can get an idea from the existing data:

Which means lambda.1se should lie between 0.4784899 and 0.4359821. But that's absolutely not the case, so my gut says I'm making a blunder somewhere. Can you help me spot it?
[1] Definition of std:
std<-function(x)
sd(x)/sqrt(length(x))
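As a quick sanity check, this helper behaves as expected on a toy vector (the numbers here are purely illustrative):

```r
std <- function(x) sd(x) / sqrt(length(x))

# sd(1:4) is about 1.2910 and n = 4, so the standard error is sd/2
std(c(1, 2, 3, 4))  # ~0.6455
```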
I'll add a seed so one can replicate the results below:
library(glmnet)
library(MASS)
data("Boston")
x = model.matrix(crim~.-1,data=Boston)#-1 for removing the intercept column
y = Boston$crim
set.seed(100)
cv.lasso = cv.glmnet(x,y,type.measure = "mse",alpha=1)
The minimum cross-validated MSE is min(cv.lasso$cvm) = 43.51256. The corresponding lambda is cv.lasso$lambda.min = 0.01843874. The lambda.1se is cv.lasso$lambda.1se = 3.375651. This corresponds to a cross-validated MSE of
cv.lasso$cvm[which(cv.lasso$lambda == cv.lasso$lambda.1se)] = 57.5393
We can access the cross-validated standard errors directly from GLMNET's output as follows:
cv.lasso$cvsd[which(cv.lasso$lambda == cv.lasso$lambda.min)] = 15.40236
So the cross-validated MSE one standard error away is
43.51256 + 15.40236 = 58.91492
This is just slightly higher than the cross-validated MSE at lambda.1se above (i.e. 57.5393). If we look at the cross-validated MSE at the lambda just before the lambda.1se, it is:
cv.lasso$cvm[which(cv.lasso$lambda == cv.lasso$lambda.1se)-1] = 59.89079
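To make the reconciliation concrete: lambda.1se is the largest lambda whose cvm does not exceed min(cvm) plus the cvsd at the minimum, which, as far as I can tell, is exactly the rule glmnet applies internally. A sketch of that check, re-fitting the same model as above:

```r
library(glmnet)
library(MASS)

x <- model.matrix(crim ~ . - 1, data = Boston)  # -1 removes the intercept column
y <- Boston$crim
set.seed(100)
cv.lasso <- cv.glmnet(x, y, type.measure = "mse", alpha = 1)

# CV error at the minimum, plus the standard error AT that same lambda
threshold <- min(cv.lasso$cvm) + cv.lasso$cvsd[which.min(cv.lasso$cvm)]

# largest lambda whose CV error stays within one SE of the minimum
max(cv.lasso$lambda[cv.lasso$cvm <= threshold])  # should equal cv.lasso$lambda.1se
```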
So now that we have reconciled GLMNET's output, let me explain why you are not getting the same thing with your calculation:

cv.lasso$cvm contains the cross-validated mean MSE for each value of lambda. At lambda.min, we have 10 folds, so we fit 10 models and obtain 10 out-of-sample MSEs. The mean of these 10 MSEs is given by cv.lasso$cvm[which(cv.lasso$lambda == cv.lasso$lambda.min)], and their standard error is given by cv.lasso$cvsd[which(cv.lasso$lambda == cv.lasso$lambda.min)]. What the GLMNET output does not give us are the 10 individual MSEs at lambda.min; if we had those, we could replicate the standard error using your formula above. Your calculation instead takes the standard error across the cvm values for different lambdas, which is a different quantity altogether.

Let me know if this helps.
EDIT: Let's do an example whereby we define in advance three folds
set.seed(100)
folds = sample(1:3, nrow(x), replace = T)
cv.lasso = cv.glmnet(x,y,type.measure = "mse",alpha=1, keep =T, foldid = folds)
Note that
> min(cv.lasso$cvm)
[1] 42.76584
> cv.lasso$cvsd[which.min(cv.lasso$cvm)]
[1] 17.89725
(These differ from the earlier example above because we've defined our own folds)
Note also that I have an additional parameter keep = T in the cv.glmnet call. This returns the fold predictions for each lambda. You can extract them for the optimal lambda by doing:
cv.lasso$fit.preval[,which.min(cv.lasso$cvm)]
Before we proceed, let's create a data frame with the response, the fold predictions, and the corresponding folds:
library(data.table)
OOSPred = data.table(y = y,
predictions = cv.lasso$fit.preval[,which.min(cv.lasso$cvm)],
folds = folds)
Here is a preview of the first 10 rows:
> head(OOSPred, 10)
y predictions folds
1: 0.00632 -0.7477977 1
2: 0.02731 -1.3823830 1
3: 0.02729 -3.4826143 2
4: 0.03237 -4.4419795 1
5: 0.06905 -3.4373021 2
6: 0.02985 -2.5256505 2
7: 0.08829 0.7343478 3
8: 0.14455 1.1262462 2
9: 0.21124 4.0507847 2
10: 0.17004 0.5859587 1
For example, for the cases where folds = 1, a model was built on folds #2 & #3 and then predictions were obtained for the observations in fold #1. We compute the MSE by fold now:
OOSPredSum = OOSPred[, list(MSE = mean((y - predictions)^2)), by = folds]
folds MSE
1: 1 27.51469
2: 2 75.72847
3: 3 19.93480
Finally, we return the mean MSE and standard error of the MSE across the folds
> OOSPredSum[, list("Mean MSE" = mean(MSE), "Standard Error" = sd(MSE)/sqrt(3))]
Mean MSE Standard Error
1: 41.05932 17.47213
GLMNET may be computing a weighted mean and standard error (weighted by the number of observations in each fold), which is why the figures above are close but do not match exactly.
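If you want to play with that hypothesis, here is a speculative sketch of a size-weighted summary. The per-fold MSEs are the ones computed above; the fold sizes (170/168/168, summing to Boston's 506 rows) are made up for illustration, and this is not GLMNET's documented behaviour:

```r
library(data.table)

# hypothetical per-fold out-of-sample MSEs and fold sizes
foldSum <- data.table(folds = 1:3,
                      MSE = c(27.51469, 75.72847, 19.93480),
                      n = c(170, 168, 168))

# weight each fold's MSE by the number of observations in that fold
wmean <- foldSum[, weighted.mean(MSE, n)]

# one of several plausible weighted standard-error definitions
wse <- foldSum[, sqrt(sum(n * (MSE - wmean)^2) / sum(n)) / sqrt(.N)]

c(wmean, wse)
```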
I think the procedure is:
Say we have λmin and its standard error se(λmin) (i.e. cv.lasso$cvsd at lambda.min). Now, λ1se is defined as the "largest value of lambda such that error is within 1 standard error of the minimum", so the condition for λ1se is:
λ ∈ [λmin - se(λmin), λmin + se(λmin)]
Then λ1se = max(λ) over all λ that fulfill this condition.
I can show you an example:
# endpoint is a binary response and lambdas_to_try a lambda grid from my own data
lasso_cv <- cv.glmnet(x = x, y = endpoint, alpha = 1, lambda = lambdas_to_try,
                      standardize = TRUE, nfolds = 10, type.measure = "auc",
                      family = "binomial")
Note that λmin is:
lasso_cv$lambda.min
[1] 0.007742637
And se(λmin) is:
serrorlmin <- lasso_cv$cvsd[which(lasso_cv$lambda == lasso_cv$lambda.min)]
serrorlmin
[1] 0.01058009
Then, the range in which λ1se is chosen is:
rang <- c(lasso_cv$lambda.min - serrorlmin, lasso_cv$lambda.min + serrorlmin)
rang
[1] -0.002837457  0.018322731
And to find it:
max(lasso_cv$lambda[lasso_cv$lambda>=rang[1] & lasso_cv$lambda<=rang[2]])
[1] 0.01629751
And this value matches λ1se!
lasso_cv$lambda.1se # 0.01629751
I hope it helps!