Using python and scikit-learn, I'd like to do a grid search. But some of my models end up being empty. How can I make the grid search function to ignore those models?
I guess I can have a scoring function which returns 0 if the models is empty, but I'm not sure how.
predictor = sklearn.svm.LinearSVC(penalty='l1', dual=False, class_weight='auto')
param_dist = {'C': pow(2.0, np.arange(-10, 11))}
learner = sklearn.grid_search.GridSearchCV(estimator=predictor,
param_grid=param_dist,
n_jobs=self.n_jobs, cv=5,
verbose=0)
learner.fit(X, y)
My data's in a way that this learner
object will choose a C
corresponding to an empty model. Any idea how I can make sure the model's not empty?
EDIT: by an "empty model" I mean a model that has selected 0 features to use. Specially with an l1
regularized model, this can easily happen. So in this case, if the C
in the SVM is small enough, the optimization problem will find the 0 vector as the optimal solution for the coefficients. Therefore predictor.coef_
will be a vector of 0
s.
Try to implement custom scorer, something similar to:
import numpy as np
def scorer_(estimator, X, y):
# Your criterion here
if np.allclose(estimator.coef_, np.zeros_like(estimator.coef_)):
return 0
else:
return estimator.score(X, y)
learner = sklearn.grid_search.GridSearchCV(...
scoring=scorer_)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With