
How to tune parameters of nested Pipelines by GridSearchCV in scikit-learn?

Tags: scikit-learn

Is it possible to tune parameters of nested Pipelines in scikit-learn? E.g.:

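from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Inner pipeline: chi-squared feature selection, then a linear SVM
svm = Pipeline([
    ('chi2', SelectKBest(chi2)),
    ('cls', LinearSVC(class_weight='balanced'))  # 'auto' in older releases
])

# Outer pipeline: TF-IDF features, then one-vs-rest over the inner pipeline
classifier = Pipeline([
    ('vectorizer', TfidfVectorizer()),
    ('ova_svm', OneVsRestClassifier(svm))
])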

parameters = ?

GridSearchCV(classifier, parameters)

If it's not possible to do this directly, what could be a workaround?

lizarisk asked May 08 '13 09:05


1 Answer

scikit-learn has a double-underscore notation for this: a parameter of a pipeline step is addressed as <step name>__<parameter name>. It works recursively and extends to OneVsRestClassifier, with the caveat that the wrapped estimator must be addressed explicitly as __estimator:

parameters = {'ova_svm__estimator__cls__C': [1, 10, 100],
              'ova_svm__estimator__chi2__k': [200, 500, 1000]}
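
If in doubt about a name, the keys of classifier.get_params() list every parameter that can be addressed this way, nested ones included. Below is a minimal end-to-end sketch, assuming a recent scikit-learn (GridSearchCV imported from sklearn.model_selection, class_weight='balanced' in place of the older 'auto'). The toy corpus, the labels, and the small grid values are made up for illustration and are not from the question:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Same structure as in the question: a pipeline nested inside
# OneVsRestClassifier, which is itself a step of an outer pipeline.
svm = Pipeline([
    ("chi2", SelectKBest(chi2)),
    ("cls", LinearSVC(class_weight="balanced")),
])
classifier = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("ova_svm", OneVsRestClassifier(svm)),
])

# All addressable parameter names, including the nested ones.
nested_names = sorted(classifier.get_params().keys())
print([name for name in nested_names if name.startswith("ova_svm__estimator__c")])

# Path: outer step -> "estimator" (OneVsRestClassifier's own parameter)
#       -> inner step -> parameter of that step.
parameters = {
    "ova_svm__estimator__cls__C": [1, 10],
    "ova_svm__estimator__chi2__k": [2, 4],  # kept tiny for the toy vocabulary
}

# Hypothetical toy data: three classes, four short documents each.
docs = [
    "the cat chased the mouse", "a cat purred on the sofa",
    "my cat likes warm milk", "the kitten and the cat slept",
    "the dog barked at the postman", "a dog fetched the stick",
    "my dog likes long walks", "the puppy and the dog played",
    "the bird sang in the tree", "a bird built a nest",
    "my bird likes sunflower seeds", "the parrot and the bird talked",
]
labels = ["cat"] * 4 + ["dog"] * 4 + ["bird"] * 4

search = GridSearchCV(classifier, parameters, cv=2)
search.fit(docs, labels)
print(search.best_params_)
```

A misspelled key (for example a single underscore in chi2_k) makes set_params raise a ValueError when the search is fitted, so printing the valid names first is a cheap sanity check.
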
Fred Foo answered Sep 26 '22 02:09