I am trying to use scikit-learn to predict on new data with a pipeline object I trained back in February. Since Friday, February 28th, the predict function no longer works for my pipeline object and fails with this error:
>>> df = pd.read_csv('test_df_for_example.csv')
>>> mdl = joblib.load('split_0_model.pkl')
>>> mdl.predict(df)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/utils/metaestimators.py", line 116, in <lambda>
out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/pipeline.py", line 419, in predict
Xt = transform.transform(Xt)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/compose/_column_transformer.py", line 587, in transform
self._validate_features(X.shape[1], X_feature_names)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/compose/_column_transformer.py", line 411, in _validate_features
if ((self._feature_names_in is None or feature_names is None)
AttributeError: 'ColumnTransformer' object has no attribute '_feature_names_in'
I am using Microsoft Azure's virtual machines to do the predicting (although I ran the code above on my local computer), so controlling module versions is difficult, and most of the time I am forced to use the latest versions of packages. I believe this error comes from scikit-learn's new version 0.22.2.post1, which I am using.
I have an example CSV with testing data here
The model file pickled with joblib here
And code to reproduce the error here
And yaml environment file here
Is there any way I can upgrade my model so that this error does not occur?
Thanks! Kristine
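I recommend pinning package versions in your YAML environment file, especially given how quickly releases come out in the azureml space. Pickled scikit-learn models are not guaranteed to load correctly under a different scikit-learn version than the one that saved them.
Downgrading scikit-learn to the last stable build for your use case may be the solution; the alternative is upgrading the rest of your code base to accommodate the new scikit-learn version.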
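Ex.
- pip:
  # example versions; the PyPI package name is scikit-learn, not sklearn
  - scikit-learn==0.20.0
  - azureml-sdk==1.0.85
  # ...and so on for your other packages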
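To catch this kind of mismatch early instead of hitting an AttributeError deep inside predict, one option is to store the library version next to the model when you save it and compare it when you load. Below is a minimal sketch of that idea, not anything built into scikit-learn or azureml. The helper names (installed_version, save_with_version, load_checked) are hypothetical, and it uses the standard-library pickle module so it runs on its own; I assume you could swap in joblib.dump and joblib.load the same way.

```python
import pickle
from importlib import metadata


def installed_version(package="scikit-learn"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def save_with_version(model, path, version=None):
    """Pickle the model together with the library version it was trained with."""
    if version is None:
        version = installed_version()
    with open(path, "wb") as f:
        pickle.dump({"model": model, "version": version}, f)


def load_checked(path, version=None):
    """Load the model, raising if the current library version differs from the saved one."""
    if version is None:
        version = installed_version()
    with open(path, "rb") as f:
        bundle = pickle.load(f)
    if bundle["version"] != version:
        raise RuntimeError(
            f"model was saved with version {bundle['version']}, "
            f"but version {version} is installed"
        )
    return bundle["model"]
```

With this in place, a version bump on the VM produces a clear message telling you which version to pin (or that the model needs retraining), rather than a failure inside the pipeline. Only load pickle files you trust, since unpickling can execute arbitrary code.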