
Pipeline error: "AttributeError: 'ColumnTransformer' object has no attribute '_feature_names_in'"

I am trying to use scikit-learn to predict on new data using a pipeline object I trained back in February. Since Friday, February 28th, the predict function no longer works for my pipeline object and fails with this error:

>>> df = pd.read_csv('test_df_for_example.csv')
>>> mdl = joblib.load('split_0_model.pkl')
>>> mdl.predict(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/utils/metaestimators.py", line 116, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/pipeline.py", line 419, in predict
    Xt = transform.transform(Xt)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/compose/_column_transformer.py", line 587, in transform
    self._validate_features(X.shape[1], X_feature_names)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/compose/_column_transformer.py", line 411, in _validate_features
    if ((self._feature_names_in is None or feature_names is None)
AttributeError: 'ColumnTransformer' object has no attribute '_feature_names_in'

I am using Microsoft Azure's virtual machines to do this predicting (although I ran the code above on my local computer), so controlling module versions is difficult, and most of the time I am forced to use the latest versions of packages. I believe this error comes from scikit-learn's new version 0.22.2.post1, which I am using.

I have an example CSV with testing data here

The model file pickled with joblib here

And code to reproduce the error here

And yaml environment file here

Is there any way I can upgrade my model so that this error does not occur?

Thanks! Kristine

K List asked Oct 24 '25 15:10


1 Answer

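I recommend pinning package versions in your environment YAML, especially given how quickly releases move in the azureml space.

Your traceback suggests the pickled ColumnTransformer was created by an older scikit-learn release that never set `_feature_names_in`, while the installed release expects it. Pickled scikit-learn models are not guaranteed to load correctly under a different version than the one that saved them. So either downgrade scikit-learn to the version the model was trained with, or update the rest of your code base to the new scikit-learn version, which in practice likely means retraining and re-saving the model under it.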

Ex.

 - pip:
    - scikit-learn==0.20.0
    - azureml-sdk==1.0.85
    # ...and so on for your other dependencies
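
If it helps, here is a minimal sketch of a guard you could run before `joblib.load`, so a version mismatch shows up as a clear message instead of an AttributeError deep inside `predict`. It assumes Python 3.8+ (for `importlib.metadata`). The helper names `installed_version` and `version_mismatch` are my own, and `TRAINED_WITH` is a hypothetical value: use whatever version the model was actually pickled with, not necessarily the example pin above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_mismatch(installed, expected):
    """Return a short description of the mismatch, or None if the versions agree."""
    if installed is None:
        return "package is not installed"
    if installed != expected:
        return f"installed {installed}, but the model was pickled with {expected}"
    return None


# Hypothetical: the scikit-learn version recorded when the model was trained.
TRAINED_WITH = "0.20.0"

problem = version_mismatch(installed_version("scikit-learn"), TRAINED_WITH)
if problem:
    print(f"scikit-learn: {problem}; loading the pickle may fail or misbehave")
```

Storing the training-time version next to the .pkl file (for example in a small text file) would give you something concrete to compare against and to pin in the YAML.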

Nema Sobhani answered Oct 26 '25 05:10


