
Can you passthrough a specific column in a scikit-learn ColumnTransformer?

I have a fairly large DataFrame (300 columns) and I'm using sklearn to encode/scale some fields. I like that I can choose the specific columns I want and have it drop the rest. My problem is that I now have two columns in my large DataFrame that contain numpy arrays, and I would like those passed through while the other columns I don't list in the sklearn pipeline are dropped.

For example:

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

# One-hot encode the column at index 1; pass every other column through unchanged
ct = ColumnTransformer([("Country", OneHotEncoder(), [1])], remainder='passthrough')

This would one-hot encode the country and pass through everything else. What if I have a column called "numpy_array"? How can I get only that one passed through?

Lostsoul asked Nov 16 '25 05:11



1 Answer

What if I have a column called "numpy_array"? How can I get only that one passed through?

from sklearn.compose import ColumnTransformer

ct = ColumnTransformer(
    transformers=[
        ('np_array_transform', 'passthrough', ['numpy_array']),
    ],
    remainder='drop',
)
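
To tie this back to the question, here is a rough sketch that combines the one-hot encoding of a country column with the passthrough of "numpy_array" while dropping everything else. The toy DataFrame, the "Country" and "unused" column names, and the choice of sparse_threshold=0 (to get a dense result that can hold array objects) are my own assumptions for illustration, not something taken from the original data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: only the "numpy_array" column name comes from the question.
df = pd.DataFrame({
    "Country": ["FR", "DE", "FR"],
    "numpy_array": [np.array([1, 2]), np.array([3, 4]), np.array([5, 6])],
    "unused": [10, 20, 30],
})

ct = ColumnTransformer(
    transformers=[
        # Encode the country column
        ("country", OneHotEncoder(), ["Country"]),
        # Forward this column untouched
        ("np_array_passthrough", "passthrough", ["numpy_array"]),
    ],
    remainder="drop",      # every column not listed above is dropped
    sparse_threshold=0,    # assumption: force a dense result so object values can be stacked
)

out = ct.fit_transform(df)
print(out.shape)  # two one-hot columns plus the passed-through column
```

Because the passed-through column holds array objects, the stacked result should come back as an object-dtype array, so you may want to split the numeric part off again before feeding it to an estimator.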
Sanjar Adilov answered Nov 17 '25 17:11



