In general, we use df.drop('column_name', axis=1) to remove a column from a DataFrame.
I want to add this operation as a transformer step in a Pipeline.
Example:
numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler(with_mean=False))
])
How can I do it?
You can write a custom transformer like this:
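from sklearn.base import BaseEstimator, TransformerMixin

class columnDropperTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        # Nothing to learn: dropping columns is stateless
        return self

    def transform(self, X, y=None):
        # X must be a pandas DataFrame
        return X.drop(self.columns, axis=1)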
And use it in a pipeline:
import pandas as pd
from sklearn.pipeline import Pipeline

# sample dataframe
df = pd.DataFrame({
    "col_1": ["a", "b", "c", "d"],
    "col_2": ["e", "f", "g", "h"],
    "col_3": [1, 2, 3, 4],
    "col_4": [5, 6, 7, 8]
})

# your pipeline
pipeline = Pipeline([
    ("columnDropper", columnDropperTransformer(['col_2', 'col_3']))
])

# apply the pipeline to the dataframe
pipeline.fit_transform(df)
Output:
  col_1  col_4
0     a      5
1     b      6
2     c      7
3     d      8
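To connect this back to the numerical_transformer from the question, here is a minimal sketch of the dropper chained in front of the imputer and scaler. The class name ColumnDropper, the numeric_df data, and the "unused" column are made up for illustration; only the imputer and scaler steps come from the question.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ColumnDropper(TransformerMixin, BaseEstimator):
    """Hypothetical variant of the transformer above: drops the named columns."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        # Nothing to learn: dropping columns is stateless
        return self

    def transform(self, X, y=None):
        # Assumes X is a pandas DataFrame
        return X.drop(columns=self.columns)


# Made-up numeric data with one missing value (illustration only)
numeric_df = pd.DataFrame({
    "col_3": [1.0, 2.0, None, 4.0],
    "col_4": [5.0, 6.0, 7.0, 8.0],
    "unused": [0.0, 0.0, 0.0, 0.0],
})

# The dropper goes first, while the data is still a DataFrame
numerical_transformer = Pipeline(steps=[
    ("dropper", ColumnDropper(["unused"])),
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler(with_mean=False)),
])

result = numerical_transformer.fit_transform(numeric_df)
print(result.shape)
```

The order matters here: by default SimpleImputer and StandardScaler return NumPy arrays, which have no drop method, so a DataFrame-based dropper like this one should run before them.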