I have a Spark dataframe with 3k-4k columns, and I'd like to drop the columns whose names meet some variable criterion, e.g. WHERE ColumnName LIKE 'foo'.
To get the column names, use df.columns; drop() supports dropping many columns in one call. The code below combines the two and does what you need:
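# Predicate: True for column names that contain 'foo'
condition = lambda name: 'foo' in name
# drop() accepts several column names, so unpack the matches into it
new_df = df.drop(*filter(condition, df.columns))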
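If the criterion varies (a prefix, a suffix, a case-insensitive match), one option is to move the name selection into a small helper driven by a regular expression. The sketch below is only an illustration: columns_matching and example_columns are hypothetical names, not part of PySpark, and the plain list stands in for df.columns so the helper can be tried without a Spark session.

```python
import re

def columns_matching(columns, pattern, ignore_case=False):
    """Return the names in `columns` where the regex `pattern` matches anywhere.

    Hypothetical helper for illustration; it is not a PySpark function.
    """
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [name for name in columns if regex.search(name)]

# Stand-in for df.columns (assumed example data).
example_columns = ["id", "foo_1", "bar", "my_foo", "Food"]

# Case-sensitive substring match, the same criterion as `'foo' in name`.
to_drop = columns_matching(example_columns, r"foo")

# With a real DataFrame the result would be unpacked into drop():
#   new_df = df.drop(*columns_matching(df.columns, r"foo"))
```

Anchoring the pattern changes the criterion without touching the rest of the code: r"^foo" keeps only names that start with foo, and r"foo$" only names that end with it.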