How can I drop the row values in Pyspark based on the value of row number/row index value?
I am new to Pyspark (and coding) -- I have tried coding something but it is not working.
You can't drop specific cols, but you can just filter the ones you want, by using filter or its alias, where.
Imagine you want "to drop" the rows where the age of a person is lower than 3. You can just keep the opposite rows, like this:
df.filter(df.age >= 3)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With