I have a dataframe that may have comment lines at the bottom of it. For other reasons, I cannot pass the comment character when reading the file to build the dataframe. Here is an example of what I would have:
df = pd.read_csv(file,header=None)
df
0 1
0 132605 1
1 132750 2
2 # total: 100000
Is there a way to remove all rows that start with a comment character in-place -- that is, without having to re-load the data frame?
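Using str.startswith, keep only the rows whose first column does not start with '#'. Note that this builds a new dataframe rather than modifying df in place: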
newdf=df[df.iloc[:,0].str.startswith('#').ne(True)]
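For reference, here is a minimal self-contained sketch of the same boolean-mask idea. The sample data in `raw` is an assumption reconstructed from the question, and `na=False` is used here in place of `.ne(True)` so that missing values count as "not a comment"; it is one way to write the filter, not the only one.

```python
import io
import pandas as pd

# Assumed sample data modeled on the question; the last line is a comment.
raw = "132605,1\n132750,2\n# total: 100000\n"
df = pd.read_csv(io.StringIO(raw), header=None)

# True for rows whose first column starts with '#'.
# na=False makes missing values evaluate to False instead of NaN.
mask = df[0].str.startswith("#", na=False)

# Keep the rows that are not comments; df itself is left unchanged.
newdf = df[~mask]
print(newdf)
```

If the original is no longer needed, assigning the result back to the same name (`df = df[~mask]`) avoids re-reading the file, although strictly speaking it rebinds the name rather than mutating the object.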
Dataframe:
>>> df
0 1
0 132605 1
1 132750 2
2 # total: 100000
3 foo bar
Dropping in-place:
>>> to_drop = df[0].str.startswith('#').where(lambda s: s).dropna().index
>>> df.drop(to_drop, inplace=True)
>>> df
0 1
0 132605 1
1 132750 2
3 foo bar
Assumptions: you want to find rows where the column labeled 0 starts with '#'. Otherwise, adjust accordingly.
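If the column or the comment character differs, the in-place drop can be wrapped in a small helper. This is only a sketch: `drop_comment_rows` and its parameters are hypothetical names introduced for illustration, the sample data is assumed from the example above, and `astype(str)` is added so the check also works if the column turns out to be numeric.

```python
import io
import pandas as pd

def drop_comment_rows(df, column=0, comment="#"):
    """Drop, in place, the rows whose `column` value starts with `comment`.

    Hypothetical helper for illustration only.
    """
    # Cast to str so numeric or missing values do not break the .str accessor.
    mask = df[column].astype(str).str.startswith(comment)
    # mask[mask] keeps only the True entries; their index labels are the rows to drop.
    df.drop(mask[mask].index, inplace=True)

# Assumed sample data matching the dataframe shown above.
raw = "132605,1\n132750,2\n# total: 100000\nfoo,bar\n"
df = pd.read_csv(io.StringIO(raw), header=None)

drop_comment_rows(df)
print(df)
```

The original index labels are preserved, so the surviving rows keep labels 0, 1 and 3; call `df.reset_index(drop=True, inplace=True)` afterwards if a contiguous index is wanted.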