pandas subtract rows in dataframe according to a few columns

Question

I have the following data, which I load into a DataFrame:

data = [
 {'col1': 11, 'col2': 111, 'col3': 1111},
 {'col1': 22, 'col2': 222, 'col3': 2222},
 {'col1': 33, 'col2': 333, 'col3': 3333},
 {'col1': 44, 'col2': 444, 'col3': 4444}
]

and the following list:

lst = [(11, 111), (22, 222), (99, 999)]

I would like to keep only the rows of my data whose (col1, col2) pair does not appear in lst.

The result for the example above would be:

[
 {'col1': 33, 'col2': 333, 'col3': 3333},
 {'col1': 44, 'col2': 444, 'col3': 4444}
]

How can I achieve that?

import pandas as pd

df = pd.DataFrame(data)

list_df = pd.DataFrame(lst)

# command like ??
# df.subtract(list_df)

jezrael · Accepted Answer

If you need to test the columns as pairs, build a MultiIndex from both columns, compare it against the list with Index.isin, and invert the mask with ~ for boolean indexing:

df = df[~df.set_index(['col1','col2']).index.isin(lst)]
print (df)
   col1  col2  col3
2    33   333  3333
3    44   444  4444
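
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It uses MultiIndex.from_frame instead of set_index (an equivalent way to get the pairs), and the names pairs, keep, filtered and records are only illustrative. The last step converts the result back to a list of dicts, on the assumption that this is the output shape the question is after.

```python
import pandas as pd

data = [
    {'col1': 11, 'col2': 111, 'col3': 1111},
    {'col1': 22, 'col2': 222, 'col3': 2222},
    {'col1': 33, 'col2': 333, 'col3': 3333},
    {'col1': 44, 'col2': 444, 'col3': 4444},
]
lst = [(11, 111), (22, 222), (99, 999)]

df = pd.DataFrame(data)

# Build a MultiIndex of (col1, col2) pairs and test each pair against lst.
pairs = pd.MultiIndex.from_frame(df[['col1', 'col2']])
keep = ~pairs.isin(lst)  # boolean array, one entry per row of df

filtered = df[keep]

# Convert back to a list of dicts, the shape shown in the question.
records = filtered.to_dict('records')
print(records)
```

Because the mask is a plain boolean array rather than a Series, it is applied by position, so this version does not depend on what index df has.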

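Or use a left join via merge with the indicator parameter:

# assumes df is the original frame with its default RangeIndex,
# because the mask produced by merge is indexed 0..n-1
mask = df.merge(list_df,
                left_on=['col1','col2'],
                right_on=[0,1],
                indicator=True,
                how='left')['_merge'].eq('left_only')
df = df[mask]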
print (df)
   col1  col2  col3
2    33   333  3333
3    44   444  4444
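
A variant of the merge approach that may be easier to read is sketched below. It assumes you are free to name the columns of the lookup frame col1 and col2, so that a plain on= can be used; the names merged and result are illustrative, and the drop_duplicates call is a precaution added here in case lst contains repeated pairs.

```python
import pandas as pd

data = [
    {'col1': 11, 'col2': 111, 'col3': 1111},
    {'col1': 22, 'col2': 222, 'col3': 2222},
    {'col1': 33, 'col2': 333, 'col3': 3333},
    {'col1': 44, 'col2': 444, 'col3': 4444},
]
lst = [(11, 111), (22, 222), (99, 999)]

df = pd.DataFrame(data)

# Name the lookup columns like the ones in df so that `on=` can be used.
# drop_duplicates() keeps repeated pairs from multiplying rows in the join.
list_df = pd.DataFrame(lst, columns=['col1', 'col2']).drop_duplicates()

merged = df.merge(list_df, on=['col1', 'col2'], how='left', indicator=True)

# 'left_only' marks rows of df that found no matching pair in list_df.
result = merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(result)
```

Note that merge returns a frame with a fresh integer index, so the row labels in result are positions in the merged frame and not necessarily the labels of the original df.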

alparslan mimaroğlu · Answer

You can build a tuple from col1 and col2 for each row and check whether that tuple is in lst. Then drop the rows where the check is True.

# drop() returns a new DataFrame; assign it to keep the result
df = df.drop(df.apply(lambda x: (x['col1'], x['col2']), axis=1)
               .isin(lst)
               .loc[lambda x: x == True]
               .index)

With this solution you don't even have to convert the list to a DataFrame.
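
The same tuple check can also be written without a row-wise apply. The sketch below is one possible variant, not part of the answer above: it zips the two columns and tests each pair against a set built from lst. The names lookup, keep and result are illustrative.

```python
import pandas as pd

data = [
    {'col1': 11, 'col2': 111, 'col3': 1111},
    {'col1': 22, 'col2': 222, 'col3': 2222},
    {'col1': 33, 'col2': 333, 'col3': 3333},
    {'col1': 44, 'col2': 444, 'col3': 4444},
]
lst = [(11, 111), (22, 222), (99, 999)]

df = pd.DataFrame(data)

# A set gives average constant-time membership tests.
lookup = set(lst)

# zip walks both columns in step, yielding one (col1, col2) tuple per row.
keep = [pair not in lookup
        for pair in zip(df['col1'].tolist(), df['col2'].tolist())]

# A list of booleans is applied by position, keeping the original row labels.
result = df[keep]
print(result)
```

On large frames this is likely to be quicker than apply with axis=1, since it avoids constructing a Series object for every row.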


Tags:

python

pandas

dina
