Pandas drop rows not in labels

Question

I have two Pandas DataFrames (A and B) with overlapping indexes. I want to drop the rows in DataFrame B for which that index value doesn't exist in DataFrame A.

I have looked into Pandas drop method for DataFrames but it drops the columns with the given labels, I would like to drop the columns for which there are not given labels.

For now I managed to do this with:

B.drop(B.drop(A.index).index)

But this is clearly not the best way to do it (not efficient or readable). Is there a better way?

Example:

DataFrame A:

   index       col1  
     1       some_data    
     2       some_data    
     3       some_data   
     4       some_data

DataFrame B:

   index       col2  
     1       other_data    
     2       other_data    
     3       other_data   
     4       other_data   
     5       other_data   
     6       other_data

I want to obtain DataFrame B':

   index       col2  
     1       other_data    
     2       other_data    
     3       other_data   
     4       other_data

jezrael · Accepted Answer

I think you can use isin with boolean indexing:

print B.index.isin(A.index)
[ True  True  True  True False False]

print B[B.index.isin(A.index)]
             col2
index            
1      other_data
2      other_data
3      other_data
4      other_data

EdChum · Answer

You can use difference to find the row labels that are not in the other df index:

In [6]:
df2.drop(df2.index.difference(df1.index))

Out[6]:
             col2
index            
1      other_data
2      other_data
3      other_data
4      other_data

Pandas drop rows not in labels

Tags:

python

pandas

dataframe

João Almeida

2 Answers

jezrael

EdChum

Recent Activity

Donate For Us

Pandas drop rows not in labels

Tags:

python

pandas

dataframe

João Almeida

2 Answers

jezrael

EdChum

Related questions

Recent Activity

Donate For Us