I am merging two data frames with pandas. I would like to avoid that, when joining, the output includes the join column of the right table.
Example:
import pandas as pd
age = [['tom', 10], ['nick', 15], ['juli', 14]]
df1 = pd.DataFrame(age, columns = ['Name', 'Age'])
toy = [['tom', 'GIJoe'], ['nick', 'car']]
df2 = pd.DataFrame(toy, columns = ['Name_child', 'Toy'])
df = pd.merge(df1,df2,left_on='Name',right_on='Name_child',how='left')
df.columns
will give the output Index(['Name', 'Age', 'Name_child', 'Toy'], dtype='object')
. Is there an easy way to obtain Index(['Name', 'Age', 'Toy'], dtype='object')
instead? I can drop the column afterwards of course like this del df['Name_child']
, but I'd like my code to be as short as possible.
Based on @mgc comments, you don't have to rename the columns of df2. Just you pass df2 to merge
function with renamed columns. df2 column names will remain as it is.
df = pd.merge(df1,df2.rename(columns={'Name_child': 'Name'}),on='Name', how='left')
df
Name Age Toy
0 tom 10 GIJoe
1 nick 15 car
2 juli 14 NaN
df.columns
Index(['Name', 'Age', 'Toy'], dtype='object')
df2.columns
Index(['Name_child', 'Toy'], dtype='object')
Seems to be even simpler to drop the column right after.
df = (pd.merge(df1,df2,left_on='Name',right_on='Name_child',how='left')
.drop('Name_child', axis=1))
#----------------
import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With