For example, two dataframes named df1 and df2 look like this:
### df1
Name Code Mass
5N11 s1 0.1545
5N12 NaN 0.22
5N13 s3 0.2123
5N15 s5 0.1486
5N17 NaN 0.2100
### df2
Name Code Mass
5N12 s2 0.22
5N13 NaN 0.2123
5N14 s4 0.35
5N16 s6 0.07
5N17 s7 0.2100
Some background:
Some Name values appear in both df1 and df2.
For a Name that appears in both, the Mass values in df1 and df2 are equal.
What I'm trying to do is merge these two dataframes on Name, combining the Code and Mass for each Name.
My attempt seems to work:
df = pd.concat([df1, df2], ignore_index=True)
df = df.dropna(subset=["Code"])
df = pd.merge(df.groupby('Name').sum().reset_index(),
              df[['Name', 'Code', 'Mass']].drop_duplicates(),
              how='right')
It seems to reproduce the right result.
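The assumption that a shared Name carries the same Mass in both dataframes can be checked before merging. The following is only a minimal sketch: df1 and df2 are rebuilt from the tables above so the snippet runs on its own, and the names `shared` and `mismatched` are illustrative, not part of the original attempt.

```python
import numpy as np
import pandas as pd

# Rebuild the two example frames from the tables above
df1 = pd.DataFrame({
    "Name": ["5N11", "5N12", "5N13", "5N15", "5N17"],
    "Code": ["s1", np.nan, "s3", "s5", np.nan],
    "Mass": [0.1545, 0.22, 0.2123, 0.1486, 0.2100],
})
df2 = pd.DataFrame({
    "Name": ["5N12", "5N13", "5N14", "5N16", "5N17"],
    "Code": ["s2", np.nan, "s4", "s6", "s7"],
    "Mass": [0.22, 0.2123, 0.35, 0.07, 0.2100],
})

# An inner merge on Name keeps only the names present in both frames
shared = df1.merge(df2, on="Name", suffixes=("_1", "_2"))

# Rows where the two Mass columns disagree (compared with a float tolerance)
mismatched = shared[~np.isclose(shared["Mass_1"], shared["Mass_2"])]

print(sorted(shared["Name"]))
print(mismatched.empty)
```

If `mismatched` is not empty, the two sources disagree on Mass for at least one Name, and any merge would have to pick one of the values.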

Start with df1 and fill in from df2 where df1 is missing. This requires setting 'Name' as the index of each dataframe, so that rows are aligned by Name:
df1.set_index('Name').combine_first(df2.set_index('Name'))
from io import StringIO
import pandas as pd
text1 = """Name Code Mass
5N11 s1 0.1545
5N12 NaN 0.22
5N13 s3 0.2123
5N15 s5 0.1486
5N17 NaN 0.2100"""
text2 = """Name Code Mass
5N12 s2 0.22
5N13 NaN 0.2123
5N14 s4 0.35
5N16 s6 0.07
5N17 s7 0.2100"""
# sep=r"\s+" splits on any run of whitespace (delim_whitespace is deprecated in recent pandas)
df1 = pd.read_csv(StringIO(text1), sep=r"\s+")
df2 = pd.read_csv(StringIO(text2), sep=r"\s+")
df1.set_index('Name').combine_first(df2.set_index('Name')).reset_index()
The result looks like:
   Name Code    Mass
0  5N11   s1  0.1545
1  5N12   s2  0.2200
2  5N13   s3  0.2123
3  5N14   s4  0.3500
4  5N15   s5  0.1486
5  5N16   s6  0.0700
6  5N17   s7  0.2100
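It may also help to see why the order of the two frames matters. `combine_first` keeps the calling frame's value wherever it is non-null and only falls back to the other frame for missing entries, and the resulting index is the union of both indexes. The sketch below uses small made-up frames, and the helper name `merge_by_name` is hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd


def merge_by_name(left, right):
    """Fill missing values in `left` from `right`, aligning rows on 'Name'.

    Hypothetical helper for illustration; not part of pandas.
    """
    combined = left.set_index("Name").combine_first(right.set_index("Name"))
    return combined.reset_index()


# Made-up data: 5N12 has a missing Code on the left,
# and 5N13 has a conflicting, non-null Code on the right.
left = pd.DataFrame({
    "Name": ["5N12", "5N13"],
    "Code": [np.nan, "s3"],
    "Mass": [0.22, 0.2123],
})
right = pd.DataFrame({
    "Name": ["5N12", "5N13", "5N14"],
    "Code": ["s2", "zz", "s4"],
    "Mass": [0.22, 0.2123, 0.35],
})

result = merge_by_name(left, right)
print(result)
```

Here 5N12 picks up its Code from `right` because the value in `left` is missing, 5N13 keeps the Code from `left` even though `right` has a different one, and 5N14 is added because it exists only in `right`. Swapping the arguments would make `right` take precedence instead.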