Pandas

Question

Given two separate dataframes, I'd like to combine them into one and unify their corresponding columns under a shared set of names.

Example:

In[1]: df1

Out[1]: 
   a_id     a_time a_val
0     1  100000000     a
1     2  200000000     b
2     3  300000000     c

In[10]: df2

Out[10]: 
   b_id     b_time b_val
0     1  100000000     d
1     2  150000000     e
2     3  350000000     f

The resulting dataframe I'm looking for is the following:

   id       time val
0   1  100000000   a
1   1  100000000   d
2   2  150000000   e
3   2  200000000   b
4   3  300000000   c
5   3  350000000   f

Assuming all IDs are present in both tables, the result should be of length len(df1) + len(df2).

I was looking at some results using .stack(), but I couldn't figure out how to make it work when merging two tables.

Note that the time values for a given ID could be the same in both tables, or they could differ.

jezrael · Accepted Answer

I think you need the same column names in both dataframes, and then you can use concat + sort_values + reset_index:

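import pandas as pd

# Give both frames the same column names so concat lines the columns up.
cols = ['id', 'time', 'val']
df1.columns = cols
df2.columns = cols

# Stack the rows, sort them by id, and renumber the index from 0.
df = pd.concat([df1, df2]).sort_values('id').reset_index(drop=True)

print(df)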
   id       time val
0   1  100000000   a
1   1  100000000   d
2   2  200000000   b
3   2  150000000   e
4   3  300000000   c
5   3  350000000   f
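
The output above is ordered by id only, so within id 2 the row with time 200000000 comes before the one with time 150000000, whereas the table in the question lists the earlier time first. A minimal sketch of one way to get that ordering is to sort on both id and time. The sample frames are rebuilt from the question so the snippet runs on its own. strip_prefix is a hypothetical helper introduced here for illustration, not part of pandas or of the original answer. It uses rename so that df1 and df2 keep their original column names. The sketch does not rely on any particular order among rows that tie on both id and time.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({
    "a_id": [1, 2, 3],
    "a_time": [100000000, 200000000, 300000000],
    "a_val": ["a", "b", "c"],
})
df2 = pd.DataFrame({
    "b_id": [1, 2, 3],
    "b_time": [100000000, 150000000, 350000000],
    "b_val": ["d", "e", "f"],
})


def strip_prefix(frame, prefix):
    """Hypothetical helper: return a copy of frame with prefix removed
    from the start of every column name that has it."""
    return frame.rename(
        columns=lambda name: name[len(prefix):] if name.startswith(prefix) else name
    )


# Align the column names, stack the rows, then sort by id and time.
combined = (
    pd.concat([strip_prefix(df1, "a_"), strip_prefix(df2, "b_")], ignore_index=True)
    .sort_values(["id", "time"])
    .reset_index(drop=True)
)

print(combined)
```

Because rename returns a new dataframe, the inputs are left untouched, which may matter if df1 and df2 are used again later under their original column names.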

Pandas - Merge two dataframes and unify set of columns

Tags:

python

bluesummers
