
How do I horizontally concatenate pandas DataFrames in Python?

I have tried several different ways to horizontally concatenate DataFrame objects from the Python Data Analysis Library (pandas), but my attempts have failed so far.

Desired Output for Given Inputs:

I have two dataframes:
df_1:

      col2    col3
col1                
str1     1  1.5728
str2     2  2.4627
str3     3  3.6143

df_2:

      col2    col3
col1              
str1     4  4.5345
str2     5  5.1230
str3     6  6.1233

I want the final resulting dataframe to be df_1 and df_2 side by side:

      col2    col3    col1  col2   col3
col1                                  
str1     1  1.5728    str1     4  4.5345
str2     2  2.4627    str2     5  5.1230
str3     3  3.6143    str3     6  6.1233

Creating Test Inputs:

Here is some code which creates the dataframes:

import pandas as pd

column_headers = ["col1", "col2", "col3"]
d_1 = dict.fromkeys(column_headers)
d_1["col1"] = ["str1", "str2", "str3"]
d_1["col2"] = [1, 2, 3]
d_1["col3"] = [1.5728, 2.4627, 3.6143]
df_1 = pd.DataFrame(d_1)
df_1 = df_1.set_index("col1")
print("df_1:")
print(df_1)
print()


d_2 = dict.fromkeys(column_headers)
d_2["col1"] = ["str1", "str2", "str3"]
d_2["col2"] = [4, 5, 6]
d_2["col3"] = [4.5345, 5.123, 6.1233]
df_2 = pd.DataFrame(d_2)
df_2 = df_2.set_index("col1")
print("df_2:")
print(df_2)
print()

Failed Attempts:

Failed Solution 1:

An outer join fails to horizontally concatenate df_1 and df_2:

merged_df = df_1.join(df_2, how='outer')

We get the following error message:

ValueError: columns overlap but no suffix specified: Index(['col2', 'col3'], dtype='object')
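
For reference, this error comes from the two frames sharing the column labels col2 and col3, and join has lsuffix and rsuffix parameters for exactly that case. A minimal sketch, assuming renamed columns are acceptable (the suffixes "_1" and "_2" are arbitrary choices, and the frames are rebuilt here only so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the two test frames, indexed by "col1".
index = pd.Index(["str1", "str2", "str3"], name="col1")
df_1 = pd.DataFrame({"col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}, index=index)
df_2 = pd.DataFrame({"col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}, index=index)

# The suffixes are appended to overlapping column labels only;
# "_1" and "_2" are placeholder names chosen for this sketch.
joined_df = df_1.join(df_2, how="outer", lsuffix="_1", rsuffix="_2")
print(joined_df)
```

The join then goes through, but the columns come back renamed (col2_1, col3_1, col2_2, col3_2) rather than keeping their original labels.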

Failed Solution 2:

Making a dictionary of dictionaries does not work:

# Make a dictionary of dictionaries
merged_d = dict()
merged_d[1] = d_1
merged_d[2] = d_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)

The resulting DataFrame looks like this:

                             1                        2
col1        [str1, str2, str3]       [str1, str2, str3]
col2                 [1, 2, 3]                [4, 5, 6]
col3  [1.5728, 2.4627, 3.6143]  [4.5345, 5.123, 6.1233]

Failed Solution 3:

Subattempt 3a:

Making a dictionary of DataFrames does not seem to work either:

merged_d = dict()
merged_d[1] = df_1
merged_d[2] = df_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)

We get the following error message:

ValueError: If using all scalar values, you must pass an index

Subattempt 3b:

Passing an index into the DataFrame constructor does not help much:

merged_df = pd.DataFrame(data = merged_d, index = [1, 2])

We get the error:

ValueError: cannot copy sequence with size 2 to array axis with dimension 3
Toothpick Anemone asked Dec 11 '25 14:12


1 Answer

Use pd.concat with axis=1 instead of join, i.e.:

ndf = pd.concat([df_1, df_2], axis=1)

      col2    col3  col2    col3
col1                            
str1     1  1.5728     4  4.5345
str2     2  2.4627     5  5.1230
str3     3  3.6143     6  6.1233
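
Because the result carries duplicate column labels, selecting ndf["col2"] returns both col2 columns. A self-contained sketch of two possible follow-ups, neither of which is part of the original answer: passing keys to concat to label each block, and copying the index into an ordinary column to reproduce the repeated col1 from the desired output. The labels "df_1" and "df_2" and the helper name right are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the two test frames, indexed by "col1".
index = pd.Index(["str1", "str2", "str3"], name="col1")
df_1 = pd.DataFrame({"col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}, index=index)
df_2 = pd.DataFrame({"col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}, index=index)

# Plain side-by-side concatenation, aligned on the shared index.
ndf = pd.concat([df_1, df_2], axis=1)
print(ndf)

# Variant 1: label each block with an outer column level so the
# duplicate names stay distinguishable (labels are hypothetical).
labeled = pd.concat([df_1, df_2], axis=1, keys=["df_1", "df_2"])
print(labeled["df_2"])  # only the columns that came from df_2

# Variant 2: repeat col1 as a regular column in the right-hand block,
# matching the layout shown in the question's desired output.
right = df_2.copy()
right.insert(0, "col1", right.index)
with_col1 = pd.concat([df_1, right], axis=1)
print(with_col1)
```

The keys variant produces MultiIndex columns, which may or may not suit later processing; the plain concat is the closest match to the answer above.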
Bharath answered Dec 14 '25 02:12


