
Get the missing columns from one dataframe and append it to another dataframe

I have a DataFrame df1. I need to compare its column headers with the list of column headers from df2:

df1 = ['a', 'b', 'c', 'd', 'f']
df2 = ['a', 'b', 'c', 'd', 'e', 'f']

I need to compare df1 with df2 and, if any columns are missing, add them to df1 with blank values.

I tried concat and append, and neither worked. With concat, I can't add column e; with append, all the columns from df1 and df2 get appended. How can I add only the missing column to df1, in the same order?

df1_cols = df1.columns
df2_cols = df2.columns

# Index.equals copes with different lengths; (df1_cols == df2_cols).all() raises in that case
if df1_cols.equals(df2_cols):
    df1.to_csv(path + file_name, sep='|')
else:
    print("something is missing, continuing")
    # Neither attempt adds only the missing column:
    # pd.concat([my_df, flat_data_frame], ignore_index=False, sort=False)
    all_list = my_df.append(flat_data_frame, ignore_index=False, sort=False)

I wanted to see the results as

a|b|c|d|e|f -> headers
1|2|3|4||5 -> values
Sparkles asked Sep 05 '25 03:09



1 Answer

pandas.DataFrame.align

df1.align(df2, axis=1)[0]
  • By default this does an 'outer' join
  • By specifying axis=1 we focus on columns
  • This returns a tuple of both an aligned df1 and df2 with the calling dataframe being the first element. So I grab the first element with [0]
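As a minimal sketch of the align approach, assuming two small made-up frames whose columns match the question (the values are purely illustrative):

```python
import pandas as pd

# Hypothetical data: df1 lacks column 'e', df2 has every column.
df1 = pd.DataFrame([[1, 2, 3, 4, 5]], columns=['a', 'b', 'c', 'd', 'f'])
df2 = pd.DataFrame([[0, 0, 0, 0, 0, 0]], columns=['a', 'b', 'c', 'd', 'e', 'f'])

# align returns (aligned_df1, aligned_df2); axis=1 aligns the columns only,
# so the rows of df1 are left untouched.
aligned_df1, aligned_df2 = df1.align(df2, axis=1)

print(list(aligned_df1.columns))      # ['a', 'b', 'c', 'd', 'e', 'f']
print(aligned_df1['e'].isna().all())  # True: the added column holds NaN
```

The added column holds NaN rather than an empty string; to_csv writes NaN as an empty field by default.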

pandas.DataFrame.reindex

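df1.reindex(columns=df1.columns.union(df2.columns))
  • You can treat pandas.Index objects like sets most of the time. So df1.columns.union(df2.columns) is the union of those two index objects. I then reindex using the result. (Older pandas versions also accepted df1.columns | df2.columns for the union, but since pandas 2.0 the | operator on an Index is an element-wise logical OR, so union is the safer spelling.)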
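A short sketch of the reindex approach, again on made-up data and using Index.union for the set union. The second variant, which reindexes directly on df2's columns, is an assumption about what "in the same order" means in the question; note that it would also drop any df1 column that df2 lacks.

```python
import pandas as pd

# Hypothetical data matching the question's column names.
df1 = pd.DataFrame([[1, 2, 3, 4, 5]], columns=['a', 'b', 'c', 'd', 'f'])
df2_cols = pd.Index(['a', 'b', 'c', 'd', 'e', 'f'])

# Union of both column sets (sorted when the labels are sortable).
by_union = df1.reindex(columns=df1.columns.union(df2_cols))

# Alternative: adopt df2's column order exactly as given.
by_df2_order = df1.reindex(columns=df2_cols)

# NaN is written as an empty field by default, which yields the blank 'e' value.
csv_text = by_df2_order.to_csv(sep='|', index=False)
print(csv_text)
```

On this data both variants give the columns a through f, and the CSV text contains the header line a|b|c|d|e|f followed by 1|2|3|4||5.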
piRSquared answered Sep 07 '25 19:09
