I am trying to concatenate several columns, all strings, in a pandas DataFrame to form a new column. I am using .str.cat so that I can put a line break ('\n') between the values being concatenated.
However, if any column in a row is blank or NaN I get NaN as the full result for that row.
I have looked at the options, and the third answer to this question looks interesting: pandas combine two strings ignore nan values
However, I cannot see a way to extend it to more than two columns while still using only some of the DataFrame's columns, which is what I need.
The first two answers do not allow a line break to be added between the values, which is also required.
Here is my code:
mydf['Address'] = mydf['full name'].str.cat(mydf['Delivery address 1'], sep ='\n').str.cat(mydf['Delivery address 2'], sep ='\n').str.cat(mydf['Delivery city'], sep ='\n').str.cat(mydf['Delivery state'], sep ='\n').str.cat(mydf['Delivery postcode'], sep ='\n')
This results in a blank mydf['Address'] for any row where any of the fields is blank.
What is the error in my code or my approach?
I think you need apply with axis=1 to process the DataFrame row by row, with dropna to remove the NaNs from each row before joining:
# columns to join
cols = ['full name', 'Delivery address 1', 'Delivery address 2',
        'Delivery city', 'Delivery state', 'Delivery postcode']
mydf['Address'] = mydf[cols].apply(lambda x: '\n'.join(x.dropna()), axis=1)
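To see the row-wise join in action, here is a minimal sketch on made-up data. The sample names and addresses are invented for illustration; only the column names come from the question. Note that dropna() removes only NaN/None, so a genuinely empty string would still contribute an empty line unless it is converted to NaN first.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
mydf = pd.DataFrame({
    'full name': ['Ann Lee', 'Bob Ray'],
    'Delivery address 1': ['1 High St', '2 Low Rd'],
    'Delivery address 2': [np.nan, 'Unit 5'],
    'Delivery city': ['Perth', 'Hobart'],
    'Delivery state': ['WA', np.nan],
    'Delivery postcode': ['6000', '7000'],
})

cols = ['full name', 'Delivery address 1', 'Delivery address 2',
        'Delivery city', 'Delivery state', 'Delivery postcode']

# For each row: drop the missing values, then join what is left with a newline.
mydf['Address'] = mydf[cols].apply(lambda x: '\n'.join(x.dropna()), axis=1)

print(mydf['Address'].tolist())
```

Each row keeps only the fields that are present, so no row ends up as NaN just because one field is missing.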
If some of the columns are numeric, convert the values to strings before joining:
mydf['Address'] = mydf[cols].apply(lambda x: '\n'.join(x.dropna().astype(str)), axis=1)
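As a rough illustration of the numeric case, the sketch below uses an integer postcode column. The data is invented and the column list is shortened for brevity, so treat it as an assumption-laden example, not the asker's real frame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an integer postcode column.
mydf = pd.DataFrame({
    'full name': ['Ann Lee', 'Bob Ray'],
    'Delivery city': ['Perth', np.nan],
    'Delivery postcode': [6000, 7000],
})

cols = ['full name', 'Delivery city', 'Delivery postcode']

# astype(str) turns the integers into strings so that str.join accepts them.
mydf['Address'] = mydf[cols].apply(
    lambda x: '\n'.join(x.dropna().astype(str)), axis=1)

print(mydf['Address'].tolist())
```

One caveat: if the numeric column itself contains NaN, pandas stores it as float, so the remaining values would be rendered like '6000.0' unless the column is converted beforehand.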