This is perfectly legal in Python:
In [1]: 'abc' + 'def'
Out[1]: 'abcdef'
If I have an all-text pandas DataFrame, like the example below:
In [2]: df = pd.DataFrame([list('abcd'), list('efgh'), list('ijkl')],
   ...:                   columns=['C1', 'C2', 'C3', 'C4'])
   ...: df.loc[[0, 2], ['C2', 'C3']] = np.nan
   ...: df
Out[2]:
  C1   C2   C3 C4
0  a  NaN  NaN  d
1  e    f    g  h
2  i  NaN  NaN  l
Is it possible to do the same with columns of the above DataFrame? Something like:
In [3]: df.apply(+, axis=1)  # or
   ...: df.sum(axis=1)
Note that neither of the statements above works. Using .str.cat() in a loop is easy, but I am looking for something better.
Expected output is:
Out[3]:
      C
0    ad
1  efgh
2    il
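For reference, the .str.cat() loop the question alludes to might look like this (a sketch, using na_rep='' so missing values contribute nothing):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list('abcd'), list('efgh'), list('ijkl')],
                  columns=['C1', 'C2', 'C3', 'C4'])
df.loc[[0, 2], ['C2', 'C3']] = np.nan

# Start from the first column and fold in the rest one at a time.
result = df.iloc[:, 0]
for col in df.columns[1:]:
    result = result.str.cat(df[col], na_rep='')
```

This works, but it is one Python-level pass per column, which is what the vectorized answers below avoid.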
You could do
df.fillna('').sum(axis=1)
Of course, this assumes that your dataframe is made up only of strings and NaNs.
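A quick check on the example frame (assuming, as noted, only strings and NaN): after fillna(''), row-wise sum is plain string concatenation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list('abcd'), list('efgh'), list('ijkl')],
                  columns=['C1', 'C2', 'C3', 'C4'])
df.loc[[0, 2], ['C2', 'C3']] = np.nan

# NaN -> '' so that adding across a row concatenates the strings.
out = df.fillna('').sum(axis=1)
```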
Option 1
stack
I wanted to add this for demonstration. We don't have to accept the rectangular nature of the dataframe: we can use stack, which drops NaN by default, leaving us with a vector of strings indexed by a pd.MultiIndex. We can then group by the first level of that pd.MultiIndex (the original row labels) and sum:
df.stack().groupby(level=0).sum()
0 ad
1 efgh
2 il
dtype: object
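To see the intermediate step on the example frame: stack produces a Series of the eight non-NaN cells, keyed by (row label, column label), and the groupby collapses each row label back into one concatenated string.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list('abcd'), list('efgh'), list('ijkl')],
                  columns=['C1', 'C2', 'C3', 'C4'])
df.loc[[0, 2], ['C2', 'C3']] = np.nan

stacked = df.stack()                    # NaN cells dropped; MultiIndex of (row, column)
result = stacked.groupby(level=0).sum() # sum of strings per row label = concatenation
```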
Option 2
Use Masked Arrays np.ma.masked_array
I was motivated by @jezrael to post a faster solution (-:
pd.Series(
    np.ma.masked_array(
        df.values,
        df.isnull().values,
    ).filled('').sum(1),
    df.index
)
0 ad
1 efgh
2 il
dtype: object
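The mechanics, spelled out on the example frame: the boolean mask from df.isnull() hides the NaN cells, filled('') replaces the hidden cells with empty strings, and summing an object array along axis 1 concatenates row-wise (a sketch):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list('abcd'), list('efgh'), list('ijkl')],
                  columns=['C1', 'C2', 'C3', 'C4'])
df.loc[[0, 2], ['C2', 'C3']] = np.nan

masked = np.ma.masked_array(df.values, df.isnull().values)
filled = masked.filled('')              # masked (NaN) cells become ''
out = pd.Series(filled.sum(1), df.index)
```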
Timing
df = pd.concat([df]*1000).reset_index(drop=True)
%%timeit
pd.Series(
    np.ma.masked_array(
        df.values,
        df.isnull().values,
        fill_value=''
    ).filled('').sum(1),
    df.index
)
1000 loops, best of 3: 860 µs per loop
%timeit (pd.Series(df.fillna('').values.sum(axis=1), index=df.index))
1000 loops, best of 3: 1.33 ms per loop