It is easy to create (or load) a DataFrame with an object-typed column, like so:
[In]: import pandas as pd
[In]: pdf = pd.DataFrame({
          "a": [1, 2, 3],
          "b": [4, 5, 6],
          "c": [7, 8, 9],
          "combined": [[1, 4, 7], [2, 5, 8], [3, 6, 9]],
      })
[Out]:
   a  b  c   combined
0  1  4  7  [1, 4, 7]
1  2  5  8  [2, 5, 8]
2  3  6  9  [3, 6, 9]
I currently have values in separate columns that I need to return as a single column, and I need to do this efficiently. Is there a fast way to combine several columns into a single object-typed column?
In the example above, this would mean already having columns a, b, and c, and wanting to create combined.
I could not find a similar question online; feel free to link one if this is a duplicate.
Use DataFrame.agg, passing list as the aggregation function with axis=1, then assign the result to a new column:
>>> pdf.assign(combined=pdf.agg(list, axis=1))
   a  b  c   combined
0  1  4  7  [1, 4, 7]
1  2  5  8  [2, 5, 8]
2  3  6  9  [3, 6, 9]
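Since the question stresses efficiency, here is a possible alternative that is not part of the answer above: convert the source columns to a single NumPy array and turn it into nested Python lists in one step. This is a minimal sketch that assumes the columns share a compatible dtype (mixed dtypes are upcast to a common type by to_numpy, so integers next to floats would come back as floats). The name cols is only illustrative.

```python
import pandas as pd

pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Columns to pack into one list per row (illustrative name, not from the original post).
cols = ["a", "b", "c"]

# to_numpy() builds one 2-D array from the selected columns;
# tolist() converts it to a list of per-row Python lists.
result = pdf.assign(combined=pdf[cols].to_numpy().tolist())

print(result)
```

Because this avoids calling a Python function once per row, it is often faster than a row-wise agg on large frames, but it is worth timing both on your own data.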