I have a DataFrame where one column contains lists as cell contents, something like the following:
import pandas as pd
df = pd.DataFrame({
    'col_lists': [[1, 2, 3], [5]],
    'col_normal': [8, 9]
})
>>> df
   col_lists  col_normal
0  [1, 2, 3]           8
1        [5]           9
I would like to apply some transformation to each element of col_lists, for example:
df['col_lists'] = df.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)
>>> df
      col_lists  col_normal
0  [1, None, 3]           8
1           [5]           9
With this DataFrame it works as I expect. However, when I apply the same code to another DataFrame I get a bizarre result. For each row, the column contains only the first element of the list:
df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]], # length of first list is smaller here
    'col_normal': [8, 9]
})
df2['col_lists'] = df2.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)
>>> df2
   col_lists  col_normal
0        1.0           8
1        5.0           9
I have two questions:
(1) What is going on here? Why am I getting a correct result for df, but not for df2?
(2) How can I correctly apply some transformations to lists within a DataFrame?
First, I think working with lists in pandas is not a good idea.
But if you really need it, try upgrading pandas, because for me it works fine in pandas 0.23.4:
df2['col_lists'] = df2.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)
print(df2)
   col_lists  col_normal
0  [1, None]           8
1        [5]           9
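If upgrading is not an option, one thing that may be worth trying is to apply the function to the column itself (Series.apply) rather than to whole rows with DataFrame.apply and axis=1. The sketch below assumes the transformation only needs the values in col_lists; the helper name mask_even is hypothetical and used only for illustration.

```python
import pandas as pd


def mask_even(values):
    """Return a new list with even elements replaced by None (hypothetical helper)."""
    return [None if v % 2 == 0 else v for v in values]


df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]],
    'col_normal': [8, 9]
})

# Series.apply calls mask_even once per cell; each returned list is stored
# as a single object in the resulting column instead of being expanded.
df2['col_lists'] = df2['col_lists'].apply(mask_even)
print(df2)
```

Since the function never sees a whole row here, pandas has no reason to interpret the returned list as a set of column values, so col_lists should end up holding [1, None] and [5].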