Python – Pandas apply when cells contain lists

I have a DataFrame with a column whose cells contain lists, as follows:

import pandas as pd
df = pd.DataFrame({
    'col_lists': [[1, 2, 3], [5]],
    'col_normal': [8, 9]
})

>>> df
   col_lists  col_normal
0  [1, 2, 3]           8
1        [5]           9

I want to apply some transformations to each element of col_lists, for example:

df['col_lists'] = df.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)

>>> df
      col_lists  col_normal
0  [1, None, 3]           8
1           [5]           9

This worked as I expected for this DataFrame. However, when I applied the same code to another DataFrame, I got a strange result: for each row, the column contains only the first element of the list:

df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]],  # the first list is shorter here
    'col_normal': [8, 9]
})

df2['col_lists'] = df2.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)

>>> df2
   col_lists  col_normal
0        1.0           8
1        5.0           9

I have two questions:

(1) What happened here? Why do I get the correct result for df but not for df2?

(2) How can I correctly apply a transformation to the lists in a DataFrame column?

Solution

First, I don't think storing lists in pandas cells is a good idea.

But if you really need it, try upgrading pandas; the same code works fine for me in pandas 0.23.4:

df2['col_lists'] = df2.apply(
    lambda row: [ None if (element % 2 == 0) else element for element in row['col_lists'] ], 
    axis=1
)

print(df2)
   col_lists  col_normal
0  [1, None]           8
1        [5]           9
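
If upgrading is not an option, one alternative worth trying is to apply the function to the column itself rather than to whole rows. The likely cause of the odd result is that older pandas versions tried to infer the output shape of DataFrame.apply with axis=1 when the function returned lists, and that handling was made consistent in pandas 0.23.0. Series.apply calls the function once per cell and does not go through that row-wise inference. The sketch below assumes the same df2 as in the question; the helper name mask_even is made up for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]],
    'col_normal': [8, 9]
})

def mask_even(values):
    """Return a new list with even numbers replaced by None (hypothetical helper)."""
    return [None if v % 2 == 0 else v for v in values]

# Series.apply passes each cell (here, each list) to the function and stores
# the returned list as a single object in the resulting Series.
df2['col_lists'] = df2['col_lists'].apply(mask_even)

print(df2)
```

This only touches col_lists, so it also avoids building a full row object for every call, which is usually cheaper than apply with axis=1.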
