
In Pandas, how to get the value_counts() of a Series containing lists

Tags: python, pandas

I have a pandas series df.files which looks like this:

In [79]: df.files
Out[79]:
0        [{'url': 'http://www.apkmirror.com/wp-content/...
1        [{'url': 'http://www.apkmirror.com/wp-content/...
2        [{'url': 'http://www.apkmirror.com/wp-content/...
3        [{'url': 'http://www.apkmirror.com/wp-content/...
4        [{'url': 'http://www.apkmirror.com/wp-content/...
5        [{'url': 'http://www.apkmirror.com/wp-content/...
6        [{'url': 'http://www.apkmirror.com/wp-content/...
7        [{'url': 'http://www.apkmirror.com/wp-content/...
8        [{'url': 'http://www.apkmirror.com/wp-content/...
9        [{'url': 'http://www.apkmirror.com/wp-content/...
10       [{'url': 'http://www.apkmirror.com/wp-content/...
11       [{'url': 'http://www.apkmirror.com/wp-content/...
12       [{'url': 'http://www.apkmirror.com/wp-content/...
13       [{'url': 'http://www.apkmirror.com/wp-content/...
14       [{'url': 'http://www.apkmirror.com/wp-content/...
15       [{'url': 'http://www.apkmirror.com/wp-content/...
16       [{'url': 'http://www.apkmirror.com/wp-content/...
17       [{'url': 'http://www.apkmirror.com/wp-content/...
18       [{'url': 'http://www.apkmirror.com/wp-content/...
19       [{'url': 'http://www.apkmirror.com/wp-content/...
20       [{'url': 'http://www.apkmirror.com/wp-content/...
21       [{'url': 'http://www.apkmirror.com/wp-content/...
22       [{'url': 'http://www.apkmirror.com/wp-content/...
23       [{'url': 'http://www.apkmirror.com/wp-content/...
24       [{'url': 'http://www.apkmirror.com/wp-content/...
25       [{'url': 'http://www.apkmirror.com/wp-content/...
26       [{'url': 'http://www.apkmirror.com/wp-content/...
27       [{'url': 'http://www.apkmirror.com/wp-content/...
28       [{'url': 'http://www.apkmirror.com/wp-content/...
29       [{'url': 'http://www.apkmirror.com/wp-content/...
                               ...                        
16487    [{'url': 'http://www.apkmirror.com/wp-content/...
16488                                                   []
16489    [{'url': 'http://www.apkmirror.com/wp-content/...
16490    [{'url': 'http://www.apkmirror.com/wp-content/...
16491                                                   []
16492    [{'url': 'http://www.apkmirror.com/wp-content/...
16493    [{'url': 'http://www.apkmirror.com/wp-content/...
16494    [{'url': 'http://www.apkmirror.com/wp-content/...
16495                                                   []
16496                                                   []
16497                                                   []
16498    [{'url': 'http://www.apkmirror.com/wp-content/...
16499    [{'url': 'http://www.apkmirror.com/wp-content/...
16500    [{'url': 'http://www.apkmirror.com/wp-content/...
16501    [{'url': 'http://www.apkmirror.com/wp-content/...
16502    [{'url': 'http://www.apkmirror.com/wp-content/...
16503                                                   []
16504                                                   []
16505                                                   []
16506                                                   []
16507                                                   []
16508                                                   []
16509                                                   []
16510                                                   []
16511                                                   []
16512                                                   []
16513                                                   []
16514                                                   []
16515                                                   []
16516                                                   []

Some of the values are empty lists, while others are lists containing a single dictionary with a format similar to the following:

In [80]: df.files.loc[0]
Out[80]: 
[{'checksum': '9f6075f4c561792e48354277b46a6810',
  'path': 'full/80832b9fca82ce0f58f4d23c511e5a1d657c40e8.php?id=2968',
  'url': 'http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id=2968'}]

I would like to find out how many of the entries of df.files are actually empty lists. However, if I try df.files.value_counts(), I get a TypeError: unhashable type: 'list'. How might I go about solving this?

Kurt Peek asked Jan 22 '26 11:01


2 Answers

If you want to use value_counts, you can convert each list to a tuple first. Note that this only helps when the list elements are themselves hashable:

vc = df.files.apply(tuple).value_counts()
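One caveat, as a minimal sketch on made-up data rather than the real df.files: a tuple is only hashable if everything inside it is hashable, and the lists in the question hold dicts. Counting by the repr string of each list is one possible workaround. The sample Series and the names files, hashable and vc are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for df.files: lists holding one dict, plus empty lists.
files = pd.Series([
    [{'url': 'a', 'checksum': '1'}],
    [],
    [{'url': 'b', 'checksum': '2'}],
    [],
    [],
])

# A tuple that contains a dict is still unhashable in plain Python.
try:
    hash(tuple(files.iloc[0]))
    hashable = True
except TypeError:
    hashable = False

# Workaround (an assumption, not the one-liner above): count by repr string.
# Equal dicts with a different key order would get different strings.
vc = files.map(repr).value_counts()
n_empty = vc['[]']  # the count for the empty-list key
```

The string keys are only good for counting; if you need the original lists back, keep the original column alongside.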

But if you only need the number of empty lists, use str.len to get the length of each list, then sum the True values of the boolean mask:

l = (df['files'].str.len() == 0).sum()

If NaN values are not possible, you can use IanS's solution:

l = (df['files'].apply(len) == 0).sum()
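The difference between the two variants only shows up with missing values. A small sketch, assuming a hypothetical Series with one NaN in it (the data and variable names are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one non-empty list, two empty lists, one missing value.
files = pd.Series([[{'url': 'a'}], [], np.nan, []])

# .str.len() gives NaN for the missing entry, and NaN == 0 is False,
# so the missing row is simply not counted as empty.
n_empty = (files.str.len() == 0).sum()

# apply(len) calls len() on the NaN float, which raises TypeError.
try:
    files.apply(len)
    apply_ok = True
except TypeError:
    apply_ok = False
```

So str.len is the safer choice when the column may contain NaN, while apply(len) is fine for a column that holds only lists.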
jezrael answered Jan 24 '26 00:01


If you're looking for empty lists, why use value_counts?

len([i for i in df.files if len(i) == 0])
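The same count can be had without building an intermediate list. A rough sketch on made-up data, relying on the fact that an empty list is falsy; the sample Series and names are assumptions, not the real df.files:

```python
import pandas as pd

# Hypothetical stand-in for df.files.
files = pd.Series([[{'url': 'a'}], [], [], [{'url': 'b'}]])

# Generator version of the list comprehension: no temporary list is built.
n_empty_loop = sum(1 for item in files if len(item) == 0)

# bool([]) is False, so subtracting the truthy rows leaves the empty ones.
n_empty_bool = len(files) - int(files.map(bool).sum())
```

Both assume every entry is a list; a NaN entry would break len() in the first form and count as non-empty in the second.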
A.Kot answered Jan 23 '26 23:01