2017-06-22 35 views
1

In pandas, how do I get the value_counts() of a Series that contains lists?

I have a pandas Series, df.files, which looks like this:

In [79]: df.files 
Out[79]: 
0  [{'url': 'http://www.apkmirror.com/wp-content/... 
1  [{'url': 'http://www.apkmirror.com/wp-content/... 
2  [{'url': 'http://www.apkmirror.com/wp-content/... 
3  [{'url': 'http://www.apkmirror.com/wp-content/... 
4  [{'url': 'http://www.apkmirror.com/wp-content/... 
5  [{'url': 'http://www.apkmirror.com/wp-content/... 
6  [{'url': 'http://www.apkmirror.com/wp-content/... 
7  [{'url': 'http://www.apkmirror.com/wp-content/... 
8  [{'url': 'http://www.apkmirror.com/wp-content/... 
9  [{'url': 'http://www.apkmirror.com/wp-content/... 
10  [{'url': 'http://www.apkmirror.com/wp-content/... 
11  [{'url': 'http://www.apkmirror.com/wp-content/... 
12  [{'url': 'http://www.apkmirror.com/wp-content/... 
13  [{'url': 'http://www.apkmirror.com/wp-content/... 
14  [{'url': 'http://www.apkmirror.com/wp-content/... 
15  [{'url': 'http://www.apkmirror.com/wp-content/... 
16  [{'url': 'http://www.apkmirror.com/wp-content/... 
17  [{'url': 'http://www.apkmirror.com/wp-content/... 
18  [{'url': 'http://www.apkmirror.com/wp-content/... 
19  [{'url': 'http://www.apkmirror.com/wp-content/... 
20  [{'url': 'http://www.apkmirror.com/wp-content/... 
21  [{'url': 'http://www.apkmirror.com/wp-content/... 
22  [{'url': 'http://www.apkmirror.com/wp-content/... 
23  [{'url': 'http://www.apkmirror.com/wp-content/... 
24  [{'url': 'http://www.apkmirror.com/wp-content/... 
25  [{'url': 'http://www.apkmirror.com/wp-content/... 
26  [{'url': 'http://www.apkmirror.com/wp-content/... 
27  [{'url': 'http://www.apkmirror.com/wp-content/... 
28  [{'url': 'http://www.apkmirror.com/wp-content/... 
29  [{'url': 'http://www.apkmirror.com/wp-content/... 
           ...       
16487 [{'url': 'http://www.apkmirror.com/wp-content/... 
16488             [] 
16489 [{'url': 'http://www.apkmirror.com/wp-content/... 
16490 [{'url': 'http://www.apkmirror.com/wp-content/... 
16491             [] 
16492 [{'url': 'http://www.apkmirror.com/wp-content/... 
16493 [{'url': 'http://www.apkmirror.com/wp-content/... 
16494 [{'url': 'http://www.apkmirror.com/wp-content/... 
16495             [] 
16496             [] 
16497             [] 
16498 [{'url': 'http://www.apkmirror.com/wp-content/... 
16499 [{'url': 'http://www.apkmirror.com/wp-content/... 
16500 [{'url': 'http://www.apkmirror.com/wp-content/... 
16501 [{'url': 'http://www.apkmirror.com/wp-content/... 
16502 [{'url': 'http://www.apkmirror.com/wp-content/... 
16503             [] 
16504             [] 
16505             [] 
16506             [] 
16507             [] 
16508             [] 
16509             [] 
16510             [] 
16511             [] 
16512             [] 
16513             [] 
16514             [] 
16515             [] 
16516             [] 

Some of the values are empty lists, while others contain a list holding a single dictionary, in a format like this:

In [80]: df.files.loc[0] 
Out[80]: 
[{'checksum': '9f6075f4c561792e48354277b46a6810', 
    'path': 'full/80832b9fca82ce0f58f4d23c511e5a1d657c40e8.php?id=2968', 
    'url': 'http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id=2968'}] 

I want to find out how many of the df.files entries are actually empty lists. However, if I try df.files.value_counts(), I get TypeError: unhashable type: 'list'. How can I work around this?

Answers

3

If you want to use value_counts, you can convert each list to a tuple first:

vc = df.files.apply(tuple).value_counts() 
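
One caveat worth checking before relying on this: `tuple` only makes the outer container hashable. In the question's data each non-empty list holds a dict, and dicts are unhashable as well, so the conversion may still raise a `TypeError`. The following is a minimal sketch, assuming a small made-up Series named `files` (with placeholder `example.com` URLs) standing in for `df.files`, that counts by list length instead of by list contents:

```python
import pandas as pd

# Hypothetical stand-in for df.files: lists holding one dict, or empty lists.
files = pd.Series([
    [{'url': 'http://example.com/a', 'checksum': 'abc'}],
    [],
    [{'url': 'http://example.com/b', 'checksum': 'def'}],
    [],
    [],
], dtype=object)

# A tuple that contains a dict is still unhashable.
try:
    hash(tuple(files.iloc[0]))
    tuple_is_hashable = True
except TypeError:
    tuple_is_hashable = False

# Counting list lengths avoids hashing the list contents altogether.
length_counts = files.str.len().value_counts()
empty_count = int(length_counts.get(0, 0))
print(tuple_is_hashable, empty_count)
```

Here `length_counts` is indexed by list length, so the entry for `0` is the number of empty lists.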

However, if you only need to count the empty lists, use str.len to get the length of each list, then sum the True values of the resulting boolean mask:

l = (df['files'].str.len() == 0).sum() 

If NaN values are not possible, you can use IanS's solution:

l = (df['files'].apply(len) == 0).sum() 
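
The difference between the two variants shows up only when missing values are present. This is a small sketch, assuming a made-up Series `files` with one NaN entry (not the asker's data), that illustrates why `str.len()` is the safer choice in that case:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two empty lists, one non-empty list, one missing value.
files = pd.Series([[], [{'url': 'http://example.com/a'}], np.nan, []], dtype=object)

# .str.len() yields NaN for the missing entry, and NaN == 0 is False,
# so the missing value is simply not counted as an empty list.
empty_via_str = int((files.str.len() == 0).sum())

# apply(len) calls len() on the float NaN, which raises TypeError.
try:
    files.apply(len)
    apply_len_ok = True
except TypeError:
    apply_len_ok = False

# Missing entries can be counted separately if they matter.
missing = int(files.isna().sum())
print(empty_via_str, apply_len_ok, missing)
```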
2

If you are only looking for empty lists, why use value_counts at all?

len([i for i in df.files if len(i) == 0]) 
+1

Or `(df['files'].apply(len) == 0).sum()` (which might be faster) – IanS

+0

@IanS Yes, you're right. I had never used that before. Now I know. Will –

+0

Hm, `df['files'].str.len()` might be faster still (see the other answer) – IanS
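
Which variant is actually faster depends on the data and the pandas version, so it is worth measuring rather than guessing. Below is a rough sketch using `timeit`, assuming a synthetic Series `files` of 3000 entries in which every third list is empty; the sizes and names are illustrative only:

```python
import timeit

import pandas as pd

# Synthetic data (an assumption): every third entry is an empty list.
files = pd.Series(
    [[] if i % 3 == 0 else [{'id': i}] for i in range(3000)],
    dtype=object,
)

# The three approaches discussed in this thread.
candidates = {
    'list comprehension': lambda: len([i for i in files if len(i) == 0]),
    'apply(len)': lambda: int((files.apply(len) == 0).sum()),
    'str.len()': lambda: int((files.str.len() == 0).sum()),
}

# Check that all approaches agree, then time each one.
results = {name: fn() for name, fn in candidates.items()}
timings = {name: timeit.timeit(fn, number=20) for name, fn in candidates.items()}

for name in candidates:
    print(f"{name}: {results[name]} empty lists, {timings[name]:.4f}s for 20 runs")
```

The timings will vary from machine to machine; only the counts are expected to match across the three approaches.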

0

You can also write a for loop that iterates over the lists:

count = 0  # initialise once, before the loop, so it is not reset on every iteration
for i in df.files:
    if len(i) == 0:  # an empty list has length 0
        count += 1