2017-10-08 77 views

Value of a specific key in a dictionary using lambda?

I have a products array that looks like the table below:

+---------------------------+--------------------------------+--------------------------------+
| name                      | review                         | word_count                     |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'and': 5, 'wipes': 1,         |
| Planetwise                | These flannel wipes are OK,    | 'stink': 1, 'because' : 2, ... |
| Flannel Wipes             | but in my opinion ...          |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'and': 3, 'love': 1,          |
| Planetwise                | it came early and was not      | 'it': 2, 'highly': 1, ...      |
| Wipes Pouch               | disappointed. i love ...       |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'shop': 1, 'noble': 1,        |
|                           |                                | 'is': 1, 'it': 1, 'as': ...    |
| A Tale of Baby's Days     | Lovely book, it's bound        |                                |
| with Peter Rabbit ...     | tightly so you may no ...      |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+

Basically, the word_count column contains a dictionary (key : value) of how often each word occurs in the sentence in the review column.

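Now I want to create a new column named and that holds the value stored under the key and in the word_count dictionary: if and exists as a key in word_count, the column should contain that value; if it does not exist as a key, it should contain 0.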

For the 3 rows above, the new and column would look like this:

+------------+
| and        |
+------------+
|            |
| 5          |
|            |
|            |
+------------+
|            |
| 3          |
|            |
|            |
+------------+
|            |
| 0          |
|            |
|            |
+------------+

I wrote this code and it works fine:

def wordcount(x):
    if 'and' in x:
        return x['and']
    else:
        return 0

products['and'] = products['word_count'].apply(wordcount)

My question: is there a way to do this using a lambda?

What I have done so far is:

products['and'] = products['word_count'].apply(lambda x : 'and' in x.keys()); 

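This only returns a column of 0s and 1s. What can I add to the line above so that products['and'] contains the value stored under the key and whenever that key exists in products['word_count']?

I am using ipython notebook and graphlab.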
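
To see why the attempt yields only 0s and 1s: the in operator evaluates to a boolean, not to the stored count. Here is a minimal sketch in plain Python, with two made-up dictionaries standing in for rows of word_count (the names rows, attempt and flags are illustrative and not part of the original code):

```python
# Two made-up dictionaries standing in for rows of the word_count column.
rows = [{"and": 5, "wipes": 1}, {"shop": 1, "noble": 1}]

# The membership test answers "is the key present?", not "what is its value?".
attempt = lambda x: 'and' in x.keys()

flags = [attempt(row) for row in rows]
print(flags)                    # [True, False]
print([int(f) for f in flags])  # [1, 0]
```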

Answers

1

You have the right idea. Just return the value x['and'] if the key exists, otherwise 0.

For example:

import pandas as pd

data = {"word_count": [{"foo": 1, "and": 5},
                       {"foo": 1}]}
df = pd.DataFrame(data)
df.word_count.apply(lambda x: x['and'] if 'and' in x.keys() else 0)

Output:

0 5 
1 0 
Name: word_count, dtype: int64 
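
As a quick sanity check that does not need pandas, the conditional-expression lambda can be compared with the wordcount function from the question on a few hand-made dictionaries. This is only a sketch; the names and_count and samples are made up for illustration:

```python
def wordcount(x):
    # The named function from the question.
    if 'and' in x:
        return x['and']
    else:
        return 0

# The conditional-expression lambda used with apply above.
and_count = lambda x: x['and'] if 'and' in x.keys() else 0

# Made-up sample dictionaries: with the key, without it, and empty.
samples = [{"foo": 1, "and": 5}, {"foo": 1}, {}]

results = [and_count(s) for s in samples]
print(results)                                     # [5, 0, 0]
print(results == [wordcount(s) for s in samples])  # True
```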
1

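I don't know what products['word_count'].apply(wordcount) does, but going by the rest of your question, you could do something like the following with a lambda, treating products['word_count'] as a single dictionary:

products['and'] = (
    lambda p: p['word_count']['and'] if 'and' in p['word_count'] else 0)(products)

This is rather ugly and awkward, so I would suggest using the built-in dictionary get() method instead, because it is shorter, faster, and easier to debug and maintain:

products['and'] = products['word_count'].get('and', 0)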
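
Note that get() is a method of a single dictionary. If each row holds its own dictionary, as in the question's table, get() has to be called once per row. Here is a minimal sketch of that idea in plain Python, using a made-up list in place of the column (word_counts and and_column are illustrative names). With a column object, the same lambda could presumably be passed to apply, as the question already does with wordcount:

```python
# Made-up stand-in for the word_count column: one dictionary per review.
word_counts = [
    {"and": 5, "wipes": 1, "stink": 1},
    {"and": 3, "love": 1, "it": 2},
    {"shop": 1, "noble": 1, "is": 1},
]

# dict.get returns the stored value, or the default (0) when the key is missing.
and_column = list(map(lambda wc: wc.get("and", 0), word_counts))
print(and_column)  # [5, 3, 0]

# Assumed equivalent for a column of dictionaries, as in the question:
# products['and'] = products['word_count'].apply(lambda wc: wc.get('and', 0))
```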

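Your fixation on using lambda reminds me of the so-called Law of the Instrument: "... it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail".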