2016-06-30 128 views

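Pandas str.contains: return the match

Used as shown below, pandas str.contains() returns the rows for which it evaluates to True. But how can I return the match instead of the row?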

In [1]: df 

language          level
java programming  beginner
c/c++             intermediate
php               beginner

In [2]: df[df['language'].str.contains("java|php|python")==True] 

language          level
java programming  beginner
php               beginner

In [3]: # but it should also return the match, not just the row:
language          level     matched_skill
java programming  beginner  java
php               beginner  php

In [4]: df[['matched_skill']] 

java 
php 

The '== True' is almost certainly unnecessary. – IanS
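To illustrate the comment above: str.contains() already yields a boolean mask, so it can index the frame directly. The sketch below rebuilds the question's frame as an assumption about its contents, and passes na=False so that missing values in the column, if any, count as non-matches.

```python
import pandas as pd

# Assumed reconstruction of the question's DataFrame.
df = pd.DataFrame({
    "language": ["java programming", "c/c++", "php"],
    "level": ["beginner", "intermediate", "beginner"],
})

# The mask is already boolean; na=False treats missing values as "no match".
mask = df["language"].str.contains("java|php|python", na=False)
filtered = df[mask]
print(filtered)
```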

Answer


You can use str.extract to capture the match, then remove the rows that contain NaN with dropna:

df['matched_skill'] = df['language'].str.extract("(java|php|python)", expand=False) 
print(df)
           language         level matched_skill
0  java programming      beginner          java
1             c/c++  intermediate           NaN
2               php      beginner           php

df.dropna(subset=['matched_skill'], inplace=True)
print(df)
           language     level matched_skill
0  java programming  beginner          java
2               php  beginner           php
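For completeness, here is a self-contained sketch of the same approach that can be pasted into a fresh session. The DataFrame is rebuilt from the question's output (an assumption about the original data), and the names pattern and result are introduced only for illustration. It assigns the result of dropna instead of using inplace=True, which leaves the original frame untouched.

```python
import pandas as pd

# Assumed reconstruction of the question's DataFrame.
df = pd.DataFrame({
    "language": ["java programming", "c/c++", "php"],
    "level": ["beginner", "intermediate", "beginner"],
})

# One capture group: str.extract returns the first match per row, or NaN.
pattern = "(java|php|python)"
df["matched_skill"] = df["language"].str.extract(pattern, expand=False)

# Keep only the rows where something matched.
result = df.dropna(subset=["matched_skill"])
print(result[["matched_skill"]])
```

Note that str.extract only returns the first match in each row; if a row could contain several skills, str.findall with the same alternation is probably the closer fit.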