2017-06-23 171 views

Pandas: create new columns based on the unique values of a column

I have a DataFrame that looks like this:

zipcode  room_type
2011     bed
2012     sofa

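Each row represents one Airbnb listing. I want to aggregate the data so that the occurrences of every unique room_type value are counted. Each unique value should get its own column, with the data grouped by zipcode. The result would look like this: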

zipcode  bed  sofa  ground
1011     200    36      20
1012     720    45      89

How can I get this result with pandas?

Answer

I have done this using indexing and reshaping:

from pandas import DataFrame, Series

df = DataFrame({'zipcode': [20110, 20110, 20111, 20111, 20111], 'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa']})
df.set_index(['zipcode', 'room_type'], inplace=True) 
df 

zipcode  room_type
20110    bed
         sofa
20111    bed
         bed
         sofa

# count the values and generate a new dataframe 
df2 = DataFrame(df.index.value_counts(), columns=['count']) 
df2.reset_index(inplace=True) 
df2 

           index  count
0   (20111, bed)      2
1   (20110, bed)      1
2  (20111, sofa)      1
3  (20110, sofa)      1

# split the tuple into new columns 
df2[['zipcode', 'room_type']] = df2['index'].apply(Series) 
df2.drop('index', axis=1, inplace=True) 

# reshape 
df2.pivot(index='zipcode', columns='room_type', values='count') 

room_type  bed  sofa
zipcode
20110        1     1
20111        2     1
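
For comparison, the same table can probably be reached more directly. The following is only a sketch, not the approach used above: it reuses the sample data from this answer and relies on `pd.crosstab`, with a `groupby`/`size`/`unstack` variant next to it. The names `counts`, `counts_gb` and `result` are made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'zipcode': [20110, 20110, 20111, 20111, 20111],
    'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa'],
})

# One row per zipcode, one column per unique room_type; cells hold the counts.
counts = pd.crosstab(df['zipcode'], df['room_type'])

# Alternative spelling: count each (zipcode, room_type) pair, then move
# room_type into the columns, filling missing combinations with 0.
counts_gb = df.groupby(['zipcode', 'room_type']).size().unstack(fill_value=0)

# Turn zipcode back into an ordinary column, as in the layout from the question.
result = counts.reset_index()
result.columns.name = None
print(result)
```

Both variants skip the intermediate step of splitting tuples into columns. Combinations that never occur come out as 0 here, whereas `pivot` in the answer above would leave them as NaN.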