2017-06-14 18 views
1

numpy arange implementation on a pandas DataFrame

I have a DataFrame like this:

import pandas as pd 
import numpy as np 


df = pd.DataFrame({'a': [0, 0.5, 0.2],
                   'b': [1, 1, 0.3]})
print(df)
     a    b
0  0.0  1.0
1  0.5  1.0
2  0.2  0.3

I want to generate a Series that looks like this:

pd.Series ([np.arange (start = 0, stop = 1, step = 0.1), 
np.arange (start = 0.5, stop = 1, step = 0.1), 
np.arange (start = 0.2, stop = 0.3, step = 0.1)]) 

0 [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, ... 
1       [0.5, 0.6, 0.7, 0.8, 0.9] 
2            [0.2] 
dtype: object 

I tried to do this with a lambda function and got an error like this:

foo = lambda x: np.arange(start = x.a, stop = x.b, step = 0.1) 
print (df.apply(foo, axis =1)) 

ValueError: Shape of passed values is (3, 10), indices imply (3, 2) 

I don't know what this means. Is there a better/correct way to do this?

Answers

2

Use itertuples with the Series constructor:

s = pd.Series([np.arange(x.a, x.b, .1) for x in df.itertuples()], index=df.index) 
print (s) 
0 [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, ... 
1       [0.5, 0.6, 0.7, 0.8, 0.9] 
2            [0.2] 
dtype: object 

Alternatively, with iterrows:

s = pd.Series([np.arange(x.a, x.b, .1) for i, x in df.iterrows()], index=df.index) 
print (s) 
0 [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, ... 
1       [0.5, 0.6, 0.7, 0.8, 0.9] 
2            [0.2] 
dtype: object 
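
For context on the loops above: itertuples yields one namedtuple per row, while iterrows yields an (index label, Series) pair, which is why both allow the x.a / x.b attribute access. The following is a small sketch that reuses the question's df and shows what each iterator hands back; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 0.5, 0.2],
                   'b': [1, 1, 0.3]})

# itertuples yields a namedtuple per row; its first field is the index.
first_tuple = next(df.itertuples())
print(first_tuple._fields)

# iterrows yields (index label, row as a Series) pairs.
first_label, first_row = next(df.iterrows())
print(type(first_row))

# Both iterators supply the same start/stop values to np.arange.
from_tuples = [np.arange(x.a, x.b, .1) for x in df.itertuples()]
from_rows = [np.arange(x.a, x.b, .1) for _, x in df.iterrows()]
print([len(r) for r in from_tuples])
```

Because iterrows builds a Series for every row, itertuples is usually the lighter of the two on larger frames.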

With apply, it works only if you convert the result to a tuple:

foo = lambda x: tuple(np.arange(start = x.a, stop = x.b, step = 0.1)) 
print (df.apply(foo, axis = 1)) 
0 (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, ... 
1       (0.5, 0.6, 0.7, 0.8, 0.9) 
2            (0.2,) 
dtype: object 
+0

tolist() works and removes the error. However, it converts the float64 dtype and gives imprecise numbers – nitin
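
On the imprecision mentioned in the comment: it is probably not introduced by the conversion itself. np.arange with a fractional step already produces values carrying floating-point rounding error; the array display shortens them, while a plain Python list prints them in full. A minimal sketch, where the rounding step at the end is an assumption of mine and not something proposed in the thread:

```python
import numpy as np

arr = np.arange(start=0, stop=1, step=0.1)

# tolist() turns each float64 into a Python float without changing its value.
as_list = arr.tolist()
print(all(x == y for x, y in zip(arr, as_list)))

# The array display is shortened; the list shows every digit.
print(arr)
print(as_list)

# One possible workaround (an assumption, not from the original answer):
# round to the precision of the step before converting.
rounded = np.round(arr, 1)
print(rounded.tolist())
```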

2

I would use a comprehension:

pd.Series([np.arange(a, b, .1) for a, b in zip(df.a, df.b)], df.index) 

0 [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, ... 
1       [0.5, 0.6, 0.7, 0.8, 0.9] 
2            [0.2] 
dtype: object
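
If the same pattern is needed for other columns or step sizes, the comprehension can be wrapped in a small function. This is only a sketch: arange_per_row and its parameter names are hypothetical and do not come from pandas or from the answers above.

```python
import numpy as np
import pandas as pd


def arange_per_row(frame, start_col, stop_col, step):
    """Hypothetical helper: one np.arange per row, keeping the frame's index."""
    ranges = [np.arange(start, stop, step)
              for start, stop in zip(frame[start_col], frame[stop_col])]
    return pd.Series(ranges, index=frame.index)


df = pd.DataFrame({'a': [0, 0.5, 0.2],
                   'b': [1, 1, 0.3]})

s = arange_per_row(df, 'a', 'b', 0.1)
# Number of elements generated for each row.
print(s.map(len))
```

Zipping the two columns avoids building a per-row object at all, which is why this variant tends to be the simplest of the approaches shown.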