2015-09-22 44 views
1

Pandas Python: how do I get the values of a DataFrame at every 10th step?

I have a simple question. I have the following pandas DataFrame:

df = 
    time          lat   lon 
    0 2014-03-26 14:46:27.457233+00:00 48.7773  11.428897 
    1 2014-03-26 14:46:28.457570+00:00 48.7773  11.428719 
    2 2014-03-26 14:46:29.457665+00:00 48.7772  11.428542 
    3 2014-03-26 14:46:30.457519+00:00 48.7771  11.428368 
    4 2014-03-26 14:46:31.457855+00:00 48.7770  11.428193 
    5 2014-03-26 14:46:32.457950+00:00 48.7770  11.428018 
    6 2014-03-26 14:46:33.457794+00:00 48.7769  11.427842 
    7 2014-03-26 14:46:34.458131+00:00 48.7768  11.427668 
    8 2014-03-26 14:46:35.458246+00:00 48.7767  11.427501 
    9 2014-03-26 14:46:36.458069+00:00 48.7766  11.427350 
    10 2014-03-26 14:46:37.458416+00:00 48.7766  11.427224 
    11 2014-03-26 14:46:38.458531+00:00 48.7765  11.427129 
    12 2014-03-26 14:46:39.458355+00:00 48.7764  11.427062 
    13 2014-03-26 14:46:40.458702+00:00 48.7764  11.427011 
    14 2014-03-26 14:46:41.458807+00:00 48.7764  11.426963 
    15 2014-03-26 14:46:42.458640+00:00 48.7763  11.426918 
    16 2014-03-26 14:46:43.458977+00:00 48.7763  11.426872 
    17 2014-03-26 14:46:44.459102+00:00 48.7762  11.426822 
    18 2014-03-26 14:46:45.458926+00:00 48.7762  11.426766 
    19 2014-03-26 14:46:46.459262+00:00 48.7761  11.426702 
    20 2014-03-26 14:46:47.459378+00:00 48.7760  11.426628 

I would like to generate a new DataFrame df1 containing the values at every 10th time step.

df1 = 
     time          lat   lon 
     0  2014-03-26 14:46:27.457233+00:00 48.7773  11.428897 
     9  2014-03-26 14:46:46.459262+00:00  48.7761  11.426702 
     19  2014-03-26 14:46:46.459262+00:00 48.7765  11.426787 
     ...  ...   ...     ...  .... 
     len(df) 2014-03-26 14:46:46.459262+00:00 48.7765  11.426787 

I tried doing something like

df1 = df.iloc[[0:10:len(df)]] 
+1

I don't understand your desired output. You have indices 0, 9, 19, which goes up by 9 first and then by 10. Why not 0, 10, 20 (steps of 10) or 0, 9, 18 (steps of 9)? – DSM

+0

Your slicing approach is the right idea and almost correct: use 'df.iloc[::10]' to get every tenth row. (I strongly suggest that you *not* loop over the index.) –
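
For reference, the attempt in the question puts the slice parts in the order start:step:stop and wraps them in a list, which is not valid Python. Below is a minimal sketch of the corrected forms, assuming a made-up 21-row frame (the column name `value` is invented for illustration). It also shows the two index patterns raised in the comment above.

```python
import pandas as pd

# Hypothetical 21-row frame standing in for the one in the question.
df = pd.DataFrame({"value": range(21)})

# Slice order is start:stop:step, so the step goes last.
explicit = df.iloc[0:len(df):10]

# Omitting start and stop selects the same rows.
short = df.iloc[::10]

# A step of 9 gives the other pattern mentioned in the comment.
every_ninth = df.iloc[::9]

print(list(short.index))        # [0, 10, 20]
print(list(every_ninth.index))  # [0, 9, 18]
```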

Answers

0

How about df.loc[[i for j, i in enumerate(df.index) if j % 10 == 0]]
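
As a quick check, the sketch below compares this list comprehension with a positional step slice. It assumes a toy frame with unique string labels (the labels and the column name `value` are invented), and on that frame both approaches pick the same rows.

```python
import pandas as pd

# Hypothetical frame with a non-default, unique string index.
df = pd.DataFrame(
    {"value": range(25)},
    index=[f"row{i}" for i in range(25)],
)

# Label-based: collect every label whose position is a multiple of 10.
by_label = df.loc[[i for j, i in enumerate(df.index) if j % 10 == 0]]

# Position-based: step through the rows directly.
by_position = df.iloc[::10]

print(by_label.equals(by_position))  # True
```

One caveat: if the index contains duplicate labels, the `.loc` version can return more rows than intended, because each label matches every row that carries it. The positional slice does not have that problem.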

+0

Perfect, this is what I was looking for. Thank you very much. – emax

6

Just slice the df with iloc and pass a step param. The slicing behaviour is explained here, but basically the third parameter is the step size:

In [67]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 2))
df.iloc[::10]

Out[67]: 
      0   1 
0 0.552160 -0.910893 
10 -2.173707 -0.659227 
20 0.811937 0.675416 
30 0.533533 0.336104 
40 1.093083 -0.943157 
50 -0.559221 0.272763 
60 -0.011628 1.002561 
70 -0.114501 0.457626 
80 1.355948 0.236342 
90 -0.151979 -0.746238 
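
Applied to data shaped like the frame in the question, the same slice might look like the sketch below. The values are synthetic (one sample per second, with invented lat/lon steps), not the asker's real data, and the `reset_index` step is an optional extra for when the original row labels are not needed.

```python
import pandas as pd

# Synthetic stand-in for the question's data: 21 samples, one per second.
df = pd.DataFrame({
    "time": pd.date_range(
        "2014-03-26 14:46:27", periods=21,
        freq=pd.Timedelta(seconds=1), tz="UTC",
    ),
    "lat": [48.7773 - 0.0001 * i for i in range(21)],
    "lon": [11.4289 - 0.0001 * i for i in range(21)],
})

# Every 10th row; the original labels 0, 10, 20 are kept.
df1 = df.iloc[::10]

# Optional: renumber the result 0, 1, 2 and discard the old labels.
df1_renumbered = df1.reset_index(drop=True)

print(df1)
```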
+0

Thanks. It works too. – emax