Python pandas: append rows of a DataFrame and delete the appended rows

import pandas as pd 
df = pd.DataFrame({ 
    'id':[1,2,3,4,5,6,7,8,9,10,11], 
    'text': ['abc','zxc','qwe','asf','efe','ert','poi','wer','eer','poy','wqr']})

I have a DataFrame with the following columns:

id text 
1  abc 
2  zxc 
3  qwe 
4  asf 
5  efe 
6  ert 
7  poi 
8  wer 
9  eer 
10  poy 
11  wqr

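I have a list of ids: L = [1,3,6,10].

I want to merge the text column using this list. Take the first two values of the list, 1 and 3: append to the row with id = 1 the text of every row between them (here, the row with id 2), then delete the row with id 2. Next take 3 and 6: append the text of the rows with id = 4 and 5 to the row with id 3, then delete the rows with id = 4 and 5. Continue in the same way for every consecutive pair of elements (x, x + 1) in the list.

My final output should look like this: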

id text 
1 abczxc   # joining id 1 and 2 
3 qweasfefe  # joining id 3,4 and 5 
6 ertpoiwereer # joining id 6,7,8,9 
10 poywqr   # joining id 10 and 11
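
Read literally, the procedure pairs each entry of L with the next one and joins the text of every row whose id falls between them. A plain-loop sketch of that reading, assuming the ids are sorted and the last entry of L covers all remaining rows (the names rows, start, stop and in_range are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'text': ['abc', 'zxc', 'qwe', 'asf', 'efe', 'ert', 'poi', 'wer', 'eer', 'poy', 'wqr']})
L = [1, 3, 6, 10]

rows = []
# Pair every entry of L with its successor; the last entry has no upper bound
for start, stop in zip(L, L[1:] + [None]):
    in_range = df['id'] >= start
    if stop is not None:
        in_range = in_range & (df['id'] < stop)
    # Join the text of all rows in [start, stop) and keep only the starting id
    rows.append({'id': start, 'text': ''.join(df.loc[in_range, 'text'])})

result = pd.DataFrame(rows)
print(result)
```

This is only meant to pin down the expected result; a vectorised grouping is usually the better choice in pandas.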

Asked 2017-04-17 by Shubham

You can build a grouping Series with isin, where and ffill, then pass it to groupby and apply ''.join to the text column:

s = df.id.where(df.id.isin(L)).ffill().astype(int)
df1 = df.groupby(s)['text'].apply(''.join).reset_index()
print(df1)
   id          text
0   1        abczxc
1   3     qweasfefe
2   6  ertpoiwereer
3  10        poywqr

This works because:

s = df.id.where(df.id.isin(L)).ffill().astype(int)
print(s)
0      1
1      1
2      3
3      3
4      3
5      6
6      6
7      6
8      6
9     10
10    10
Name: id, dtype: int32
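
The grouping key can also be rebuilt one step at a time to see what each call contributes. A minimal sketch, assuming the same df and L as in the question; the intermediate names mask, kept and key are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'text': ['abc', 'zxc', 'qwe', 'asf', 'efe', 'ert', 'poi', 'wer', 'eer', 'poy', 'wqr']})
L = [1, 3, 6, 10]

# Step 1: True on the rows that start a new group (their id is in L)
mask = df['id'].isin(L)
# Step 2: keep the id on those rows, NaN everywhere else (the column becomes float)
kept = df['id'].where(mask)
# Step 3: carry the last kept id forward, then convert back to integers
key = kept.ffill().astype(int)

# Group on the key and concatenate the text within each group
result = df.groupby(key)['text'].apply(''.join).reset_index()
print(result)
```

Note that this relies on the first id being in L. If it were not, the leading rows would stay NaN after ffill and astype(int) would raise.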

Answered 2017-04-17 15:55:30 by jezrael

Sir, your earlier code `df.groupby(df.id.isin(L).cumsum())['text'].apply(''.join).reset_index().rename(columns={0:'text'})` worked fine. Why did you introduce `.ffill().astype(int)`? I mean, what does it do? – Shubham

It produced different output in the id column (see [here](http://stackoverflow.com/posts/43454862/revisions)), so I changed it. – jezrael
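
To make the difference discussed in these comments concrete, the two grouping keys can be compared side by side. A small sketch, assuming the df and L from the question; by_cumsum, by_ffill, out_cumsum and out_ffill are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'text': ['abc', 'zxc', 'qwe', 'asf', 'efe', 'ert', 'poi', 'wer', 'eer', 'poy', 'wqr']})
L = [1, 3, 6, 10]

# cumsum counts how many group starts have been seen so far: 1, 2, 3, 4
by_cumsum = df['id'].isin(L).cumsum()
# where + ffill keeps the actual starting id of each group: 1, 3, 6, 10
by_ffill = df['id'].where(df['id'].isin(L)).ffill().astype(int)

out_cumsum = df.groupby(by_cumsum)['text'].apply(''.join).reset_index()
out_ffill = df.groupby(by_ffill)['text'].apply(''.join).reset_index()

print(out_cumsum['id'].tolist())  # [1, 2, 3, 4]
print(out_ffill['id'].tolist())   # [1, 3, 6, 10]
```

Both keys split the rows into the same groups, so the joined text is identical; only the labels in the id column differ.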

Use pd.cut to create your bins, then a groupby with a lambda function to join the text within each group.

import numpy as np
df.groupby(pd.cut(df.id, L + [np.inf], right=False, labels=L)).apply(lambda x: ''.join(x.text))

Edit:

(df.groupby(pd.cut(df.id, L + [np.inf],
                   right=False,
                   labels=L))
   .apply(lambda x: ''.join(x.text))
   .reset_index()
   .rename(columns={0: 'text'}))

Output:

   id          text
0   1        abczxc
1   3     qweasfefe
2   6  ertpoiwereer
3  10        poywqr
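
To see which bin each id lands in, the pd.cut result can be inspected before grouping. A sketch assuming the question's df and L; the name bins, the use of agg in place of the lambda, and the observed=True argument are my additions, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'text': ['abc', 'zxc', 'qwe', 'asf', 'efe', 'ert', 'poi', 'wer', 'eer', 'poy', 'wqr']})
L = [1, 3, 6, 10]

# Left-closed bins [1, 3), [3, 6), [6, 10), [10, inf), each labelled by its left edge
bins = pd.cut(df['id'], L + [np.inf], right=False, labels=L)
print(bins.tolist())  # [1, 1, 3, 3, 3, 6, 6, 6, 6, 10, 10]

# observed=True only keeps categories that actually occur in the data
result = (df.groupby(bins, observed=True)['text']
            .agg(''.join)
            .reset_index())
print(result)
```

Selecting the text column before aggregating means the result column is already named text, so no rename is needed. The id column of the result is categorical here, not integer.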

Answered 2017-04-17 15:47:40

At the end of your code I added .reset_index() to convert it to a DataFrame, but it gives me a column named 0 instead of 'text'. – Shubham

@SRingne Please see the edit.

I changed the values that are not in the list to np.nan, then used ffill and groupby. Though @jezrael's approach is much better. I need to remember to use cumsum :)

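l = [1,3,6,10]
# Set ids that are not in the list to NaN without chained assignment
# (where() leaves NaN on non-matching rows, so the column becomes float)
df['id'] = df['id'].where(df['id'].isin(l))
# Forward-fill the ids, then concatenate the text within each id
df = df.ffill().groupby('id').sum()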

              text
id
1.0         abczxc
3.0      qweasfefe
6.0   ertpoiwereer
10.0        poywqr

Answered 2017-04-17 16:04:40 by Vaishali
