Randomly change rows in pandas

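I have a pandas DataFrame in which one column holds airline names (or company names). I want to generate a "messy" dataset by changing a small fraction of the names (in that one column only) to similar but not identical names. For example, United Airlines would become UNITED AIRLINES. Here is a sample of my data: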

Description 
0 United Airlines 
1 Pinnacle Airlines Inc. 
2 Ryanair 
3 British Airways

Is there a way to apply string changes to randomly chosen rows of a pandas DataFrame? Does anyone have any ideas?

Source

2014-12-05 Peadar Coyle

You can use numpy.random.choice to get a random selection from your index. It takes a 1-D array and returns a random sample of the size you pass in:

In [177]: 

rand_indices = np.random.choice(df.index, 2) 
rand_indices.sort() 
rand_indices 
Out[177]: 
array([1, 2], dtype=int64) 
In [178]: 

df.loc[rand_indices] 
Out[178]: 
              Description  a
1  Pinnacle Airlines Inc.  1
2                 Ryanair  2
In [179]: 

def scramble_text(df, index, col):
    # Upper-case only the selected rows, modifying df in place
    df.loc[index, col] = df.loc[index, col].str.upper()

scramble_text(df, rand_indices, 'Description') 
df 
Out[179]: 
              Description  a
0         United Airlines  0
1  PINNACLE AIRLINES INC.  1
2                 RYANAIR  2
3         British Airways  3
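
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name `mess_up`, its `frac` and `seed` parameters, and the sample frame are assumptions made for illustration and are not part of the session above. It passes `replace=False` because `choice` samples with replacement by default, so the same row could otherwise be picked twice.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Description": [
        "United Airlines",
        "Pinnacle Airlines Inc.",
        "Ryanair",
        "British Airways",
    ]
})


def mess_up(df, col, frac=0.5, seed=None):
    """Return a copy of df with a random fraction of `col` upper-cased,
    plus the sorted index labels that were changed (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = max(1, int(round(frac * len(df))))
    # replace=False guarantees n distinct rows
    chosen = rng.choice(df.index.to_numpy(), size=n, replace=False)
    out = df.copy()  # leave the caller's frame untouched
    out.loc[chosen, col] = out.loc[chosen, col].str.upper()
    return out, np.sort(chosen)


messy, changed = mess_up(df, "Description", frac=0.5, seed=0)
print(changed)
print(messy)
```

Returning a copy keeps the clean data around for comparison with the messy version. Swapping `.str.upper()` for another string transformation should give other kinds of "similar but not identical" names.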

Source

2014-12-05 11:25:05 EdChum

Thanks, this is exactly what I was after. I need to learn the df.loc function better :) – 2014-12-06 13:37:21
