2017-04-24 37 views

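How do I update a specific DataFrame slice of a column with new values?

I have a DataFrame with a 'pred' column that is currently empty, and I want to fill it with some specific values. The values were originally in a NumPy array, but I put them into a Series called 'this':

print(type(predictions))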

print(predictions) 
['collection2' 'collection2' 'collection2' 'collection1' 'collection2' 
'collection1'] 

this = pd.Series(predictions, index=test_indices) 

print(type(data)) 
<class 'pandas.core.frame.DataFrame'> 

print(data.shape) 
(35, 4) 

print(data.iloc[test_indices]) 
    class   pred           text \ 
223 collection2 [] Fellow-Citizens of the Senate and House of Rep... 
20 collection1 [] The period for a new election of a citizen to ... 
12 collection1 [] Fellow Citizens of the Senate and of the House... 
13 collection1 [] Whereas combinations to defeat the execution o... 
212 collection2 [] MR. PRESIDENT AND FELLOW-CITIZENS OF NEW-YORK:... 
230 collection2 [] Fellow-Countrymen:\nAt this second appearing t... 

               title 
223        First Annual Message 
20         Farewell Address 
12     Fifth Annual Message to Congress 
13 Proclamation against Opposition to Execution o... 
212        Cooper Union Address 
230       Second Inaugural Address 

print(type(this)) 
<class 'pandas.core.series.Series'> 

print(this.shape) 
(6,) 

print(this) 
0 collection2 
1 collection1 
2 collection1 
3 collection1 
4 collection2 
5 collection2 

I thought I could do it like this:

data.iloc[test_indices, [4]] = this 

but that results in

IndexError: positional indexers are out-of-bounds 

data.ix[test_indices, ['pred']] = this 
KeyError: '[0] not in index' 

Answers

1

Try:

data.loc[data.index[test_indices], 'pred'] = this 
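
A minimal sketch of this approach on a small made-up frame (the row labels, test_indices, and values below are hypothetical, not the asker's data). It assumes test_indices holds row positions, as the data.iloc[test_indices] call in the question suggests. Because .loc aligns a Series on its index labels, the sketch assigns the raw array instead of a Series; that detour only matters when the Series index does not match the labels of the rows being updated.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: non-sequential row labels
# and a 'pred' column that holds no values yet.
data = pd.DataFrame(
    {
        "class": ["collection1", "collection2", "collection1", "collection2"],
        "pred": [None, None, None, None],
    },
    index=[20, 223, 12, 212],
)

test_indices = [1, 3]  # row positions, not labels
predictions = np.array(["collection2", "collection1"])

# Translate positions into labels, then assign by label and column name.
labels = data.index[test_indices]
data.loc[labels, "pred"] = predictions

print(data)
```

If a Series is assigned instead, its index should carry the same labels as the target rows, for example pd.Series(predictions, index=data.index[test_indices]).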
1

I prefer .ix over .loc. You can use:

data.ix[bool_series, 'pred'] = this 

Here, bool_series is a boolean Series that is True for the rows whose values you want to update and False otherwise. For example:

bool_series = ((data['col1'] > some_number) & (data['col2'] < some_other_number)) 

However, make sure the 'pred' column already exists before using data.ix[bool_series, 'pred']. Otherwise, it will raise an error.
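
Since .ix was deprecated in pandas 0.20 and removed in pandas 1.0, the same boolean-mask update can be sketched with .loc. The frame, the thresholds, and the assigned value below are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical data; 'col1' and 'col2' mirror the names used above.
data = pd.DataFrame({"col1": [1, 5, 8, 3], "col2": [9, 2, 1, 7]})
data["pred"] = None  # create the column up front

some_number, some_other_number = 2, 5  # hypothetical thresholds

# True for the rows that should be updated, False otherwise.
bool_series = (data["col1"] > some_number) & (data["col2"] < some_other_number)

# .loc accepts the boolean Series directly and updates only the True rows.
data.loc[bool_series, "pred"] = "collection1"

print(data)
```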

+0

.ix is going to be deprecated – piRSquared

+0

Oh, thanks for the update. I wasn't aware of that. –
