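Efficient way to avoid looping over a Pandas DataFrame

I am converting an Excel spreadsheet to Python in order to automate and speed up several tasks. I need to add several columns to a DataFrame and fill them with data based on the values in an earlier column. I have it working with two nested for loops, but it is really slow, and I know Pandas is not designed for cell-by-cell work. Here is a sample of my problem: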
import pandas as pd

results = pd.DataFrame({'scores': [78.5, 91.0, 103.5], 'outcomes': [1, 0, 1]})
thresholds = [103.5, 98.5, 93.5, 88.5, 83.5, 78.5]
for threshold in thresholds:
    # One new column per threshold, initialised to 0
    results[str(threshold)] = 0
    for index, row in results.iterrows():
        if row['scores'] > threshold:
            # .at replaces DataFrame.set_value, which was removed in pandas 1.0
            results.at[index, str(threshold)] = row['outcomes']
print(results)
And the correct output:
   outcomes  scores  103.5  98.5  93.5  88.5  83.5  78.5
0         1    78.5      0     0     0     0     0     0
1         0    91.0      0     0     0     0     0     0
2         1   103.5      0     1     1     1     1     1
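What is a more efficient way to do this? I have been trying to transpose the DataFrame so that I can work by columns instead of rows, but I couldn't get anything to work. Thanks for your help!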
http://stackoverflow.com/questions/43398468/rounding-to-specific-numbers-in-python-3-6/43398652#43398652 – Serge
http://stackoverflow.com/questions/14947909/python-checking-to-which-bin-a-value-belong?noredirect=1&lq=1 – Serge
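
One possible way to drop both loops, offered as a sketch rather than the definitive answer: compare all scores against all thresholds in a single NumPy broadcast, then multiply the resulting boolean mask by the outcomes. The names `above` and `flags` are made up for this illustration; `results` and `thresholds` are the ones from the question. It assumes the rule stays exactly as in the loop version (copy `outcomes` where `scores > threshold`, otherwise 0).

```python
import numpy as np
import pandas as pd

results = pd.DataFrame({'scores': [78.5, 91.0, 103.5], 'outcomes': [1, 0, 1]})
thresholds = [103.5, 98.5, 93.5, 88.5, 83.5, 78.5]

# Compare every score with every threshold at once.
# Shape: (number of rows, number of thresholds), dtype bool.
above = results['scores'].to_numpy()[:, None] > np.array(thresholds)

# Keep the outcome where the score exceeds the threshold, 0 elsewhere.
flags = pd.DataFrame(
    above * results['outcomes'].to_numpy()[:, None],
    index=results.index,
    columns=[str(t) for t in thresholds],
)

# Attach the new threshold columns to the original frame.
results = pd.concat([results, flags], axis=1)
print(results)
```

The comparison is strict, so a score equal to a threshold (103.5 against the 103.5 column) still yields 0, the same as in the loop version. With many rows and thresholds the intermediate mask holds rows × thresholds values, so memory use is the trade-off for removing the Python-level loops.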