Case-sensitive pandas Series matching and cleaning up pandas Series logic

I have a pandas DataFrame of fruits::
import numpy as np
import pandas as pd
df = pd.read_csv(newfile, header=None)
df
            0        1         2           3       4           5        6      7
0       Apple  Bananas       Fig  Elderberry  Cherry    Honeydew      NaN    NaN
1     Bananas   Cherry    Dragon  Elderberry     NaN         NaN      NaN    NaN
2      Cherry    Grape       NaN         NaN     NaN         NaN      NaN    NaN
3      Dragon      NaN     Apple     Bananas  Cherry  Elderberry      NaN    NaN
4  Elderberry    Apple   Bananas         Fig   Grape         NaN      NaN    NaN
5         Fig   Cherry  Honeydew       Apple     NaN         NaN      NaN    NaN
6       Grape      NaN       NaN         NaN     NaN         NaN      NaN    NaN
7    Honeydew    Grape       Fig  Elderberry  Dragon      Cherry  Bananas  Apple
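I am trying to find "fruit pairs". For example, Apple and Fig are a pair: Fig appears in the first row (Apple's row) and Apple appears in the sixth row (Fig's row). The same holds for Apple-Elderberry and Elderberry-Apple, but Apple and Bananas are not a pair, because the row that starts with Bananas does not contain Apple.
I have the code below, which works and does this::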
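To make the pairing rule concrete: two fruits form a pair when each one appears in the other's row. The following is only a minimal sketch under that reading, with the table rebuilt from inline CSV text; the helper name `find_pairs` is made up for illustration and is not from the original post.

```python
import io

import pandas as pd

# Rebuild the table shown above from inline CSV text (blank fields become NaN).
csv_text = """Apple,Bananas,Fig,Elderberry,Cherry,Honeydew,,
Bananas,Cherry,Dragon,Elderberry,,,,
Cherry,Grape,,,,,,
Dragon,,Apple,Bananas,Cherry,Elderberry,,
Elderberry,Apple,Bananas,Fig,Grape,,,
Fig,Cherry,Honeydew,Apple,,,,
Grape,,,,,,,
Honeydew,Grape,Fig,Elderberry,Dragon,Cherry,Bananas,Apple
"""
df = pd.read_csv(io.StringIO(csv_text), header=None)


def find_pairs(frame):
    """Return sorted (a, b) tuples where a's row lists b and b's row lists a."""
    # Map each fruit in the first column to the set of fruits in its row.
    listed = {
        row.iloc[0]: set(row.iloc[1:].dropna())
        for _, row in frame.iterrows()
    }
    pairs = set()
    for fruit, others in listed.items():
        for other in others:
            # Keep the pair only if the relationship is reciprocal.
            if fruit in listed.get(other, set()):
                pairs.add(tuple(sorted((fruit, other))))
    return sorted(pairs)


pairs = find_pairs(df)
print(pairs)
```

Storing each pair as a sorted tuple means Apple-Fig and Fig-Apple are counted once rather than twice.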
fruits = df[0]
stock = df.drop(0, axis=1)
for i in range(len(fruits)):
    string1 = str(fruits[i])
    full_line = (stock.iloc[i])
    line = np.array(full_line.dropna(axis=0))
    if len(line) > 0 :
        for j in range(len(stock)):
            iind = (fruits[fruits == line[j]].index[0])
            this_line = stock.iloc[iind]
            logic_out = this_line.str.match(string1)
            print(logic_out)
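BUT! (1) Because pandas Series comparisons are case sensitive, it breaks at fruits == line[j], and (2) the boolean output is a mix of True, False and NaN. Ideally, I just want to count the Trues. Any and all help is very much appreciated!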
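Both points can be illustrated on a small example. This is a hedged sketch, not a drop-in fix: `this_line` and `target` below are made-up values. Lowercasing both sides makes the equality test case-insensitive, the `==` comparison yields False for NaN, and summing a boolean Series counts the Trues. `Series.str.match` also accepts `case=False` and `na=False` if a regex match is preferred; note that it matches from the start of the string, so it is a prefix match rather than an exact one.

```python
import re

import numpy as np
import pandas as pd

# Made-up row for illustration: mixed case plus a missing value.
this_line = pd.Series(["apple", "Bananas", np.nan, "APPLE", "Fig"])
target = "Apple"

# Case-insensitive equality: lowercase both sides; NaN compares as False.
is_match = this_line.str.lower() == target.lower()
n_equal = int(is_match.sum())

# Regex alternative: case=False ignores case, na=False turns NaN into False.
# re.escape guards against regex metacharacters in the fruit name.
is_regex_match = this_line.str.match(re.escape(target), case=False, na=False)
n_regex = int(is_regex_match.sum())

print(n_equal, n_regex)
```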
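Hi @piRSquared, this looks good, but it crashes on the first line with a KeyError exception and the message '0'. ... I edited the code above to show how I read in the df, and the .csv file is below. – npross
Apple,Bananas,Fig,Elderberry,Cherry,Honeydew,, Bananas,Cherry,Dragon,Elderberry,,,, Cherry,Grape,,,,,, Dragon,Apple,,Bananas,Cherry,Elderberry, Elderberry,Apple,Bananas,Fig,Grape,,, Fig,Cherry,Honeydew,Apple,,,, Grape,,,,,,, Honeydew,Grape,Fig,Elderberry,Dragon,Cherry,Bananas,Apple – npross
What is the actual name of the first column? I assumed it was '0', because that is how it was parsed when I copied and pasted the DataFrame you provided. Try my update now. – piRSquared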
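The KeyError discussed in these comments is consistent with a label type mismatch: with `header=None`, `read_csv` labels the columns with the integers 0, 1, 2, ..., whereas a frame rebuilt from pasted output can end up with string labels such as '0'. A short sketch with made-up inline data that shows the difference, plus a positional lookup (`iloc`) that works either way:

```python
import io

import pandas as pd

# Two made-up rows standing in for the real file.
csv_text = "Apple,Fig\nFig,Apple\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

print(df.columns.tolist())  # integer labels

# Same data, but with string column labels, as can happen after copy/paste.
df_str = df.copy()
df_str.columns = df_str.columns.astype(str)

# Label lookup depends on the label type ...
int_ok = 0 in df.columns and "0" not in df.columns
str_ok = "0" in df_str.columns and 0 not in df_str.columns

# ... while positional access does not.
first_a = df.iloc[:, 0]
first_b = df_str.iloc[:, 0]
```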