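Python - Remove rows and columns from a matrix where all the values are 0
I have a matrix that I created from a DataFrame, and I want to remove every column in which all the values are 0.
I have seen examples that use dropna and df2.loc[:, (df2 != 0).any(axis=0)],
but they don't do anything to my DataFrame.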
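One thing worth ruling out, offered only as a guess at why the expression appears to do nothing: a .loc selection returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name. A minimal sketch on made-up data (toy, trimmed and the values are hypothetical, not taken from the real data):

```python
import pandas as pd

# Hypothetical toy data: "b" is the only column containing a non-zero value.
toy = pd.DataFrame({"a": [0, 0, 0], "b": [1, 0, 2], "c": [0, 0, 0]})

# The selection builds a new DataFrame; toy itself is not modified.
toy.loc[:, (toy != 0).any(axis=0)]
columns_before = list(toy.columns)

# Assigning the result keeps the filtered frame.
trimmed = toy.loc[:, (toy != 0).any(axis=0)]
columns_after = list(trimmed.columns)
```

Here columns_before still lists all three columns, while columns_after contains only "b".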
This is how I built my matrix:
# Imports the code below relies on; new_df is my original DataFrame (not shown here).
from collections import Counter
import numpy as np
import pandas as pd
a = ['Psychology','Education','Social policy','Sociology','Pol. sci. & internat. studies','Development studies','Social anthropology','Area Studies','Science and Technology Studies','Law & legal studies','Economics','Management & business studies','Human Geography','Environmental planning','Demography','Social work','Tools, technologies & methods','Linguistics','History']
final_df = new_df[new_df['Subject'].isin(a)]
ctrs = {location: Counter(gp.GrantRefNumber) for location, gp in final_df.groupby('Subject')}
ctrs = list(ctrs.items())
overlaps = [(loc1, loc2, sum(min(ctr1[k], ctr2[k]) for k in ctr1))
            for i, (loc1, ctr1) in enumerate(ctrs, start=1)
            for (loc2, ctr2) in ctrs[i:] if loc1 != loc2]
overlaps += [(l2, l1, c) for l1, l2, c in overlaps]
df22 = pd.DataFrame(overlaps, columns=['Loc1', 'Loc2', 'Count'])
df22 = df22.set_index(['Loc1', 'Loc2'])
df22 = df22.unstack().fillna(0).astype(int)
# The [-20:] slice at the end of the next line keeps only the 20 largest distinct values.
b = np.sort(np.unique(df22.values.ravel()))[-20:]
df2 = df22.where(df22.isin(b),0.0)
Interestingly (or not), when I type df2.columns, I get:
MultiIndex(levels=[[u'Count'], [u'Area Studies', u'Demography', u'Development studies', u'Economics', u'Education', u'Environmental planning', u'History', u'Human Geography', u'Law & legal studies', u'Linguistics', u'Management & business studies', u'Pol. sci. & internat. studies', u'Psychology', u'Science and Technology Studies', u'Social anthropology', u'Social policy', u'Social work', u'Sociology', u'Tools, technologies & methods']],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]],
names=[None, u'Loc2'])
That may be why I'm struggling.
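The two-level columns appear because unstack() is called on a DataFrame that still carries the single 'Count' column. Selecting that column first gives a flat column index. Whether the MultiIndex is the actual cause of the problem is not established here, so treat the following as a sketch under that assumption. The overlaps list and the names A, B, C are made up for illustration:

```python
import pandas as pd

# Hypothetical overlaps in the same (Loc1, Loc2, Count) shape as the question.
overlaps = [("A", "B", 2), ("B", "A", 2),
            ("A", "C", 0), ("C", "A", 0),
            ("B", "C", 0), ("C", "B", 0)]
long_df = pd.DataFrame(overlaps, columns=["Loc1", "Loc2", "Count"])
long_df = long_df.set_index(["Loc1", "Loc2"])

# Unstacking the whole DataFrame keeps 'Count' as an extra column level.
wide_multi = long_df.unstack().fillna(0).astype(int)

# Unstacking the 'Count' Series gives plain, single-level columns.
wide_flat = long_df["Count"].unstack().fillna(0).astype(int)

# Keep only rows and columns that contain at least one non-zero value.
nonzero = wide_flat != 0
trimmed = wide_flat.loc[nonzero.any(axis=1), nonzero.any(axis=0)]
```

In this toy case C overlaps with nothing, so both its row and its column are dropped from trimmed.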
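Hey Jezrael, I just tried both of your examples, and they remove everything from the DataFrame, leaving only 'Loc1' and a list (i.e. they remove all the numbers and the column headers?). I have numbers in some of the columns, so it shouldn't remove all of them. – ScoutEU
I'm reading the requirement as removing the columns where all values are 0 - this will remove the ones where any value is 0 ... –
Yep, I mean the columns where all the values are 0 :) – ScoutEU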
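The any/all distinction raised in the comments can be sketched on a small made-up frame (demo and its column names are hypothetical). With the mask df != 0, any(axis=0) keeps columns that have at least one non-zero value, while all(axis=0) keeps only columns that have no zeros at all:

```python
import pandas as pd

# Hypothetical data: one all-zero column, one with a single zero, one without zeros.
demo = pd.DataFrame({"all_zero": [0, 0], "some_zero": [0, 3], "no_zero": [4, 5]})

nonzero = demo != 0

# Drops only the columns where every value is 0.
drop_all_zero_cols = demo.loc[:, nonzero.any(axis=0)]

# Drops every column that contains a 0 anywhere.
drop_any_zero_cols = demo.loc[:, nonzero.all(axis=0)]
```

If a filter removes nearly everything from a sparse matrix, the second form (or an equivalent) may be what was applied.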