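Multiplying every row of one pandas MultiIndex frame with every row of another

I need to multiply two MultiIndexed frames (say df1 and df2) that share the same top-level index, so that for each top-level index value, every row of df1 is multiplied element-wise by every row of df2. I have implemented the following example, which does what I want, but it looks ugly: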
import numpy as np
import pandas as pd

a = ['alpha', 'beta']
b = ['A', 'B', 'C']
c = ['foo', 'bar']
df1 = pd.DataFrame(np.random.randn(6, 4),
                   index=pd.MultiIndex.from_product(
                       [a, b],
                       names=['greek', 'latin']),
                   columns=['C1', 'C2', 'C3', 'C4'])
df2 = pd.DataFrame(
    np.array([[1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 2, 0, 4]]),
    index=pd.MultiIndex.from_product([a, c], names=['greek', 'foobar']),
    columns=['C1', 'C2', 'C3', 'C4'])

# Collect one output row per (greek, latin, foobar) combination.
# DataFrame.append and Series.append were removed in pandas 2.0,
# so each row is built with pd.concat and gathered in a list instead.
rows = []
for i in df1.index.get_level_values('greek').unique():
    for j in df1.loc[i].index.get_level_values('latin').unique():
        for k in df2.loc[i].index.get_level_values('foobar').unique():
            rows.append(pd.concat([
                pd.Series([i, j, k], index=['greek', 'latin', 'foobar']),
                df1.loc[(i, j), :] * df2.loc[(i, k), :]]))
df3 = pd.DataFrame(
    rows, columns=['greek', 'latin', 'foobar', 'C1', 'C2', 'C3', 'C4'])
df3.set_index(['greek', 'latin', 'foobar'], inplace=True)
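As you can see, the code is very manual: the column names are spelled out by hand several times, and the index is only set at the very end. Here are the input and the output. They are correct and exactly what I want: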
df1:
                   C1        C2        C3        C4
greek latin
alpha A      0.208380  0.856373 -1.041598  1.219707
      B      1.547903 -0.001023  0.918973  1.153554
      C      0.195868  2.772840  0.060960  0.311247
beta  A      0.690405 -1.258012  0.118000 -0.346677
      B      0.488327 -1.206428  0.967658  1.198287
      C      0.420098 -0.165721  0.626893 -0.377909
df2:
              C1  C2  C3  C4
greek foobar
alpha foo      1   0   1   0
      bar      1   1   1   1
beta  foo      0   0   0   0
      bar      0   2   0   4
Result:
                          C1        C2        C3        C4
greek latin foobar
alpha A     foo     0.208380  0.000000 -1.041598  0.000000
            bar     0.208380  0.856373 -1.041598  1.219707
      B     foo     1.547903 -0.000000  0.918973  0.000000
            bar     1.547903 -0.001023  0.918973  1.153554
      C     foo     0.195868  0.000000  0.060960  0.000000
            bar     0.195868  2.772840  0.060960  0.311247
beta  A     foo     0.000000 -0.000000  0.000000 -0.000000
            bar     0.000000 -2.516025  0.000000 -1.386708
      B     foo     0.000000 -0.000000  0.000000  0.000000
            bar     0.000000 -2.412855  0.000000  4.793149
      C     foo     0.000000 -0.000000  0.000000 -0.000000
            bar     0.000000 -0.331443  0.000000 -1.511638
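
For comparison, the kind of loop-free version being asked about might look like the sketch below. It is only one possible approach, not a confirmed answer: it assumes that a many-to-many merge on the shared 'greek' level is an acceptable way to pair the rows, and the names pairs, left, right and result, the '_1'/'_2' suffixes and the fixed random seed are all made up for illustration. The order of the resulting rows is not guaranteed to match the loop version.

```python
import numpy as np
import pandas as pd

a = ['alpha', 'beta']
b = ['A', 'B', 'C']
c = ['foo', 'bar']
cols = ['C1', 'C2', 'C3', 'C4']

# Seeded generator (an assumption, only so the sketch is reproducible).
rng = np.random.default_rng(0)
df1 = pd.DataFrame(
    rng.standard_normal((6, 4)),
    index=pd.MultiIndex.from_product([a, b], names=['greek', 'latin']),
    columns=cols)
df2 = pd.DataFrame(
    np.array([[1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 2, 0, 4]]),
    index=pd.MultiIndex.from_product([a, c], names=['greek', 'foobar']),
    columns=cols)

# Pair every df1 row with every df2 row that has the same 'greek' value.
# The overlapping column names C1..C4 get the suffixes '_1' and '_2'.
pairs = df1.reset_index().merge(
    df2.reset_index(), on='greek', suffixes=('_1', '_2'))
pairs = pairs.set_index(['greek', 'latin', 'foobar'])

# Multiply the df1 half by the df2 half element-wise, column by column.
left = pairs[[f'{col}_1' for col in cols]].to_numpy()
right = pairs[[f'{col}_2' for col in cols]].to_numpy()
result = pd.DataFrame(left * right, index=pairs.index, columns=cols)

print(result)
```

The merge does the job of the three nested loops, and the column names are written only once, in cols.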
Thanks in advance!