Error when using DataFrame.mul with an ndarray

I followed this post, "Get dot-product of dataframe with vector, and return dataframe, in Pandas", to use DataFrame.mul.

The code in question is:

df.mul(weight)

where weight is a 'numpy.ndarray' of shape (17L, 1L), which prints as

[[ 2.37005330e-07] 
[ 2.80515078e-07] 
[ 2.80267682e-07] 
[ 2.79124521e-07] 
[ 2.01799847e-07] 
[ 2.71495529e-07] 
[ 2.81640566e-07] 
[ 2.30099310e-07] 
[ 1.95221059e-07] 
[ 2.10244387e-07] 
[ 2.82483251e-07] 
[ 2.29050342e-07] 
[ 9.99996381e-01] 
[ 8.95340469e-08] 
[ 3.90767576e-08] 
[ 2.31231511e-07] 
[ 2.79852240e-07]]

and df is a DataFrame object of shape [20208 rows x 17 columns], which prints as something like

     12&88 17&123 .... 
modified datetime       
2015-09-07 09:19:00 1.000000 1.000000 .... 
2015-09-07 09:30:00 1.000000 1.000000 .... 
2015-09-07 09:31:00 1.000000 0.974714 .... 
2015-09-07 09:32:00 1.000000 0.978203 .... 
2015-09-07 09:33:00 1.000000 0.978203 .... 
2015-09-07 09:34:00 1.000000 0.990576 .... 
....

However, when I execute df.mul(weight), I get

ValueError: Shape of passed values is (1, 17), indices imply (17, 20208)

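I tried a simpler array of shape (17L,) and df.mul worked without any problem. So I wonder whether I should convert weight from a 2d ndarray into that kind of 1d array, but I find that hard to do. How can I convert it, or is there a better way to solve this problem? Thank you very much for your help!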

Here is my original code:

weight, means, stds = optimal_portfolio(result_framea.transpose())

c, b = test.pairs_trade(load_path, sNo_list[0])
result_frame = pd.DataFrame(index=c.index)
for i, sNo in enumerate(sNo_list):
    c, b = test.pairs_trade(load_path, sNo)
    result_frame[sNo[0] + '&' + sNo[1]] = c['returns']
df = result_frame.fillna(method='pad')

Everything works fine up to the df.mul(weight) step. Thanks again!

Source

2016-04-13 Yui Hung Cheung

You could try 'df.mul(weight, axis=0)'; basically it uses the minor axis to align on, because of the broadcasting rules – EdChum

I tried axis=0 and then axis=1 as well, and still got the same ValueError.

But when I set a new, arbitrary array-like "weight" ([1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1]), it works.

From help(pd.DataFrame.mul):

mul(self, other, axis='columns', level=None, fill_value=None) unbound pandas.core.frame.DataFrame method

Multiplication of dataframe and other , element-wise (binary operator mul).

Equivalent to dataframe * other , but with support to substitute a fill_value for missing data in one of the inputs.

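This suggests that, in the simplest case, df.mul just performs numpy-style multiplication of the corresponding arrays. So you are trying to multiply an array of shape (20208,17) with one of shape (17,1). This won't work.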

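The way array broadcasting works in numpy is that arrays with some singleton dimensions (dimensions of length 1) can be expanded automatically by numpy to match other, larger arrays in arithmetic operations. Notably, if one of the arrays has fewer dimensions than the other, leading singleton dimensions are assumed for it.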

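So, for example, arrays of the following shapes can be multiplied/added/divided/etc. together without a problem:

(1,17) and (20208,17), because the non-singleton dimensions match
(17,) and (20208,17), because the first is implicitly compatible with (1,17) (a leading singleton dimension is assumed)
(5,1,17) and (1,20208,17) (or just (20208,17))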

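The following cannot be broadcast together:

(1,16) and (20208,17), because there is a dimension mismatch
(16,) and (20208,17), because the mismatch is still there even after implicitly expanding the first one to shape (1,16)
(17,1) and (20208,17), for reasons that should now be obvious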
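
The shape rules above can be checked directly in numpy. The following is a minimal sketch: can_multiply is a hypothetical helper written for this illustration (it is not part of numpy), and it simply attempts the multiplication on arrays of ones.

```python
import numpy as np

def can_multiply(shape_a, shape_b):
    """Hypothetical helper: return the shape of the product if arrays of
    these shapes broadcast together, or None if numpy raises ValueError."""
    try:
        return (np.ones(shape_a) * np.ones(shape_b)).shape
    except ValueError:
        return None

# Shapes that broadcast together
print(can_multiply((1, 17), (20208, 17)))        # (20208, 17)
print(can_multiply((17,), (20208, 17)))          # (20208, 17)
print(can_multiply((5, 1, 17), (1, 20208, 17)))  # (5, 20208, 17)

# Shapes that do not
print(can_multiply((1, 16), (20208, 17)))        # None
print(can_multiply((16,), (20208, 17)))          # None
print(can_multiply((17, 1), (20208, 17)))        # None
```

In the last case numpy compares the trailing dimensions first: 1 against 17 is fine, but the next pair is 17 against 20208, which neither match nor contain a 1.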

The thing is that pandas shows the cryptic error message you quoted in your question:

ValueError: Shape of passed values is (1, 17), indices imply (17, 20208)

The same situation looks like this in numpy (try np.random.rand(17,1)*np.random.rand(20208,17)):

ValueError: operands could not be broadcast together with shapes (17,1) (20208,17)

The latter error is crystal clear, and would probably have saved you a lot of head-scratching.

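The solution is simple: reshape your weight array from shape (17,1) (a column vector in a 2d array) to shape (17,) (a 1d array). That shape can be broadcast with your larger array. To do this, just call reshape with a size argument of -1, which tells numpy to work out the length of the 1d array itself: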

df.mul(weight.reshape(-1))

Note that the result will have the same shape as df, but each column will be multiplied by the corresponding element of weight.
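
To see the effect on a small scale, here is a sketch with stand-in data. The 4 x 3 frame, the column names and the weight values are assumptions made up for this illustration; they replace the real 20208 x 17 frame and the (17,1) weights from the question.

```python
import numpy as np
import pandas as pd

# Stand-in data (assumed): 4 rows x 3 columns instead of 20208 x 17
df = pd.DataFrame(np.ones((4, 3)), columns=["a", "b", "c"])
weight = np.array([[0.1], [0.2], [0.7]])  # shape (3, 1), a column vector

flat = weight.reshape(-1)   # shape (3,), a 1d array
weighted = df.mul(flat)     # column i is multiplied by flat[i]

print(flat.shape)           # (3,)
print(weighted.shape)       # (4, 3)
print(weighted)
```

For an array like this one, weight.ravel() or weight[:, 0] would give the same 1d array as weight.reshape(-1).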

Source

2016-04-15 23:27:27
