除了内存方面的考虑,我相信你可以在einsum
的单个调用中完成收缩,不过你需要一些预处理。我并不完全确定“”是什么意思,因为我收缩了一对指数,结果是一个新的张量,它不属于列表L
“,但我认为在单一步骤中收缩会完全解决这个问题。
我建议使用可替代的einsum
数字索引语法:
einsum(op0, sublist0, op1, sublist1, ..., [sublistout])
所以你需要做的是编码指数在整数序列收缩。首先,您需要首先设置一系列唯一索引,并将其他副本保留为sublistout
。然后,遍历您的连接图,您需要在必要时将合同指数设置为相同的指数,并同时从sublistout
中删除合同指数。
import numpy as np
def contract_all(tensors,conns):
'''
Contract the tensors inside the list tensors
according to the connectivities in conns
Example input:
tensors = [np.random.rand(2,3),np.random.rand(3,4,5),np.random.rand(3,4)]
conns = [((0,1),(2,0)), ((1,1),(2,1))]
returned shape in this case is (2,3,5)
'''
ndims = [t.ndim for t in tensors]
totdims = sum(ndims)
dims0 = np.arange(totdims)
# keep track of sublistout throughout
sublistout = set(dims0.tolist())
# cut up the index array according to tensors
# (throw away empty list at the end)
inds = np.split(dims0,np.cumsum(ndims))[:-1]
# we also need to convert to a list, otherwise einsum chokes
inds = [ind.tolist() for ind in inds]
# if there were no contractions, we'd call
# np.einsum(*zip(tensors,inds),sublistout)
# instead we need to loop over the connectivity graph
# and manipulate the indices
for (m,i),(n,j) in conns:
# tensors[m][i] contracted with tensors[n][j]
# remove the old indices from sublistout which is a set
sublistout -= {inds[m][i],inds[n][j]}
# contract the indices
inds[n][j] = inds[m][i]
# zip and flatten the tensors and indices
args = [subarg for arg in zip(tensors,inds) for subarg in arg]
# assuming there are no multiple contractions, we're done here
return np.einsum(*args,sublistout)
一个简单的例子:
>>> tensors = [np.random.rand(2,3), np.random.rand(4,3)]
>>> conns = [((0,1),(1,1))]
>>> contract_all(tensors,conns)
array([[ 1.51970003, 1.06482209, 1.61478989, 1.86329518],
[ 1.16334367, 0.60125945, 1.00275992, 1.43578448]])
>>> np.einsum('ij,kj',tensors[0],tensors[1])
array([[ 1.51970003, 1.06482209, 1.61478989, 1.86329518],
[ 1.16334367, 0.60125945, 1.00275992, 1.43578448]])
如果有多个收缩,循环物流变得更加复杂一点,因为我们需要处理所有的副本。然而,逻辑是一样的。此外,上述显然缺少检查,以确保相应的指数可以约定。
在事后看来,我意识到默认的sublistout
不必指定,einsum
无论如何都使用该顺序。我决定在代码中保留这个变量,因为如果我们想要一个非平凡的输出索引顺序,我们必须适当地处理这个变量,它可能会派上用场。
至于收缩顺序的优化,可以在np.einsum
影响内部优化为1.12版本(由@hpaulj在现在删除的注释说明)。该版本将optimize
可选关键字参数引入np.einsum
,允许选择缩减顺序,以内存为代价减少计算时间。由于optimize
关键字会通过'greedy'
或'optimal'
关键字使numpy选择一个缩减顺序,大小顺序为尺寸的大小。
可用于optimize
关键字来自显然无证(只要在线文档去; help()
幸运作品)的选项功能np.einsum_path
:
einsum_path(subscripts, *operands, optimize='greedy')
Evaluates the lowest cost contraction order for an einsum expression by
considering the creation of intermediate arrays.
从np.einsum_path
输出收缩路径也可以用作输入参数np.einsum
。在你的问题中,你担心使用的内存太多,所以我怀疑没有优化(默认情况下运行时间更长,内存占用更小)。
@hpaulj即使在[特定版本的文档](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.einsum.html)我只能看到提及'np.einsum_path',但没有链接...感谢'优化'提示,我完全错过了(即使我有1.12)! –
我通过'Ipython'''操作符读取大部分文档。 – hpaulj
@hpaulj that figures :)我只是恢复到当我知道我正在寻找的功能... –