2014-04-16 55 views
4

我有一个A*x = B形式的方程组,其中[A]是一个三对角线系数矩阵。使用Numpy求解器numpy.linalg.solve我可以求解x的方程组。优化三对角线系数矩阵的A * x = B解

请参阅下面的示例以了解如何开发三角形[A] martix。该{B}载体,解决了x

# Solve system of equations with a tridiagonal coefficient matrix 
# uses numpy.linalg.solve 

# use Python 3 print function 
from __future__ import print_function 
from __future__ import division 

# modules 
import numpy as np 
import time 

ti = time.clock() 

#---- Build [A] array and {B} column vector 

m = 1000 # size of array, make this 8000 to see time benefits 

A = np.zeros((m, m))  # pre-allocate [A] array 
B = np.zeros((m, 1))  # pre-allocate {B} column vector 

A[0, 0] = 1 
A[0, 1] = 2 
B[0, 0] = 1 

for i in range(1, m-1): 
    A[i, i-1] = 7 # node-1 
    A[i, i] = 8  # node 
    A[i, i+1] = 9 # node+1 
    B[i, 0] = 2 

A[m-1, m-2] = 3 
A[m-1, m-1] = 4 
B[m-1, 0] = 3 

print('A \n', A) 
print('B \n', B) 

#---- Solve using numpy.linalg.solve 

x = np.linalg.solve(A, B)  # solve A*x = B for x 

print('x \n', x) 

#---- Elapsed time for each approach 

print('NUMPY time', time.clock()-ti, 'seconds') 

所以我的问题涉及到上面的例子中的两个部分:

  1. 因为我处理了[A]一个角矩阵,也称为带状矩阵,有没有更有效的方法来解决方程组而不是使用numpy.linalg.solve
  2. 此外,有没有更好的方法来创建三角对角矩阵,而不是使用for-loop

上述示例根据time.clock()函数在Linux上运行约0.08 seconds

numpy.linalg.solve功能工作正常,但我试图找到一种方法,在进一步加快解决方案的希望采取的[A]的三对角形式的优势,然后将该方法更加复杂的例子。

+1

你的意思是像scipy.linalg.solve_banded()? –

+0

@CraigJCopi'scipy.linalg.solve_banded()'需要LU元组。计算LU元组然后求解solve_banded会更快吗? – wigging

+0

您可以在这里使用Thomas算法,这可能会更快。维基百科有一个实现http://en.wikipedia.org/wiki/Tridiagonal_matrix_algorithm#Python – y300

回答

2

有两种立即性能改进(1)不使用循环,(2)使用scipy.linalg.solve_banded()

我会写代码的东西更像

import scipy.linalg as la 

# Create arrays and set values 
ab = np.zeros((3,m)) 
b = 2*ones(m) 
ab[0] = 9 
ab[1] = 8 
ab[2] = 7 

# Fix end points 
ab[0,1] = 2 
ab[1,0] = 1 
ab[1,-1] = 4 
ab[2,-2] = 3 
b[0] = 1 
b[-1] = 3 

return la.solve_banded ((1,1),ab,b) 

可能有更优雅的方式来构建矩阵,但这个工程。

ipython中使用%timeit原始码花了112毫秒为m = 1000。这个代码在m = 10,000时需要2.94 ms,这是一个数量级更大的问题,但仍然快了两个数量级!我没有耐心等待m = 10,000的原始代码。原来的大部分时间可能在构建阵列,我没有测试这个。无论如何,对于大型数组来说,只存储矩阵的非零值会更有效率。

1

有一个scipy.sparse矩阵类型,称为scipy.sparse.dia_matrix,它很好地捕获了矩阵的结构(它将存储3个数组,在“位置”0(对角线),1(上方)和-1(下方))。使用这种类型的矩阵,你可以尝试scipy.sparse.linalg.lsqr来解决。如果你的问题有一个确切的解决方案,它会被发现,否则它会找到最小二乘意义的解决方案。

from scipy import sparse 
A_sparse = sparse.dia_matrix(A) 
ret_values = sparse.linalg.lsqr(A_sparse, C) 
x = ret_values[0] 

然而,这可能不是在剥削triadiagonal结构方面完全优化,有可能使这一快的理论方法。这种转换对你的作用是将矩阵乘法开销减少到必要条件:仅使用3个频段。这与迭代求解器lsqr的组合应该已经产生加速。

注意:我并不是主张scipy.sparse.linalg.spsolve,因为你的矩阵转换为csr格式。但是,用spsolve代替lsqr值得一试,尤其是因为spsolve可以绑定UMFPACK,请参阅相关doc on spsolve。另外,请看this stackoverflow question and answer relating to UMFPACK

+0

我找不到模块'scipy.sparse.lsqr'。你是不是指'scipy.sparse.linalg.lsqr'? – wigging

+0

哦,是的,当然是'scipy.sparse.linalg.lsqr'。对不起,感谢您的输入。 (编辑) – eickenberg

+0

如果有帮助,我用更好的例子更新了我的问题。 – wigging

1

您可以使用scipy.linalg.solveh_banded

编辑:你为你的矩阵不是对称的,我认为这是CAN NOT使用上述。然而,如上面的评论中提到的,Thomas算法是为这个伟大的

a =  [7] * (m - 2) + [3] 
b = [1] + [8] * (m - 2) + [4] 
c = [2] + [9] * (m - 2) 
d = [1] + [2] * (m - 2) + [3] 

# This is taken directly from the Wikipedia page also cited above 
# this overwrites b and d 
def TDMASolve(a, b, c, d): 
    n = len(d) # n is the numbers of rows, a and c has length n-1 
    for i in xrange(n-1): 
     d[i+1] -= 1. * d[i] * a[i]/b[i] 
     b[i+1] -= 1. * c[i] * a[i]/b[i] 
    for i in reversed(xrange(n-1)): 
     d[i] -= d[i+1] * c[i]/b[i+1] 
    return [d[i]/b[i] for i in xrange(n)] 

此代码不能优化也不会使用np,但如果我(或任何这里的其他精细人的)有时间,我会编辑它,以便它做到这些。目前,m = 10000时约为10毫秒。

+0

如果有帮助,我用一个更好的例子更新了我的问题。 – wigging

0

这可能会帮助 有一个函数creates_tridiagonal将创建三对角矩阵。还有另一个函数可以按照SciPy solve_banded函数的要求将矩阵转换为对角有序形式。

import numpy as np  

def lu_decomp3(a): 
    """ 
    c,d,e = lu_decomp3(a). 
    LU decomposition of tridiagonal matrix a = [c\d\e]. On output 
    {c},{d} and {e} are the diagonals of the decomposed matrix a. 
    """ 
    n = np.diagonal(a).size 
    assert(np.all(a.shape ==(n,n))) # check if square matrix 

    d = np.copy(np.diagonal(a)) # without copy (assignment destination is read-only) error is raised 
    e = np.copy(np.diagonal(a, 1)) 
    c = np.copy(np.diagonal(a, -1)) 

    for k in range(1,n): 
     lam = c[k-1]/d[k-1] 
     d[k] = d[k] - lam*e[k-1] 
     c[k-1] = lam 
    return c,d,e 

def lu_solve3(c,d,e,b): 
    """ 
    x = lu_solve(c,d,e,b). 
    Solves [c\d\e]{x} = {b}, where {c}, {d} and {e} are the 
    vectors returned from lu_decomp3. 
    """ 
    n = len(d) 
    y = np.zeros_like(b) 

    y[0] = b[0] 
    for k in range(1,n): 
     y[k] = b[k] - c[k-1]*y[k-1] 

    x = np.zeros_like(b) 
    x[n-1] = y[n-1]/d[n-1] # there is no x[n] out of range 
    for k in range(n-2,-1,-1): 
     x[k] = (y[k] - e[k]*x[k+1])/d[k] 
    return x 

from scipy.sparse import diags 
def create_tridiagonal(size = 4): 
    diag = np.random.randn(size)*100 
    diag_pos1 = np.random.randn(size-1)*10 
    diag_neg1 = np.random.randn(size-1)*10 

    a = diags([diag_neg1, diag, diag_pos1], offsets=[-1, 0, 1],shape=(size,size)).todense() 
    return a 

a = create_tridiagonal(4) 
b = np.random.randn(4)*10 

print('matrix a is\n = {} \n\n and vector b is \n {}'.format(a, b)) 

c, d, e = lu_decomp3(a) 
x = lu_solve3(c, d, e, b) 

print("x from our function is {}".format(x)) 

print("check is answer correct ({})".format(np.allclose(np.dot(a, x), b))) 


## Test Scipy 
from scipy.linalg import solve_banded 

def diagonal_form(a, upper = 1, lower= 1): 
    """ 
    a is a numpy square matrix 
    this function converts a square matrix to diagonal ordered form 
    returned matrix in ab shape which can be used directly for scipy.linalg.solve_banded 
    """ 
    n = a.shape[1] 
    assert(np.all(a.shape ==(n,n))) 

    ab = np.zeros((2*n-1, n)) 

    for i in range(n): 
     ab[i,(n-1)-i:] = np.diagonal(a,(n-1)-i) 

    for i in range(n-1): 
     ab[(2*n-2)-i,:i+1] = np.diagonal(a,i-(n-1)) 


    mid_row_inx = int(ab.shape[0]/2) 
    upper_rows = [mid_row_inx - i for i in range(1, upper+1)] 
    upper_rows.reverse() 
    upper_rows.append(mid_row_inx) 
    lower_rows = [mid_row_inx + i for i in range(1, lower+1)] 
    keep_rows = upper_rows+lower_rows 
    ab = ab[keep_rows,:] 


    return ab 

ab = diagonal_form(a, upper=1, lower=1) # for tridiagonal matrix upper and lower = 1 

x_sp = solve_banded((1,1), ab, b) 
print("is our answer the same as scipy answer ({})".format(np.allclose(x, x_sp)))