2016-11-26

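How to reconstruct the original matrix from its SVD components with Spark

I want to reconstruct (approximate) the original matrix that was decomposed with SVD. Is there a way to do this without converting the V factor from a local Matrix to a DenseMatrix?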

Here is the decomposition, based on the documentation (note that the comments come from the doc example):

import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.mllib.linalg.SingularValueDecomposition
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

val data = Array(
    Vectors.dense(1.0, 0.0, 7.0, 0.0, 0.0), 
    Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0), 
    Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0)) 

val dataRDD = sc.parallelize(data, 2) 

val mat: RowMatrix = new RowMatrix(dataRDD) 

// Compute the top 5 singular values and corresponding singular vectors. 
val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(5, computeU = true) 
val U: RowMatrix = svd.U // The U factor is a RowMatrix. 
val s: Vector = svd.s // The singular values are stored in a local dense vector. 
val V: Matrix = svd.V // The V factor is a local dense matrix. 

To reconstruct the original matrix, I have to compute U * diag(s) * transpose(V).
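
For reference, the product above is the truncated SVD written out with its dimensions. The symbols m (rows), n (columns) and k (number of singular values actually returned) are introduced here only for illustration and do not appear in the code:

```latex
A \;\approx\; A_k \;=\; U_k \,\operatorname{diag}(s)\, V_k^{\top},
\qquad
U_k \in \mathbb{R}^{m \times k},\quad
\operatorname{diag}(s) \in \mathbb{R}^{k \times k},\quad
V_k \in \mathbb{R}^{n \times k}
```

In the example, m = 3 and n = 5, so the matrix has rank at most 3. When k equals the rank, the product reproduces the original matrix up to floating-point error.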

First, convert the singular value vector s into a diagonal matrix S:

import org.apache.spark.mllib.linalg.Matrices 
val S = Matrices.diag(s) 

However, when I try to compute U * diag(s) * transpose(V):

val dataApprox = U.multiply(S.multiply(V.transpose)) 

I get the following error:

error: type mismatch; found: org.apache.spark.mllib.linalg.Matrix required: org.apache.spark.mllib.linalg.DenseMatrix

It works if I convert the Matrix V to a DenseMatrix Vdense:

import org.apache.spark.mllib.linalg.DenseMatrix 
val Vdense = new DenseMatrix(V.numRows, V.numCols, V.toArray) 
val dataApprox = U.multiply(S.multiply(Vdense.transpose)) 

Is there a way to get the approximation dataApprox of the original matrix from the svd output without this conversion?

Answer

The following code worked for me:

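import scala.collection.mutable.ListBuffer
import org.apache.spark.mllib.linalg.DenseMatrix

// numTopSingularValues = number of features (singular values) used for the SVD.
// It was not defined in the original snippet; taking it from s is an assumption.
val numTopSingularValues = s.size
val latentFeatureArray = s.toArray

// Build the column-major values of diag(s) in a ListBuffer
val denseMatListBuffer = ListBuffer.empty[Double]
val zeroListBuffer = ListBuffer.empty[Double]
var addZeroIndex = 0
while (addZeroIndex < numTopSingularValues) {
  zeroListBuffer += 0.0D
  addZeroIndex += 1
}

// Each diagonal element is followed by numTopSingularValues zeros,
// which places the next element on the diagonal of the next column
var addDiagElemIndex = 0
while (addDiagElemIndex < (numTopSingularValues - 1)) {
  denseMatListBuffer += latentFeatureArray(addDiagElemIndex)
  denseMatListBuffer ++= zeroListBuffer
  addDiagElemIndex += 1
}
denseMatListBuffer += latentFeatureArray(numTopSingularValues - 1)

val sDenseMatrix = new DenseMatrix(numTopSingularValues, numTopSingularValues, denseMatListBuffer.toArray)

// V * S, then transpose: (V * S)^T = S * V^T because S is diagonal
val vMultiplyS = V.multiply(sDenseMatrix)

val postMulWithUDenseMat = vMultiplyS.transpose

val dataApprox = U.multiply(postMulWithUDenseMat)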
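
A shorter variant of the same idea is sketched below. The type mismatch in the question comes from Matrix.multiply, which is declared to take a DenseMatrix, while V.transpose is statically typed as Matrix. RowMatrix.multiply is declared to take a Matrix, so grouping the product as (U * diag(s)) * transpose(V) should avoid the conversion. The helper name reconstruct is hypothetical and not part of Spark. The sketch also assumes that RowMatrix.multiply accepts a transposed dense matrix at runtime, as the call with vMultiplyS.transpose above suggests.

```scala
import org.apache.spark.mllib.linalg.{Matrices, Matrix, SingularValueDecomposition}
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical helper (not a Spark API): rebuilds U * diag(s) * V^T from an SVD result.
def reconstruct(svd: SingularValueDecomposition[RowMatrix, Matrix]): RowMatrix = {
  // U is only populated when computeSVD was called with computeU = true
  require(svd.U != null, "computeSVD must be called with computeU = true")
  val S: Matrix = Matrices.diag(svd.s) // k x k diagonal matrix of singular values
  svd.U                                // m x k, distributed
    .multiply(S)                       // scales column j of U by s(j); still m x k
    .multiply(svd.V.transpose)         // (m x k) * (k x n) gives the m x n approximation
}
```

With the svd value from the question, this would be used as val dataApprox = reconstruct(svd), and dataApprox.rows.collect().foreach(println) prints the reconstructed rows on the driver for a small matrix like the one in the example.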