2016-05-11 22 views

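Expectation-maximization algorithm: MATLAB out-of-memory error

I am implementing the expectation-maximization algorithm in MATLAB. The algorithm runs on a 214096 x 2 data matrix, and computing the probabilities involves the matrix product (214096 x 2) * (2 x 2) * (2 x 214096), which makes MATLAB run out of memory. Is there a way to work around this?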

[Image: equation]

MATLAB code:

      D = size(X,2); % dimension 
      N = size(X,1); % number of samples 
      K = 4; % number of Gaussian Mixture components (Also number of clusters) 

      % Initialization 
      p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi, probabilities of clusters, apriori probability of cluster 
      [idx,mu] = kmeans(X,K); % initial means of the components, theta is mu and variance 

      % compute the covariance of the components 
      sigma = zeros(D,D,K); 
      for k = 1:K 
       tempmat = X(idx==k,:); 
       sigma(:,:,k) = cov(tempmat); % Sigma j 
       sigma_det(k) = det(sigma(:,:,k)); 
      end 

      % calculate x-mu 
      for k=1: K 
          check=length(X(idx == k,1)) 
          for lidx = 1: length(X(idx == k,1)) 

           cidx = find(idx == k) ; 
           Xmu(cidx(lidx),:) = X(cidx(lidx),:) - mu(k,:); %(x-mu) calculation on cluster level 
          end 
      end 


      % compute P(Cj|x; theta(t)), and take log to simplified calculation 

      %Eq 14.14 denominator 
      denom = 0; 
      for k=1:K 
       calc_sigma_1_2 = sigma_det(k)^(-1/2); 
       calc_x_mu = Xmu(idx == k,:); 
       calc_sigma_inv = inv(sigma(:,:,k)); 
       calc_x_mu_tran = calc_x_mu.'; 
       factor = calc_sigma_1_2 * exp (-1/2 * calc_x_mu * calc_sigma_inv * calc_x_mu_tran ) * p(k); 

       denom = denom + factor; 
      end 


      for k =1:K 
       calc_sigma_1_2 = sigma_det(k)^(-1/2); 
       calc_x_mu = Xmu(idx == k,:); 
       calc_sigma_inv = inv(sigma(:,:,k)); 
       calc_x_mu_tran = calc_x_mu.'; 
       factor = calc_sigma_1_2 * exp (-1/2 * calc_x_mu_tran * calc_sigma_inv * calc_x_mu) * p(k); 

       pdf(k) = factor/denom; 
      end 

      %%%% Equation 14.14 ends 

Is 214096 the number of dimensions/features? – lejlot


214096 is the number of observations in each of the 2 dimensions – Umar


Do you really get a matrix with N^2 elements in the EM algorithm? That does not seem right. Why would you need a Gramian? – lejlot
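
The N^2 concern raised in the comment above can be checked with quick arithmetic. The following is a minimal sketch, assuming the intermediate is stored as a full matrix of 8-byte doubles (MATLAB's default numeric type); only the value of N comes from the question.

```matlab
% Rough memory estimate for the N x N intermediate (assumes 8-byte doubles).
N = 214096;                          % number of observations, from the question
bytesPerDouble = 8;                  % size of one double-precision value
nElements = N^2;                     % entries in a full N x N matrix
gib = nElements * bytesPerDouble / 2^30;
fprintf('N x N matrix: %.4g elements, about %.1f GiB\n', nElements, gib);

% The N x 1 vector that is actually needed is tiny by comparison.
mib = N * bytesPerDouble / 2^20;
fprintf('N x 1 vector: about %.2f MiB\n', mib);
```

Under these assumptions the N x N product needs on the order of 340 GiB, while one value per observation needs under 2 MiB, so the fix has to avoid forming the N x N matrix at all.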

Answer


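It looks like you are trying to apply a vector-based equation by simply substituting matrices for vectors, which is not how it works.

    (x - mu).' * inv(sigma) * (x - mu) 

is the squared Mahalanobis norm of (x - mu) for a single column vector x, and you want this value for each row of your matrix X. So compute

    (X - mu) * inv(sigma) =: A <- this is ok, this results in an N x d matrix 

then do an element-wise (point-wise) multiplication of A with (X - mu) rather than a matrix (dot) product, and finally sum over the second axis (columns). You end up with an N-element vector in which each element holds the squared Mahalanobis norm of the corresponding row of X.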
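
The steps above can be sketched in MATLAB as follows. This is an illustration under stated assumptions, not the asker's final code: the data are synthetic, `idx` is a random stand-in for the kmeans labels, and the names `Xc`, `A`, `m`, `W` and `R` are hypothetical. It also assumes the usual Gaussian-mixture E-step, in which every point is scored against every component, and it uses `bsxfun` so that it does not rely on implicit expansion.

```matlab
% Sketch of a memory-friendly E-step; all data and parameters are synthetic.
rng(0);                                   % reproducible example
N = 1000; d = 2; K = 4;
X = randn(N, d);                          % N observations, one per row
idx = randi(K, N, 1);                     % stand-in for the kmeans labels
p = ones(1, K) / K;                       % mixing proportions (assumed uniform)

W = zeros(N, K);                          % unnormalized weights, N x K
for k = 1:K
    mu_k = mean(X(idx == k, :), 1);       % 1 x d component mean
    Sigma_k = cov(X(idx == k, :));        % d x d component covariance
    Xc = bsxfun(@minus, X, mu_k);         % center all rows on mu_k, N x d
    A = Xc / Sigma_k;                     % N x d, equals Xc * inv(Sigma_k)
    m = sum(A .* Xc, 2);                  % N x 1 squared Mahalanobis norms
    % The (2*pi)^(-d/2) factor is omitted because it cancels on normalization.
    W(:, k) = p(k) * det(Sigma_k)^(-1/2) * exp(-0.5 * m);
end
R = bsxfun(@rdivide, W, sum(W, 2));       % N x K posteriors, each row sums to 1
```

The largest array here is N x K, so memory grows linearly with the number of observations. Using `Xc / Sigma_k` instead of an explicit `inv` is a design choice for numerical stability; the result is mathematically the same product.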


Thanks a lot @lejlot, I am fixing this now – Umar