2012-04-13

Vectorizing a script using cellfun

I want to import data from a number of folders and text files into MATLAB.

clear all
main_folder = 'E:\data';        % directory that holds the data folders
TopFolder = dir(main_folder);
% Keep only real subfolders. '.' and '..' are not guaranteed to be the
% first two entries, so filter them by name instead of using (3:end).
TopFolder = TopFolder([TopFolder.isdir] & ~ismember({TopFolder.name}, {'.', '..'}));
TopFolder = struct2cell(TopFolder);
Name1 = TopFolder(1,:);         % name of each folder
% listing of the text files in each folder
dirListing = cellfun(@(x)dir(fullfile(main_folder,x,'*.txt')),Name1,'un',0);
Variables = cellfun(@(x)struct2cell(x),dirListing,'un',0);
% name of each text file in each folder
FilesToRead = cellfun(@(x)x(1,:),Variables,'un',0);

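This gives the name of every text file in every folder inside 'main_folder'. I am now trying to load the data without using a for loop (I realize that for loops are sometimes faster, but I am aiming for a compact script).

With a for loop, the method I would use is: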

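for k = 1:length(FilesToRead)
    % full path of every text file in folder k
    filename{k} = cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k},'un',0);
    fid{k} = cellfun(@(x)fopen(x),filename{k},'un',0);
    C{k} = cellfun(@(x)textscan(x,'%f'),fid{k},'un',0);
    cellfun(@(x)fclose(x),fid{k});    % close the files once they are read
end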

Is there a way to do this that does not involve a loop at all? Is a cellfun within a cellfun possible?

Answers

folder = 'E:\data';
files = dir(fullfile(folder, '*.txt'));
full_names = strcat(folder, filesep, {files.name});   % full path of each file
fids = cellfun(@(x) fopen(x, 'r'), full_names);
assert(all(fids ~= -1), 'error in opening files');
c = arrayfun(@(x) textscan(x, '%f'), fids); % load data here
res = arrayfun(@(x) fclose(x), fids);
assert(all(res == 0), 'error in closing files');

But if the data is in CSV format, it can be even simpler:

folder = 'E:\data'; 
files = dir(fullfile(folder, '*.txt')); 
full_names = strcat(folder, filesep, {files.name}); 
c = cellfun(@(x) csvread(x), full_names, 'UniformOutput', false); 

Now all the data is stored in c.
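
The two snippets above read the .txt files that sit directly in one folder. For the layout in the question, where each subfolder of main_folder holds its own text files, the same idea can be nested: an outer cellfun over the folders and an inner cellfun over the files. A minimal sketch, assuming each file holds plain numeric data that csvread can parse; the temporary folder tree and the names siteA, siteB and run1.txt are made up so the example runs on its own:

```matlab
% --- throwaway test data (hypothetical names, for illustration only) ---
main_folder = tempname;
mkdir(main_folder);
sub_names = {'siteA', 'siteB'};
for k = 1:numel(sub_names)
    mkdir(fullfile(main_folder, sub_names{k}));
    for j = 1:2
        fid = fopen(fullfile(main_folder, sub_names{k}, sprintf('run%d.txt', j)), 'w');
        fprintf(fid, '%f\n', (1:3) * j);   % three numbers, one per line
        fclose(fid);
    end
end

% --- list the subfolders, skipping '.' and '..' by name ---
listing = dir(main_folder);
is_sub = [listing.isdir] & ~ismember({listing.name}, {'.', '..'});
folders = {listing(is_sub).name};

% --- outer cellfun: one dir listing per subfolder ---
files = cellfun(@(d) dir(fullfile(main_folder, d, '*.txt')), ...
    folders, 'UniformOutput', false);

% --- nested cellfun: full path of every file in every subfolder ---
full_names = cellfun(@(d, f) cellfun(@(n) fullfile(main_folder, d, n), ...
    {f.name}, 'UniformOutput', false), folders, files, 'UniformOutput', false);

% --- nested cellfun: c{k}{j} holds the data of file j in folder k ---
c = cellfun(@(names) cellfun(@(n) csvread(n), names, 'UniformOutput', false), ...
    full_names, 'UniformOutput', false);
```

With the sample data, c is a 1-by-2 cell (one entry per subfolder) and each c{k} is a 1-by-2 cell of 3-by-1 column vectors. The for loops only build the test files; the reading itself is loop-free.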


Yes. It is going to be pretty horrible, though, because C depends on fid, which depends on filename. The basic idea is:

feval(@(r) deal(r{:}), feval(@(filenames_fids){filenames_fids{1}, filenames_fids{2}, ...
    <compute C>}, feval(@(filenames){filenames, <compute fid>}, ...
    <compute filenames>)));
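
One detail worth checking before chaining: deal called with a single input copies that input to every output, so a cell array has to be expanded with r{:} before its elements can land in separate variables. A small sketch with toy values (spread is a hypothetical helper name, not something from the question):

```matlab
% deal with ONE input copies that input to every output
[a, b] = deal({1, 'two'});
% a and b are now both the whole 1-by-2 cell {1, 'two'}

% Expanding the cell inside an anonymous function gives deal one
% argument per output, so each variable receives its own element
spread = @(r) deal(r{:});
[p, q] = spread({1, 'two'});
% p is 1, q is 'two'

% The same feval chaining on a toy problem: the inner stage receives x
% and returns {x, x.^2}; spread then splits the pair into two variables
[x, x_squared] = feval(spread, feval(@(x) {x, x.^2}, 1:3));
```

Each feval stage takes the previous stage's result as its input and returns a cell that carries the old values forward alongside the new one, which is the only way to get "intermediate variables" inside a single expression.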

Let's start by computing the filenames:

arrayfun(@(k)cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k},...
    'un',0), 1:length(FilesToRead), 'uniformoutput', 0);

This gives us a 1-by-K cell array of filenames, one cell per folder. Now we can use it to compute the fids:

{filenames, arrayfun(@(k)cellfun(@(x)fopen(x),filenames{k},'un',0), ... 
    1:length(FilesToRead), 'uniformoutput', 0)}; 

We keep the fids together with the filenames in a 1-by-2 cell array, ready to be passed on when computing the final output:

{filenames_fids{1}, filenames_fids{2}, ...
    arrayfun(@(k)cellfun(@(x)textscan(x,'%f'), ...
    filenames_fids{2}{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}

Then we hand the final cell array over to deal, so that the results end up in three separate variables:

[filenames, fid, C] = feval(@(r) deal(r{:}), ...
    feval(@(filenames_fids){filenames_fids{1}, ...
    filenames_fids{2}, arrayfun(@(k)cellfun(@(x)textscan(x,'%f'), ...
    filenames_fids{2}{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}, ...
    feval(@(filenames){filenames, arrayfun(@(k)cellfun(@(x)fopen(x), ...
    filenames{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}, ...
    arrayfun(@(k)cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k}, ...
    'un',0), 1:length(FilesToRead), 'uniformoutput', 0))));

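Errm... there is probably a better way of doing this if you don't need to keep filenames and fid. Using cellfun instead of arrayfun might also make it more concise, but I'm not very good with cellfun, so this is what I came up with. In any case, I think the for-loop version is more compact! (Also, I haven't actually tested this, so it may need some debugging.)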
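
If only C is needed, the filenames and fids do not have to be carried through every stage, which shortens things a lot. A rough sketch of that idea, not a tested drop-in: read_and_close and read_file are hypothetical helper names, the sample folder tree is made up so the snippet runs on its own, and the helper leans on function arguments being evaluated left to right, which is an assumption worth checking on your MATLAB version.

```matlab
% --- throwaway test data (hypothetical names, for illustration only) ---
main_folder = tempname;
mkdir(main_folder);
Name1 = {'siteA', 'siteB'};                          % folder names
FilesToRead = {{'run1.txt', 'run2.txt'}, {'run1.txt', 'run2.txt'}};
for k = 1:numel(Name1)
    mkdir(fullfile(main_folder, Name1{k}));
    for j = 1:numel(FilesToRead{k})
        fid = fopen(fullfile(main_folder, Name1{k}, FilesToRead{k}{j}), 'w');
        fprintf(fid, '%f\n', (1:3) * j);             % three numbers, one per line
        fclose(fid);
    end
end
clear fid k j

% Read from an open file id, then close it, returning only the data.
% ASSUMPTION: the arguments of feval are evaluated left to right, so
% textscan finishes before fclose runs.
read_and_close = @(fid) feval(@(data, status) data{1}, ...
    textscan(fid, '%f'), fclose(fid));
read_file = @(name) read_and_close(fopen(name, 'r'));

% Outer cellfun walks the folders, inner cellfun walks the files.
C = cellfun(@(d, names) cellfun(@(n) read_file(fullfile(main_folder, d, n)), ...
    names, 'UniformOutput', false), Name1, FilesToRead, 'UniformOutput', false);
```

Unlike the loop version, C{k}{j} here is the numeric column itself rather than the 1-by-1 cell that textscan returns, and every file is closed as soon as it has been read.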


I agree, the loop version does seem to be way more compact. – Emma 2012-04-13 11:07:55