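How to import numeric vectors from a complex CSV file into Matlab

I would like to know how to read a complex CSV file that contains strings, doubles, characters and so on.

For example, could you provide a command that successfully extracts the numeric values from this csv file?

Click here.

For example: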

yield curve data 2013-10-04  
Yields in percentages per annum.   


Parameters - AAA-rated bonds   
Series key Parameters Description 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1 -2.009068 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2 24.54184 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3 -21.80556 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1 5.351378 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB 
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2 4.321162 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

Error using dlmread (line 139) 
    Mismatch between file and format string. 
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) - 
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous 
    compounding - yield error minimisation - Yield curve parameters, Beta 0 

    Error in csvread (line 50) 
     m=dlmread(filename, ',', r, c, rng);

Source

2013-10-07 Cancan

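Your thanks are premature. Show us your code, your best attempt, and we may be able to help. Linking to a zip file on a dodgy website does not encourage many SO users to follow it.

Can you give us an example of how you want to parse a line? (Which data do you actually need?)

Sorry, I have just edited the question. – Cancan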

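I strongly recommend using MATLAB's "Import Data" feature (it is on the "HOME" toolbar).

Note in particular, in the screenshot, that it can also generate code for you, so that the import can be run automatically in the future.

[screenshot of the Import Data tool]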

Source

2013-10-08 08:25:44 bdecaf

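For mixed data (numbers and text), I would usually recommend the cell array option.

True, I took the screenshot with the settings that MATLAB detected automatically. I assume there is plenty that needs adjusting. – bdecaf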

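Here is a very hacky solution. Unfortunately, Matlab sucks at reading csv files, which makes this kind of hackery an unfortunate necessity. On the bright side, you will probably only need to write code like this once.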

fid = fopen('yc_latest.csv'); %// open the file 

%// parse as csv, skipping the first six lines 
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6); 

%// unpack the fields and give them meaningful names 
[seriesKey, parameters, description] = contents{:}; 

fclose(fid);     %// don't forget this!
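
Once the columns are unpacked, a single parameter can be looked up by its series key. The following is only a minimal sketch: the two sample rows are copied from the excerpt in the question so that the snippet runs on its own, and the names `isBeta0` and `beta0` are illustrative, not part of the answer above.

```matlab
% Sample values copied from the excerpt in the question; in practice
% seriesKey and parameters would come from the textscan call above.
seriesKey  = {'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0'; ...
              'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1'};
parameters = [2.03555; -2.009068];

% Logical mask marking the row whose key matches exactly
% (isBeta0 and beta0 are hypothetical names used for illustration).
isBeta0 = strcmp(seriesKey, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0');

% Pick the matching numeric value; empty if the key is not present.
beta0 = parameters(isBeta0);
```

Matching on the key rather than on a fixed row number should keep the lookup working if the rows in the file are reordered.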

Source

2013-10-07 15:03:55

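Note that you can use `textscan(..., 'HeaderLines', 6)` instead of a loop. **P.S.**: I think MATLAB parses CSV files just fine!

@EitanT Compare it with R, where the code would be `x <- read.csv("yc_latest.csv", skip = 5, header = TRUE, stringsAsFactors = FALSE)`. That is also robust to column names changing order, or to columns being added or removed (which happens a lot where I work!), whereas the Matlab solution involves extracting the header separately and matching against it. It frustrates me that there is no fully featured csv reading function built into Matlab. Good point about 'HeaderLines', though; I'll edit to include it.

Well, there is [`csvread`](http://www.mathworks.com/help/matlab/ref/csvread.html), but since this file isn't really comma-separated, you can't blame MATLAB here. That is like saying the C language sucks at reading files, which is absolute nonsense. Maybe this functionality isn't built into the language, but you can easily create an equivalent yourself. By the way, I think you can make one more improvement to the code by using '%f' in the format string for the parameters column; it will save you the trouble of calling `str2double` later.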

An alternative to Chris's solution:

fid = fopen('yc_latest.csv'); 
Rows = textscan(fid, '%s', 'delimiter', '\n'); % temporary cell array with the rows 
fclose(fid); 
Rows = Rows{1}; % unwrap: textscan returns a 1x1 cell holding the lines 

% look for the lines with a euro value: 
value = strfind(Rows, 'Euro'); 
Idx = find(~cellfun('isempty', value)); 

% for each matching line: skip the series key, read the number, ignore the rest 
Columns = cellfun(@(x) textscan(x, '%*s %f %*[^\n]'), Rows(Idx), 'UniformOutput', 0); 
Columns = cellfun(@(c) c{1}, Columns); % numeric vector, one value per matching line

The indices of all rows that actually contain a euro value are stored in Idx.

Source

2013-10-07 16:05:57

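You may want to use textscan this way.

Each line is parsed using the regular delimiters (tab, space). The format starts with %*s, where the star skips the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), then %f reads the value of interest, and finally %*[^\n] skips the rest of the line.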

fid = fopen(filename);         
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid); 

values = C{1};
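
To see how the format string treats a single line, it can be tried on text directly, since textscan also accepts a character vector in place of a file identifier. This is only a sketch: `sampleLine` is an illustrative name, and the line is a shortened copy of the first data row shown in the question.

```matlab
% One data row from the question, shortened for readability
% (sampleLine is a hypothetical name used for illustration).
sampleLine = ['YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 ' ...
              'Euro area (changing composition) - Government bond'];

% %*s      skips the series key (up to the first whitespace)
% %f       reads the numeric value
% %*[^\n]  skips everything up to the end of the line
C = textscan(sampleLine, '%*s%f%*[^\n]');

value = C{1};   % numeric value taken from the second field
```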

Source

2013-10-08 07:37:16 marsei
