2011-07-15 80 views
5

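Extracting data from XML files using MATLAB

I am a complete programming beginner trying to learn MATLAB. I want to extract numerical data from a bunch of different XML files. Each numerical data item is enclosed between an opening and a closing tag. How do I write a program in MATLAB to do this?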

My algorithm:

1. Open the folder 
2. Look into each of 50 xml files, one at a time 
3. Where the tag <HNB.1></HNB.1> exists, copy numerical contents between said tag and write results into a new file 
4. The new file name given for step 3 should be the same as the initial file name read in Step 2, being appended with "_data extracted" 

For example:

FileName = Stewart.xml 
Contents = blah blah blah <HNB.1>2</HNB.1> blah blah 
NewFileName = Stewart_data extracted.txt 
Contents = 2 

Possible duplicate: http://stackoverflow.com/questions/6582250/ (extracting data between two tags in an HTML file, MATLAB) – Amro

Answers

8

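The basic function in MATLAB for reading XML data is xmlread, but if you are a complete beginner it can be tricky to work with. Try this series of blog postings, which shows you how to put it all together.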
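As a rough illustration of how xmlread could be applied to the steps in the question, here is a minimal sketch. The function name extractTagValues and the folder name in the usage line are hypothetical, and the sketch assumes every file is well-formed XML (xmlread will raise an error otherwise) and that only numeric tag contents should be written out.

```matlab
function extractTagValues(folder, tagName)
% EXTRACTTAGVALUES  Hypothetical helper, not part of MATLAB.
% For each *.xml file in FOLDER, write the numeric contents of every
% <tagName> element to "<original name>_data extracted.txt".
% Assumes each file is well-formed XML.
files = dir(fullfile(folder, '*.xml'));

for k = 1:numel(files)
    % Parse the file into a Java DOM document
    xmlDoc = xmlread(fullfile(folder, files(k).name));

    % Collect every element with the requested tag name
    nodes = xmlDoc.getElementsByTagName(tagName);

    % Build the output name from the input name without its extension
    [~, baseName] = fileparts(files(k).name);
    outFile = fullfile(folder, [baseName '_data extracted.txt']);

    fid = fopen(outFile, 'w');
    if fid == -1
        error('Could not open %s for writing.', outFile);
    end

    for n = 0:nodes.getLength() - 1      % DOM node lists are zero-based
        txt = char(nodes.item(n).getTextContent());
        value = str2double(txt);         % NaN if the text is not a number
        if ~isnan(value)
            fprintf(fid, '%g\n', value);
        end
    end
    fclose(fid);
end
end
```

A call such as extractTagValues('xmlData', 'HNB.1') would then process every XML file in a folder named xmlData. Going through the DOM rather than searching the raw text means nested or repeated tags are handled by the parser, at the cost of requiring valid XML.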

1

Suppose you want to read this file:

<PositiveSamples numImages="14">
    <image numSubRegions="2" filename="TestingScene.jpg">
        <subregion yStart="213" yEnd="683" xStart="1" xEnd="236"/>
        <subregion yStart="196" yEnd="518" xStart="65" xEnd="226"/>
    </image>
</PositiveSamples>

Then, in MATLAB, read the file contents as follows:

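% Read the XML file into a DOM document
xmlDoc = xmlread('PositiveSamples.xml');

% Get the root element (<PositiveSamples>)
root = xmlDoc.getDocumentElement();

% Read the attribute value; getAttribute returns a java.lang.String
numOfImages = root.getAttribute('numImages');
% Convert the Java string to a MATLAB char array
numOfImages = char(numOfImages);
% Convert the text to a number (str2double avoids the risks of eval)
numOfImages = uint16(str2double(numOfImages));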
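The same DOM calls can also reach the nested subregion elements of the sample file. The following is only a sketch: the function name readSubregions is hypothetical, and it assumes every subregion element carries all four attributes shown in the sample.

```matlab
function boxes = readSubregions(xmlFile)
% READSUBREGIONS  Hypothetical helper, not part of MATLAB.
% Returns one row per <subregion> element, with the columns
% [yStart yEnd xStart xEnd]. Assumes all four attributes are present.
xmlDoc = xmlread(xmlFile);
nodes = xmlDoc.getElementsByTagName('subregion');

attrNames = {'yStart', 'yEnd', 'xStart', 'xEnd'};
numNodes = nodes.getLength();
boxes = zeros(numNodes, numel(attrNames));

for k = 1:numNodes
    node = nodes.item(k - 1);            % DOM node lists are zero-based
    for c = 1:numel(attrNames)
        % getAttribute returns a Java string; convert it to a double
        boxes(k, c) = str2double(char(node.getAttribute(attrNames{c})));
    end
end
end
```

Calling boxes = readSubregions('PositiveSamples.xml') on the sample file should give a 2-by-4 matrix, one row per subregion.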