2016-11-05 174 views
1

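Regex to match a word between two other words (MATLAB regexp)

I want to parse an XML file using MATLAB regular expressions. Specifically, I would like to retrieve an array of all occurrences of the word "curvepoint" that appear between "deposits" and "/deposits". So for the XML below, the result should be a [6x1] array like this: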

<curvepoint> 
<curvepoint> 
<curvepoint> 
<curvepoint> 
<curvepoint> 
<curvepoint> 

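My attempt below does not work, because a lot of other text is interspersed between each "curvepoint" occurrence and the lookahead/lookbehind anchors, and I don't know how to handle that.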

regexp(XMLText,'(?<=<deposits>)(<curvepoint>)(?=</deposits>)','match')' 

where XMLText is:

<?xml version="1.0" encoding="utf-8"?> 
<interestRateCurve> 
    <effectiveasof>2016-11-07</effectiveasof> 
    <currency>EUR</currency> 
    <baddayconvention>M</baddayconvention> 
    <deposits> 
     <daycountconvention>ACT/360</daycountconvention> 
     <snaptime>2016-11-04T15:00:00.000Z</snaptime> 
     <spotdate>2016-11-09</spotdate> 
     <calendars> 
     <calendar>none</calendar> 
     </calendars> 
     <curvepoint> 
     <tenor>1M</tenor> 
     <maturitydate>2016-12-09</maturitydate> 
     <parrate>-0.00373</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>2M</tenor> 
     <maturitydate>2017-01-09</maturitydate> 
     <parrate>-0.00339</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>3M</tenor> 
     <maturitydate>2017-02-09</maturitydate> 
     <parrate>-0.00312</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>6M</tenor> 
     <maturitydate>2017-05-09</maturitydate> 
     <parrate>-0.00213</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>9M</tenor> 
     <maturitydate>2017-08-09</maturitydate> 
     <parrate>-0.0013</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>1Y</tenor> 
     <maturitydate>2017-11-09</maturitydate> 
     <parrate>-0.00071</parrate> 
     </curvepoint> 
    </deposits> 
    <swaps> 
     <fixeddaycountconvention>30/360</fixeddaycountconvention> 
     <floatingdaycountconvention>ACT/360</floatingdaycountconvention> 
     <fixedpaymentfrequency>1Y</fixedpaymentfrequency> 
     <floatingpaymentfrequency>6M</floatingpaymentfrequency> 
     <snaptime>2016-11-04T15:00:00.000Z</snaptime> 
     <spotdate>2016-11-09</spotdate> 
     <calendars> 
     <calendar>none</calendar> 
     </calendars> 
     <curvepoint> 
     <tenor>2Y</tenor> 
     <maturitydate>2018-11-09</maturitydate> 
     <parrate>-0.00157</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>3Y</tenor> 
     <maturitydate>2019-11-09</maturitydate> 
     <parrate>-0.00115</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>4Y</tenor> 
     <maturitydate>2020-11-09</maturitydate> 
     <parrate>-0.00059</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>5Y</tenor> 
     <maturitydate>2021-11-09</maturitydate> 
     <parrate>0.00017</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>6Y</tenor> 
     <maturitydate>2022-11-09</maturitydate> 
     <parrate>0.00108</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>7Y</tenor> 
     <maturitydate>2023-11-09</maturitydate> 
     <parrate>0.0021</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>8Y</tenor> 
     <maturitydate>2024-11-09</maturitydate> 
     <parrate>0.00316</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>9Y</tenor> 
     <maturitydate>2025-11-09</maturitydate> 
     <parrate>0.00419</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>10Y</tenor> 
     <maturitydate>2026-11-09</maturitydate> 
     <parrate>0.00513</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>12Y</tenor> 
     <maturitydate>2028-11-09</maturitydate> 
     <parrate>0.00673</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>15Y</tenor> 
     <maturitydate>2031-11-09</maturitydate> 
     <parrate>0.00838</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>20Y</tenor> 
     <maturitydate>2036-11-09</maturitydate> 
     <parrate>0.00966</parrate> 
     </curvepoint> 
     <curvepoint> 
     <tenor>30Y</tenor> 
     <maturitydate>2046-11-09</maturitydate> 
     <parrate>0.01006</parrate> 
     </curvepoint> 
    </swaps> 
</interestRateCurve> 

Answer

0

Never use regular expressions to parse XML. At best, the solution will be brittle. Use a real XML parser instead.

In MATLAB, use the xmlread, xmlwrite, and xslt functions to read, write, and transform XML.

Note that the MathWorks blog has XML posts about using these functions in MATLAB.
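
As a rough illustration of the parser-based approach, the sketch below reads the XML with xmlread and walks the resulting DOM to collect only the curvepoint elements under deposits. It is a minimal sketch, not a definitive implementation: the XMLText here is a shortened stand-in for the file in the question, and the names tmpFile, tenors, and parrates are made up for the example. It also assumes the text is first written to a temporary file so that xmlread can be given a file name.

```matlab
% Shortened stand-in for the XMLText in the question (an assumption for this sketch).
XMLText = ['<?xml version="1.0" encoding="utf-8"?>' ...
    '<interestRateCurve>' ...
    '<deposits>' ...
    '<curvepoint><tenor>1M</tenor><parrate>-0.00373</parrate></curvepoint>' ...
    '<curvepoint><tenor>2M</tenor><parrate>-0.00339</parrate></curvepoint>' ...
    '</deposits>' ...
    '<swaps>' ...
    '<curvepoint><tenor>2Y</tenor><parrate>-0.00157</parrate></curvepoint>' ...
    '</swaps>' ...
    '</interestRateCurve>'];

% Write the text to a temporary file and parse it into a DOM document.
tmpFile = [tempname '.xml'];
fid = fopen(tmpFile, 'w');
fwrite(fid, XMLText);
fclose(fid);
doc = xmlread(tmpFile);
delete(tmpFile);

% Restrict the search to the first <deposits> element, so that the
% <curvepoint> elements under <swaps> are never visited.
deposits = doc.getElementsByTagName('deposits').item(0);
points = deposits.getElementsByTagName('curvepoint');

% Copy the fields of each curve point into MATLAB arrays.
n = points.getLength();
tenors = cell(n, 1);
parrates = zeros(n, 1);
for k = 1:n
    p = points.item(k - 1);   % DOM node lists are zero-based
    tenors{k} = char(p.getElementsByTagName('tenor').item(0).getTextContent());
    parrates(k) = str2double(char(p.getElementsByTagName('parrate').item(0).getTextContent()));
end
```

Because the lookup starts from the deposits node, the nesting of the document does the filtering that the lookbehind/lookahead pattern in the question was trying to express, and extra elements between the tags do not matter.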

+0

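Thanks kj, but I don't have time to get up to speed on MATLAB's XML functions (this is only a very small part of my "project"). I think a regex would only be as brittle as the logic behind it; an experienced regex user should be able to find a rigorous expression. I have already spent a lot of time looking for a specific solution to this problem. – user152112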

+0

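Do as you wish, but if you truly appreciated how ill-suited regular expressions are to parsing XML, you would not be asking this question. The risks are well documented elsewhere. I have no more time to go over them again than you have to learn to do it the right way. – kjhughes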
