2013-04-04 49 views
2

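RSS feed description returns "<"

I want to parse RSS data from this feed: http://fulltextrssfeed.com/feeds.bbci.co.uk/news/rss.xml, which was generated with the FullTextRssFeed site. The only problem is that when I try to get the description, I receive '<'; everything else works fine! I have tried using JSoup for this, but I don't know how. Can you suggest how? The code I'm using is the same as the code in this tutorial, except that I replaced the RSS URL. Thanks again! *Here is my RSS reader in action*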

+1

"The code I'm using is the same as the code in this tutorial." That is mentioned in the second half of my question. – AndroidDev 2013-04-04 16:02:31

+0

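My mistake, I thought you said you *were* using jsoup, not that you had *tried* using jsoup. Either way, if you point the URL at the RSS feed used in the tutorial instead of your RSS feed, does it work correctly? – FoamyGuy 2013-04-04 16:04:11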

+0

Try this [link](http://www.ibm.com/developerworks/opensource/library/x-android/). I used this example to fetch RSS feeds, and it works fine. – 2013-04-04 16:06:18

Answers

1

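While looking online for ideas on how to do this, I found that doing it is actually illegal, because this method of obtaining content violates the terms of use of many of the web resources I was hoping to use. For now you will have to stick with the short RSS feeds.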

3

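Your problem arises because the description inside your RSS feed contains HTML rather than plain text. Here is the content of the description: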

<div><span class="story-date"><span class="date">3 April 2013</span> <span class="time-text">Last updated at</span> <span class="time">23:25 ET</span></span> <p><img src="http://news.bbcimg.co.uk/media/images/66739000/jpg/_66739180_philpotts.jpg" width="464" height="261" alt="Mick and Mairead Philpott, Paul Mosley"/><span class="c2">Mick and Mairead Philpott, and Paul Mosley, will be sentenced on Thursday</span></p> <p class="introduction" id="story_continues_1">A couple convicted of killing six of their children in a house fire in Derby are due to be sentenced later.</p> <p>Mick and Mairead Philpott will reappear at Nottingham Crown Court where they were found guilty of six counts of manslaughter, along with their friend Paul Mosley, on Tuesday.</p> <p>The maximum sentence for the crime is life imprisonment.</p> <p>Mrs Justice Thirlwall was due to pass sentence on Wednesday but needed more time to consider mitigation.</p> <p>The court was told that Philpott, 56, was jailed for seven years in 1978 for attempting to murder a previous girlfriend and given a concurrent five-year sentence for stabbing the woman's mother.</p> <p>In 1991 he received a conditional discharge for assault after he head-butted a colleague</p> <p>And in 2010 he was given a police caution after slapping Mairead and dragging her outside by her hair.</p> <p>When Philpott set fire to his house in Victory Road, Derby, he was also facing trial over a road rage incident in which he punched a motorist in the face.</p> <p>He had admitted common assault in relation to the incident but denied dangerous driving.</p> <span class="cross-head">Rape allegation</span> <p>Police have also confirmed that they intend to "thoroughly" investigate an allegation that Philpott raped a woman several years ago.</p> <p>She made the allegation after the death of Philpott's children, but police decided to wait until the end of the manslaughter trial before investigating the complaint further.</p> <p>On Tuesday the jury 
returned unanimous manslaughter verdicts on Philpott and Mosley, 46, while Mairead Philpott, 32, was convicted by a majority.</p> <p>Jade Philpott, 10, John, nine, Jack, eight, Jesse, six, and Jayden, five, died on the morning of the fire on 11 May 2012.</p> <p>Mairead Philpott's son from a previous relationship, 13-year-old Duwayne, died later in hospital.</p> </div><img src="http://pixel.quantserve.com/pixel/p-89EKCgBk8MZdE.gif" border="0" height="1" width="1" /> 

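You need to change the parser in some way so that it can ignore the HTML markup inside the description. Once you have the complete HTML snippet, you can render it in a WebView. I think CDATA is normally used when XML data (such as an RSS feed) contains other XML-like content (HTML in this case). Honestly, though, I'm not familiar with the ins and outs of it, so I may be wrong.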
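As a rough illustration of "changing the parser", here is a minimal sketch that assumes the feed is read with a SAX handler (the question does not confirm what the tutorial's parser looks like). A SAX parser may deliver the text of one element in several `characters()` calls, so a handler that keeps only one chunk can end up with a fragment such as "<". Appending every chunk to a buffer collects the whole description, whether the HTML is escaped or wrapped in CDATA. The class name `DescriptionCollector` and its `parse` helper are hypothetical and use only the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not the tutorial's handler): collects the complete
// text of every <description> element found in the XML.
class DescriptionCollector extends DefaultHandler {
    private final List<String> descriptions = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean inDescription = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("description".equals(qName)) {
            inDescription = true;
            buffer.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append, never overwrite.
        if (inDescription) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("description".equals(qName)) {
            inDescription = false;
            descriptions.add(buffer.toString());
        }
    }

    // Parses an XML string and returns the text of each <description>, in document order.
    static List<String> parse(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DescriptionCollector handler = new DescriptionCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.descriptions;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed XML", e);
        }
    }
}
```

Note that a real feed usually also has a channel-level `<description>`, which this sketch would collect as well. Each returned string is the full HTML fragment, which can then be rendered or reduced to plain text.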

+0

You are right about the [CDATA](http://www.w3schools.com/xml/xml_cdata.asp) part. – 2013-04-12 06:44:51

2

The HTML returned by myRssFeed.getDescription() looks like this:

<div><span class="story-date"><span class="date">6 April 2013</span> <span class="time-text">Last updated at</span> <span class="time">08:57 ET</span></span> <p><img src="http://news.bbcimg.co.uk/media/images/51606000/jpg/_51606573_fa1d16c0-9c6c-4f82-b0b8-ab66ddd94f78.jpg" width="304" height="171" alt="Breaking news"/></p> <p class="introduction">Nelson Mandela has been discharged from hospital after treatment for pneumonia, South Africa's government has said.</p> <p>It said there had been "a sustained and gradual improvement in his condition".</p> <p>The 94-year-old was admitted on 27 March for a recurring lung infection and had fluid drained at the undisclosed hospital.</p> <p>Mr Mandela served as South Africa's first black president from 1994 to 1999 and is regarded by many as the father of the nation.</p> <p>The <a href="http://redirect.viglink.com?key=11fe087258b6fc0532a5ccfc924805c0&u=http%3A%2F%2Fwww.thepresidency.gov.za%2Fpebble.asp%3Frelid%3D15178">presidency statement read</a>: "Former President Nelson Mandela has been discharged from hospital today, 6 April, following a sustained and gradual improvement in his general condition.</p> <p>"The former president will now receive home-based high care. President [Jacob] Zuma thanks the hard working medical team and hospital staff for looking after Madiba so efficiently."</p> <p>Madiba is Mr Mandela's clan name.</p> <p>The statement continued: "[Mr Zuma] also extended his gratitude to all South Africans and friends of the Republic in Africa and around the world for support."</p> </div><img src="http://pixel.quantserve.com/pixel/p-89EKCgBk8MZdE.gif" border="0" height="1" width="1" /> 

With Jsoup you could try this (untested):

Instead of

feedDescribtion.setText(myRssFeed.getDescription()); 

use this:

feedDescribtion.setText(extractDescriptionText(myRssFeed.getDescription()));

with the following method:

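// Imports needed at the top of the file:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Reduces the HTML description to plain text, one line per paragraph.
private String extractDescriptionText(String description) {
    StringBuilder b = new StringBuilder();
    Document dom = Jsoup.parse(description);
    Elements paragraphs = dom.getElementsByTag("p");
    // Start at 1 to skip the first paragraph, which only holds the 'breaking news' image.
    for (int i = 1; i < paragraphs.size(); i++) {
        Element p = paragraphs.get(i);
        b.append(p.text());
        b.append("\n"); // line break after each paragraph
    }
    return b.toString();
}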

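This should work. Some fine-tuning may be necessary, but that can easily be done with Jsoup's help.

Edit:

This is what extractDescriptionText() returns for the example above:

Nelson Mandela has been discharged from hospital after treatment for pneumonia, South Africa's government has said. It said there had been "a sustained and gradual improvement in his condition". The 94-year-old was admitted on 27 March for a recurring lung infection and had fluid drained at the undisclosed hospital. Mr Mandela served as South Africa's first black president from 1994 to 1999 and is regarded by many as the father of the nation. The presidency statement read: "Former President Nelson Mandela has been discharged from hospital today, 6 April, following a sustained and gradual improvement in his general condition. "The former president will now receive home-based high care. President [Jacob] Zuma thanks the hard working medical team and hospital staff for looking after Madiba so efficiently." Madiba is Mr Mandela's clan name. The statement continued: "[Mr Zuma] also extended his gratitude to all South Africans and friends of the Republic in Africa and around the world for support."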

+0

Have you tried this? I'm away from my machine on holiday, so I can't try it myself. – AndroidDev 2013-04-06 19:22:47

+0

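No, it is untested, but I have worked with Jsoup before and I'm pretty sure it will work. As mentioned above, some fine-tuning may be needed; for example, the example above contains an embedded link, and I'm not sure how the Element#text() method handles it. – Ridcully 2013-04-06 19:37:08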

+0

OK, it will be about 5 or 6 days until I can test this code and, hopefully, accept and reward this answer. Thanks! – AndroidDev 2013-04-06 21:10:23

1

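I would comment, but I don't have enough reputation.

I would suggest using Yahoo Pipes to redirect your RSS feed. You can even choose to have it redirected as JSON instead of XML.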

http://pipes.yahoo.com/pipes/

If your parser already works on most of the sites you have tried, this would be the easiest way to solve your problem.