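Converting a specific HTML structure to a specific XML structure
I have been struggling with this for quite a while now. I want to convert HTML to XML; the structure is shown below.
I am using HtmlAgilityPack to turn the HTML into a well-formed XML structure. After that step, my HTML looks like this: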
<div class="menuItem1" video="" preview="">
    Menu 1
    <div class="subMenu1">
        <div class="menuItem2" video="" preview="">
            Menu 2
            <div class="subMenu2">
                <div class="menuItem3" video="" preview="">
                    Menu 3
                    <div class="subMenu3">
                        <div class="" video="" preview="">Menu 4</div>
                    </div>
                    <div class="treeExpand"></div>
                </div>
                <div class="menuItem3" video="" preview="">Menu 3</div>
                <div class="menuItem3" video="" preview="">Menu 3</div>
            </div>
            <div class="treeExpand"></div>
        </div>
    </div>
    <div class="treeExpand"></div>
</div>
<div class="menuItem1" video="" preview="">
    Menu 1
    <div class="subMenu1">
        <div class="menuItem2" video="" preview="">
            Menu 2
            <div class="subMenu2">
                <div class="menuItem3" video="" preview="">
                    Menu 3
                    <div class="subMenu3">
                        <div class="" video="" preview="">Menu 4</div>
                    </div>
                    <div class="treeExpand"></div>
                </div>
                <div class="menuItem3" video="" preview="">Menu 3</div>
                <div class="menuItem3" video="" preview="">Menu 3</div>
            </div>
            <div class="treeExpand"></div>
        </div>
    </div>
    <div class="treeExpand"></div>
</div>
That is exactly what I want. I can now load this into an XElement with the following C# code:
XDocument doc = XDocument.Parse(THE_HTML_STRING_AS_SHOWN_ABOVE);
XDocument docw = new XDocument(new XElement("Navigation", doc.Root));
XElement root = docw.Root;
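One thing to watch for here: the fragment shown above has two top-level div elements, and XDocument.Parse requires exactly one root element, so parsing that string directly would throw an XmlException. A minimal sketch of one way around this, assuming the HtmlAgilityPack output is available as a string. The FragmentParser class and WrapAndParse helper are hypothetical names used only for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FragmentParser
{
    // Hypothetical helper: wraps a fragment that may have several top-level
    // elements in a single <Navigation> root so that it parses as XML.
    public static XElement WrapAndParse(string fragment)
    {
        return XElement.Parse("<Navigation>" + fragment + "</Navigation>");
    }

    public static void Main()
    {
        // Shortened stand-in for the HTML shown above (two top-level divs).
        string fragment =
            "<div class=\"menuItem1\" video=\"\" preview=\"\">Menu 1</div>" +
            "<div class=\"menuItem1\" video=\"\" preview=\"\">Menu 1</div>";

        XElement root = WrapAndParse(fragment);

        // Both top-level divs are now children of the Navigation root.
        Console.WriteLine(root.Elements("div").Count());
    }
}
```

With this approach the wrapper element is already the Navigation root, so the second XDocument is not needed.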
I created a method to which I pass the root:
GenerateXmlFromHtml(root);
The code for this method:
private string GenerateXmlFromHtml(XElement elem)
{
    StringBuilder sbNavigationXml = new StringBuilder();
    try
    {
        //HTML will always have a video and preview, according to the generation of the html structure.
        string text = string.Empty;
        string videopath = string.Empty;
        string previewpath = string.Empty;
        XText textNode;
        foreach (XElement element in elem.Elements())
        {
            element.Name = "MenuItem"; //Change element name.
            string htmlClass;
            try { htmlClass = element.Attribute("class").Value; }
            catch { htmlClass = ""; }
            if (!string.IsNullOrEmpty(htmlClass))
            {
                if (htmlClass.Contains("subMenu"))
                {
                    element.AddBeforeSelf(element.Elements());
                    element.Remove();
                    GenerateXmlFromHtml(element);
                }
                else if (htmlClass.Contains("menuItem"))
                {
                    textNode = element.Nodes().OfType<XText>().FirstOrDefault();
                    text = textNode.Value;
                    videopath = element.Attribute("video").Value;
                    previewpath = element.Attribute("preview").Value;
                    if (element.HasElements)
                    {
                        sbNavigationXml.AppendLine("<MenuItem Text=\"" + text + "\" VideoPath=\"" + videopath + "\" PreviewPath=\"" + previewpath + "\">");
                        sbNavigationXml.AppendLine(GenerateXmlFromHtml(element));
                        sbNavigationXml.AppendLine("</MenuItem>");
                    }
                    else
                    {
                        sbNavigationXml.AppendLine("<MenuItem Text=\"" + text + "\" VideoPath=\"" + videopath + "\" PreviewPath=\"" + previewpath + "\" />");
                    }
                }
                else if (htmlClass.Contains("treeExpand"))
                {
                    element.AddBeforeSelf(element.Elements());
                    element.Remove();
                    GenerateXmlFromHtml(element);
                }
            }
            else
            {
                element.AddBeforeSelf(element.Elements());
                element.Remove();
                GenerateXmlFromHtml(element);
            }
        }
    }
    catch (Exception)
    {
        throw;
    }
    return sbNavigationXml.ToString();
}
In the end, I want to produce this XML output:
<Navigation>
    <MenuItem Text="Menu 1" VideoPath="" PreviewPath="">
        <MenuItem Text="Menu 2">
            <MenuItem Text="Menu 3">
                <MenuItem Text="Menu 4" VideoPath="" PreviewPath="" />
            </MenuItem>
            <MenuItem Text="Menu 3" />
            <MenuItem Text="Menu 3" />
        </MenuItem>
    </MenuItem>
    <MenuItem Text="Menu 1" VideoPath="" PreviewPath="">
        <MenuItem Text="Menu 2">
            <MenuItem Text="Menu 3">
                <MenuItem Text="Menu 4" VideoPath="" PreviewPath="" />
            </MenuItem>
            <MenuItem Text="Menu 3" />
            <MenuItem Text="Menu 3" />
        </MenuItem>
    </MenuItem>
</Navigation>
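In other words, the subMenu and treeExpand divs should disappear, and then I want to generate the XML. So far I am still failing miserably. Please ask if anything is unclear. Any help is appreciated!
====================================================================================================
EDIT: The fixed recursive method, for anyone who wants to see it: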
private string GenerateXmlFromHtml(XElement elem)
{
    //HTML will always have a video and preview, according to the generation of the html structure.
    StringBuilder sbNavigationXml = new StringBuilder();
    string text = string.Empty;
    string videopath = string.Empty;
    string previewpath = string.Empty;
    XText textNode;
    try
    {
        foreach (XElement element in elem.Elements())
        {
            //element.Name = "MenuItem"; //Change element name.
            string htmlClass;
            try { htmlClass = element.Attribute("class").Value; }
            catch { htmlClass = ""; }
            if (!string.IsNullOrEmpty(htmlClass))
            {
                if (htmlClass.Contains("subMenu"))
                {
                    if (element.HasElements)
                    {
                        sbNavigationXml.AppendLine(GenerateXmlFromHtml(element));
                    }
                }
                else if (htmlClass.Contains("menuItem"))
                {
                    textNode = element.Nodes().OfType<XText>().FirstOrDefault(); //Get the element's own text, used as the Text attribute value.
                    text = textNode != null ? textNode.Value.Trim() : string.Empty; //Guard against a missing text node and trim surrounding whitespace.
                    videopath = element.Attribute("video").Value; //Get node VideoPath attribute value.
                    previewpath = element.Attribute("preview").Value; //Get node PreviewPath attribute value.
                    if (element.HasElements)
                    {
                        sbNavigationXml.AppendLine("<MenuItem Text=\"" + text + "\" VideoPath=\"" + videopath + "\" PreviewPath=\"" + previewpath + "\">");
                        sbNavigationXml.AppendLine(GenerateXmlFromHtml(element));
                        sbNavigationXml.AppendLine("</MenuItem>");
                    }
                    else
                    {
                        sbNavigationXml.AppendLine("<MenuItem Text=\"" + text + "\" VideoPath=\"" + videopath + "\" PreviewPath=\"" + previewpath + "\" />");
                    }
                }
                else if (htmlClass.Contains("treeExpand"))
                {
                    //DO NOTHING
                }
            }
            else
            {
                if (element.HasElements)
                {
                    sbNavigationXml.AppendLine(GenerateXmlFromHtml(element));
                }
            }
        }
    }
    catch (Exception)
    {
        throw;
    }
    return sbNavigationXml.ToString();
}
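A caveat with the string concatenation above: attribute values are written out verbatim, so a menu text containing characters such as & or < would produce output that is not well-formed XML. A small sketch of the difference, using a made-up menu text. The EscapingDemo class and its method names are hypothetical; XElement and XAttribute are the standard LINQ to XML types, which escape attribute values when serialized:

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class EscapingDemo
{
    // Builds the element by plain concatenation, as the method above does.
    public static string ByConcatenation(string text)
    {
        return "<MenuItem Text=\"" + text + "\" />";
    }

    // Builds the same element through LINQ to XML, which escapes the value.
    public static string ByXElement(string text)
    {
        return new XElement("MenuItem", new XAttribute("Text", text)).ToString();
    }

    // Returns true if the string can be parsed as an XML element.
    public static bool IsWellFormed(string xml)
    {
        try { XElement.Parse(xml); return true; }
        catch (XmlException) { return false; }
    }

    public static void Main()
    {
        string text = "Cats & Dogs"; // hypothetical menu text, not from the original HTML

        Console.WriteLine(IsWellFormed(ByConcatenation(text))); // the raw ampersand breaks parsing
        Console.WriteLine(IsWellFormed(ByXElement(text)));      // the ampersand is emitted as &amp;
    }
}
```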
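Side note: usually people get this wrong the other way around - parsing HTML with regular expressions but still constructing XML with a proper API. Is there any reason you need to build the XML with string concatenation? – 2014-11-04 15:12:57
@AlexeiLevenkov - No, I can do whatever I want... this is just the path I took, but anything else that produces the XML output is fine, even if I have to do something completely different. – 2014-11-04 15:14:26
See [How can I build XML in C#](http://stackoverflow.com/questions/284324/how-can-i-build-xml-in-c) for guidance. – 2014-11-04 15:15:34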
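Following up on the comments above, here is a rough sketch of how the same transformation might look when the output is built with LINQ to XML instead of string concatenation. It is not the asker's code. The NavigationBuilder class and its ConvertChildren and Build methods are hypothetical names, and the rules are assumptions read off the expected output: treeExpand divs are dropped, subMenu divs are replaced by their converted children, and every other div becomes a MenuItem. That last rule also covers the div with an empty class, so a leaf such as Menu 4 is kept (the string-based method above only recurses into such a div, so a leaf with an empty class yields no output there).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NavigationBuilder
{
    // Converts the child divs of 'parent' into MenuItem elements (assumed rules, see above).
    public static IEnumerable<XElement> ConvertChildren(XElement parent)
    {
        foreach (XElement div in parent.Elements())
        {
            string cls = (string)div.Attribute("class") ?? "";

            if (cls.Contains("treeExpand"))
                continue; // dropped entirely

            if (cls.Contains("subMenu"))
            {
                // The wrapper disappears; its children move up one level.
                foreach (XElement child in ConvertChildren(div))
                    yield return child;
                continue;
            }

            // The element's own text (ignoring nested elements), trimmed of whitespace.
            string text = string.Concat(div.Nodes().OfType<XText>().Select(t => t.Value)).Trim();

            yield return new XElement("MenuItem",
                new XAttribute("Text", text),
                new XAttribute("VideoPath", (string)div.Attribute("video") ?? ""),
                new XAttribute("PreviewPath", (string)div.Attribute("preview") ?? ""),
                ConvertChildren(div));
        }
    }

    // Wraps the fragment in a temporary root so that several top-level divs can be parsed.
    public static XElement Build(string htmlFragment)
    {
        XElement wrapper = XElement.Parse("<root>" + htmlFragment + "</root>");
        return new XElement("Navigation", ConvertChildren(wrapper));
    }

    public static void Main()
    {
        // Shortened stand-in for the HTML in the question.
        string html =
            "<div class=\"menuItem1\" video=\"\" preview=\"\">Menu 1" +
                "<div class=\"subMenu1\">" +
                    "<div class=\"menuItem2\" video=\"\" preview=\"\">Menu 2</div>" +
                    "<div class=\"\" video=\"\" preview=\"\">Menu 4</div>" +
                "</div>" +
                "<div class=\"treeExpand\"></div>" +
            "</div>";

        Console.WriteLine(Build(html));
    }
}
```

Unlike the expected output in the question, this sketch writes VideoPath and PreviewPath on every MenuItem. Because the result is an XElement, attribute values are escaped on serialization and the tree can be queried or saved directly.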