2013-02-28

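I am stuck on a problem in my project and cannot get past it. I would appreciate help with a solution to this problem: getting blocks of tokens out of a string.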

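I have a string that contains some token text, and I want to pull the tokens out and put them into an array list of strings. The final result should be two array lists: one with the normal text and one with the token text. Below is a sample string containing some tokens surrounded by the opening tag "[[" and the closing tag "]]".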


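The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.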


After manipulating the string, I should get two array lists:

Result:

Normal Text ArrayList { "The first step, where the wort is prepared by mixing the starch source with hot water, is known as ", ". Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around ", ", during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as ", ". This washing allows the brewer to gather ", " the fermentable liquid from the grains as possible." } 

Token Text ArrayList { "[[Textarea]]", "[[CheckBox]]", "[[Radio]]", "[[DropDownList]]" } 

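So there are two array lists: the normal text array list has 5 elements, which are the pieces of text before or after the tokens, and the token text array list has 4 elements, which are the tokens found inside the string.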

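This could be done by cutting the string up with substring operations, but on a long text that is too hard, very error-prone, and sometimes does not give me what I want. If you can help with this problem, please post in C#, because that is what I am using for this task.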

Answers


This seems to do the job (but note that, at the moment, my tokens array contains the bare tokens rather than wrapping them in [[ ]]):

var inp = @"The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible."; 

// Requires "using System;" and "using System.Linq;" at the top of the file.
var step1 = inp.Split(new string[] { "[[" }, StringSplitOptions.None);
// step1 now holds one string that goes straight into normal, followed by n strings that need a further split.
var step2 = step1.Skip(1).Select(a => a.Split(new string[] { "]]" }, StringSplitOptions.None)).ToArray();
// step2 now holds pairs of strings: the first of each pair is a token, the second is normal text.

var normal = step1.Take(1).Concat(step2.Select(a => a[1])).ToArray();
var tokens = step2.Select(a => a[0]).ToArray();

This also assumes that the input contains no unbalanced [[ ]] sequences.
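If the tokens should keep their [[ ]] delimiters, as in the expected output in the question, and an unclosed [[ should be reported rather than assumed away, one possible variant is the index-based scan sketched below. This swaps the Split approach for IndexOf; the names TokenSplitter and SplitTokens are hypothetical, and the sketch assumes C# 7 or later for the tuple return type. A stray ]] with no opening [[ is simply left in the normal text.

```csharp
using System;
using System.Collections.Generic;

public static class TokenSplitter
{
    // Hypothetical helper: returns the normal text pieces and the tokens,
    // with each token still wrapped in its [[ ]] delimiters.
    public static (List<string> Normal, List<string> Tokens) SplitTokens(string input)
    {
        const string open = "[[";
        const string close = "]]";
        var normal = new List<string>();
        var tokens = new List<string>();
        int pos = 0;

        while (true)
        {
            int start = input.IndexOf(open, pos, StringComparison.Ordinal);
            if (start < 0)
            {
                // No more tokens: the rest of the string is normal text.
                normal.Add(input.Substring(pos));
                break;
            }

            int end = input.IndexOf(close, start + open.Length, StringComparison.Ordinal);
            if (end < 0)
            {
                // Unbalanced input: an opening [[ without a closing ]].
                throw new FormatException("Unclosed [[ at index " + start);
            }

            normal.Add(input.Substring(pos, start - pos));
            tokens.Add(input.Substring(start, end + close.Length - start));
            pos = end + close.Length;
        }

        return (normal, tokens);
    }
}
```

For n well-formed tokens this yields n + 1 normal pieces, matching the 5 and 4 elements described in the question.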

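The thinking that went into this solution: if you first split the original string around each [[ pair, then the first output string is already finished. In addition, every string after the first consists of a token, a ]] pair, and a piece of normal text. For example, the second result in step1 is: "Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around ".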

So if you then split each of these results around the ]] pair, the first result is a token and the second result is a normal string.
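The two-step reasoning above can be traced on a much shorter input. The following is a minimal sketch with a made-up sample string; the class name SplitWalkthrough and the method name Split are hypothetical, and, like the answer, it assumes the [[ ]] pairs are balanced.

```csharp
using System;
using System.Linq;

public static class SplitWalkthrough
{
    // Same two-step split as in the answer, wrapped in a method.
    public static (string[] Normal, string[] Tokens) Split(string inp)
    {
        // Step 1: split around "[[". The first piece is finished normal text;
        // every later piece has the shape token + "]]" + normal text.
        var step1 = inp.Split(new[] { "[[" }, StringSplitOptions.None);

        // Step 2: split each later piece around "]]" into { token, normal text }.
        var step2 = step1.Skip(1)
                         .Select(s => s.Split(new[] { "]]" }, StringSplitOptions.None))
                         .ToArray();

        var normal = step1.Take(1).Concat(step2.Select(p => p[1])).ToArray();
        var tokens = step2.Select(p => p[0]).ToArray();
        return (normal, tokens);
    }

    public static void Main()
    {
        // Made-up sample; step1 is { "A ", "X]] B ", "Y]] C" }.
        var result = Split("A [[X]] B [[Y]] C");
        Console.WriteLine(string.Join("|", result.Normal)); // A | B | C
        Console.WriteLine(string.Join("|", result.Tokens)); // X|Y
    }
}
```

Each piece of step1 after the first splits into exactly two parts only when every [[ has a matching ]], which is why the balance assumption matters.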


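Yes. This is amazing. This is exactly what I needed to know. It is short and simple, with little room for error. I have tested it and got the final result I needed. Thank you very much for your help. – 2013-02-28 07:32:06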


Sorry, I posted my answer before noticing that there was already an answer. But thank you very much for your help. – 2013-02-28 07:33:10