What is the error in the C# string handling below?
<h5>Sample Document </h5>
<h3> Present Tense </h3>
</p><p>The present tense is just as you have learned. You take the dictionary form of a verb, drop the 다, add the appropriate ending.
</p><p>먹다 - 먹 + 어요 = 먹어요 <br />
마시다 - 마시 + 어요 - 마시어요 - 마셔요. <br />
</p><p>This tense is used to represent what happens in the present. I eat. I drink. It is a general term for the present.
Consider the code below, to which I pass a MemoryStream. The program contains three functions:
Main
ReadDocument
TestByteOffSet
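Sample HTML file
The Main function converts the HTML document specified above into a MemoryStream and passes it on to the ReadDocument function, which stores the result in a variable named docContent. This is a class-level variable.
Main then uses myRange.Text to try to find its index in the given document. Once the index is found, it is stored in the intByteOffSet variable.
The third function, TestByteOffSet, then tries to verify that the index stored in intByteOffSet is correct.
This is where I have a problem: when I try to get the string back from the byte offset, I do not receive the selected text.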
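To make the intended round trip concrete, here is a minimal sketch of converting a character offset into a UTF-8 byte offset and then reading the text back from the bytes. The class and method names (`ByteOffsetSketch`, `CharToByteOffset`, `TextAtByteOffset`) are hypothetical and not part of the posted program, and the sketch assumes the document is UTF-8 encoded.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the posted program.
public static class ByteOffsetSketch
{
    // Returns the number of UTF-8 bytes that precede the character at charOffset.
    public static int CharToByteOffset(string source, int charOffset)
    {
        return Encoding.UTF8.GetByteCount(source.Substring(0, charOffset));
    }

    // Decodes byteLength bytes, starting at byteOffset, back into text.
    public static string TextAtByteOffset(byte[] utf8Bytes, int byteOffset, int byteLength)
    {
        return Encoding.UTF8.GetString(utf8Bytes, byteOffset, byteLength);
    }
}
```

The round trip only returns the selected text when the byte array contains exactly the same text as the string in which the character offset was measured. In the posted code the offset is measured in `CompleteRange.htmlText`, while the bytes being tested come from the file, and those two may not be identical.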
Source code
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
namespace MultiByteStringHandling
{
    class Program
    {
        static void Main(string[] args)
        {
            FileStream fs = new FileStream(FileName, FileMode.Open);
            BinaryReader br = new BinaryReader(fs);
            byte[] bit = br.ReadBytes((int)fs.Length);
            MemoryStream Mr = new MemoryStream(bit);
            ReadDocument(Mr);
            mshtml.IHTMLTxtRange CompleteRange =
                _body.createTextRange().duplicate();
            int intByteOffset = 0;
            Regex reg = default(Regex);
            try
            {
                // Get all of the text that is in between HTML tags.
                string regSearchText = myRange.htmlText;
                string strTemp = regSearchText + "\\s*";
                string strExp = ">(([^<])*?)" + strTemp + "(([^<])*?)<";
                string _cleanedSource = "";
                _cleanedSource = CompleteRange.htmlText;
                // Use regular expressions to find a collection of matches
                // that match a certain pattern.
                foreach (Match m in Regex.Matches(_cleanedSource, strExp,
                    RegexOptions.IgnoreCase))
                {
                    Int32 ret = default(Int32);
                    Int32 index = default(Int32);
                    string strMatch = m.Value;
                    foreach (Match m2 in Regex.Matches(strMatch, strTemp,
                        RegexOptions.IgnoreCase))
                    {
                        // Increment counter when finding a match.
                        intCount += 1;
                        // If counter matches occurrence number, return
                        // source offset.
                        if (intCount == OccurenceNo)
                        {
                            // Source offset is the index of the overall
                            // match + index innerText Match.
                            int intCharOffset = m.Index + m2.Index;
                            System.Text.UTF8Encoding d = new
                                System.Text.UTF8Encoding();
                            // Using the SourceText will give an accurate
                            // byte offset.
                            intByteOffset = d.GetBytes(
                                _cleanedSource.Substring(0, intCharOffset)).Length;
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                throw ex;
            }
            finally
            {
            }
        }
        private void ReadDocument(Stream sD)
        {
            System.IO.MemoryStream ms = new System.IO.MemoryStream();
            System.IO.BinaryWriter bw = new System.IO.BinaryWriter(ms);
            bool hasMore = true;
            sD.Position = 0;
            using (System.IO.BinaryReader br = new System.IO.BinaryReader(sD))
            {
                while (hasMore)
                {
                    byte[] buffer = br.ReadBytes(8192);
                    hasMore = buffer.Length > 0;
                    if (hasMore)
                    {
                        bw.Write(buffer);
                    }
                }
            }
            byte[] docBuffer = ms.GetBuffer();
            docContent = new byte[docBuffer.Length + 1];
            Array.Copy(docBuffer, docContent, docBuffer.Length);
        }
        private bool TestByteOffset(TransparencyItemType transparency)
        {
            System.Text.UTF8Encoding encoding = default(System.Text.UTF8Encoding);
            string byteOffsetLabel = null;
            Int32 iLength = default(Int32);
            Int32 offset = default(Int32);
            if (((transparency.Label == null) == false))
            {
                iLength = Convert.ToInt32(transparency.Label.IEOffset.Length);
                offset = Convert.ToInt32(transparency.Label.IEOffset.Offset);
            }
            else if (((transparency.Value == null) == false))
            {
                if(transparency.Value.ByteOffset!=null)
                {
                    if (transparency.Value.ByteOffset.Offset != -1)
                    {
                        iLength = Convert.ToInt32(transparency.Value.ByteOffset.Length);
                        offset = Convert.ToInt32(transparency.Value.ByteOffset.Offset);
                    }
                }
            }
            else
            {
                return false;
            }
        }
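So you get a Stream representing the file, read it into a byte array, create a new stream from that byte array, and pass that stream to ReadDocument, which converts it into a byte array again. Why not simply pass the first FileStream to ReadDocument? Or better still, read the contents of the FileStream into a string and work on that string? – 2009-11-13 08:53:25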
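A minimal sketch of that suggestion, assuming the file is UTF-8 encoded and a framework version that provides `Stream.CopyTo`. The class and method names (`DocumentLoadingSketch`, `ReadAllBytes`, `ReadAllText`) are hypothetical and do not appear in the posted code.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the posted program.
public static class DocumentLoadingSketch
{
    // Copies everything from the stream into an exactly sized byte array.
    public static byte[] ReadAllBytes(Stream source)
    {
        if (source.CanSeek)
        {
            source.Position = 0;
        }
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            // ToArray returns only the bytes that were written. GetBuffer would
            // return the whole internal buffer, including unused capacity.
            return buffer.ToArray();
        }
    }

    // Reads the whole stream as text. UTF-8 is an assumption here.
    // Note that disposing the reader also closes the source stream.
    public static string ReadAllText(Stream source)
    {
        using (var reader = new StreamReader(source, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With a helper like this, the `FileStream` opened in Main could be handed over directly, without the intermediate `BinaryReader` and `MemoryStream`.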
I see what you mean, but I think I am doing the same thing, just not in the proper way. – Sandhurst 2009-11-13 08:57:32
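Start by removing the empty 'catch' that you have. You can put a 'throw;' in it to rethrow the exception. An empty catch silently swallows any exception, which means you do not know whether the code is working, and you do not know where or why a problem occurs. – Guffa 2009-11-13 09:00:36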
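A minimal sketch of the rethrow pattern described in this comment. The class and method names (`RethrowSketch`, `Process`) are hypothetical, and logging to `Console.Error` is only one possible way to report the failure.

```csharp
using System;

// Hypothetical helper, not part of the posted program.
public static class RethrowSketch
{
    // Runs the given work, reports any failure, and lets the exception propagate.
    public static void Process(Action work)
    {
        try
        {
            work();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("Processing failed: " + ex.Message);
            // A bare throw rethrows the same exception and keeps its original
            // stack trace, whereas "throw ex;" resets the stack trace.
            throw;
        }
    }
}
```

If the catch block does nothing other than rethrow, it is usually simpler to drop the try/catch entirely.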