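I have the following method, which reads a txt file and returns a dictionary. Reading a ~5 MB file (67,000 lines, 70 characters per line) takes about 7 minutes. How can I speed up this code?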
    public static Dictionary<string, string> FASTAFileReadIn(string file)
    {
        Dictionary<string, string> seq = new Dictionary<string, string>();
        Regex re;
        Match m;
        GroupCollection group;
        string currentName = string.Empty;
        try
        {
            using (StreamReader sr = new StreamReader(file))
            {
                string line = string.Empty;
                while ((line = sr.ReadLine()) != null)
                {
                    if (line.StartsWith(">"))
                    {
                        // Match Sequence
                        re = new Regex(@"^>(\S+)");
                        m = re.Match(line);
                        if (m.Success)
                        {
                            group = m.Groups;
                            if (!seq.ContainsKey(group[1].Value))
                            {
                                seq.Add(group[1].Value, string.Empty);
                                currentName = group[1].Value;
                            }
                        }
                    }
                    else if (Regex.Match(line.Trim(), @"\S+").Success &&
                             currentName != string.Empty)
                    {
                        seq[currentName] += line.Trim();
                    }
                }
            }
        }
        catch (IOException e)
        {
            Console.WriteLine("An IO exception has been thrown!");
            Console.WriteLine(e.ToString());
        }
        finally { }
        return seq;
    }
Which parts of the code are the most time-consuming, and how can I speed them up?
Thanks
Related: http://stackoverflow.com/questions/3927/what-are-some-good-net-profilers – 2012-07-24 03:05:33
@Brian, thanks, that will save some time. :) – sarnold 2012-07-24 03:05:49
Don't create a new regex every time. Create it once, and use the 'RegexOptions.Compiled' flag for extra performance. – Ryan 2012-07-24 03:06:55
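A minimal sketch of what Ryan's comment suggests: the header regex is built once as a static field with RegexOptions.Compiled instead of once per header line. The names FastaReader, ReadFasta and HeaderPattern are hypothetical, chosen only for illustration. The sketch also makes one change that nobody in the thread has proposed, so treat it as an assumption: each sequence is accumulated in a StringBuilder rather than with string +=, since repeated concatenation copies the whole sequence on every append and is a likely contributor to the runtime when sequences are long. A profiler, as in the related link, would be the way to confirm which part actually dominates.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class FastaReader // hypothetical name, for illustration only
{
    // Built once and reused for every header line.
    // RegexOptions.Compiled trades a one-time construction cost for faster matching.
    private static readonly Regex HeaderPattern =
        new Regex(@"^>(\S+)", RegexOptions.Compiled);

    public static Dictionary<string, string> ReadFasta(string file)
    {
        // Assumption (not from the thread): collect each sequence in a
        // StringBuilder so that appending a line does not copy the whole string.
        var builders = new Dictionary<string, StringBuilder>();
        StringBuilder current = null;

        using (var reader = new StreamReader(file))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.StartsWith(">", StringComparison.Ordinal))
                {
                    Match m = HeaderPattern.Match(line);
                    // As in the original, a duplicate or unparsable header
                    // leaves the current sequence unchanged.
                    if (m.Success && !builders.ContainsKey(m.Groups[1].Value))
                    {
                        current = new StringBuilder();
                        builders.Add(m.Groups[1].Value, current);
                    }
                }
                else if (current != null)
                {
                    // Skip blank lines without running a regex.
                    string trimmed = line.Trim();
                    if (trimmed.Length > 0)
                    {
                        current.Append(trimmed);
                    }
                }
            }
        }

        // Convert the builders to the string-valued dictionary the caller expects.
        var seq = new Dictionary<string, string>(builders.Count);
        foreach (var pair in builders)
        {
            seq[pair.Key] = pair.Value.ToString();
        }
        return seq;
    }
}
```

Unlike the method in the question, this sketch does not catch IOException; it lets the exception propagate to the caller, which can wrap the call in the same try/catch if the console message is wanted.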