2013-03-02 56 views

Parsing a tab-delimited file with multiple tables

Can anyone recommend a good way, using C# (perhaps FileHelpers), to parse a file formatted like this?

%T person 
%F id name address city 
%R 1 Bob 999 Main St Burbank 
%R 2 Sara 829 South st Pasadena 
%T houses 
%F id personid housetype Color 
%R 25 1  House Red 
%R 26 2  condo Green 

I'd like to get the two tables into DataTables, or into something I can query with LINQ.

The file is tab-delimited.


How about splitting it into multiple TSVs based on the %T identifier, and then parsing each TSV with a library such as [FileHelpers](http://www.filehelpers.com/)? – publicgk 2013-03-02 22:11:42


How would you suggest splitting it into multiple TSVs? Do you mean doing this in memory, in a class, or in a DataTable? – 2013-03-07 03:09:01

Answers


Here is a sample parser for this kind of data:

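using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// The methods below are members of a single class.
// Parse streams the input lazily: each line goes to the current state handler,
// and only %R lines (handler returns true) yield a row dictionary.
public IEnumerable<Dictionary<string, string>> Parse(TextReader reader)
{
    var state = new State { Handle = ExpectTableTitle };
    return GenerateFrom(reader)
        .Select(line => state.Handle(line.Split('\t'), state))
        .Where(returnIt => returnIt)
        .Select(returnIt => state.Row);
}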

private bool ExpectTableTitle(string[] lineParts, State state) 
{ 
    if (lineParts[0] == "%T") 
    { 
     state.TableTitle = lineParts[1]; 
     state.Handle = ExpectFieldNames; 
    } 
    else 
    { 
     Console.WriteLine("Expected %T but found '"+lineParts[0]+"'"); 
    } 
    return false; 
} 

private bool ExpectFieldNames(string[] lineParts, State state) 
{ 
    if (lineParts[0] == "%F") 
    { 
     state.FieldNames = lineParts.Skip(1).ToArray(); 
     state.Handle = ExpectRowOrTableTitle; 
    } 
    else 
    { 
     Console.WriteLine("Expected %F but found '" + lineParts[0] + "'"); 
    } 
    return false; 
} 

private bool ExpectRowOrTableTitle(string[] lineParts, State state) 
{ 
    if (lineParts[0] == "%R") 
    { 

     state.Row = lineParts.Skip(1) 
      .Select((x, i) => new { Value = x, Index = i }) 
      .ToDictionary(x => state.FieldNames[x.Index], x => x.Value); 
     state.Row.Add("_tableTitle",state.TableTitle); 
     return true; 
    } 
    return ExpectTableTitle(lineParts, state); 
} 

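// Mutable parser state shared by the handlers above.
public class State
{
    public string TableTitle;                  // name from the last %T line
    public string[] FieldNames;                // column names from the last %F line
    public Dictionary<string, string> Row;     // most recently parsed %R line
    public Func<string[], State, bool> Handle; // handler for the next line
}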

private static IEnumerable<string> GenerateFrom(TextReader reader) 
{ 
    string line; 
    while ((line = reader.ReadLine()) != null) 
    { 
     yield return line; 
    } 
} 

Then just convert/map each resulting dictionary into one of your domain objects, based on its _tableTitle entry.
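As a rough illustration of that mapping step, here is a minimal sketch. The `Person` and `House` classes, the `RowMapper` helper, and the `Demo` class are hypothetical names I made up from the sample file's field names; they are not part of the parser above. The rows are built by hand in the same shape that `Parse` produces, so the snippet runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types, guessed from the %F lines of the sample file.
public class Person
{
    public int Id;
    public string Name;
    public string Address;
    public string City;
}

public class House
{
    public int Id;
    public int PersonId;
    public string HouseType;
    public string Color;
}

// Hypothetical helper: picks rows by _tableTitle and maps them to objects.
public static class RowMapper
{
    public static Person ToPerson(Dictionary<string, string> row)
    {
        return new Person
        {
            Id = int.Parse(row["id"]),
            Name = row["name"],
            Address = row["address"],
            City = row["city"]
        };
    }

    public static House ToHouse(Dictionary<string, string> row)
    {
        return new House
        {
            Id = int.Parse(row["id"]),
            PersonId = int.Parse(row["personid"]),
            HouseType = row["housetype"],
            Color = row["Color"]
        };
    }

    // Keeps only the rows that belong to the named table, then maps each one.
    public static List<T> Map<T>(
        IEnumerable<Dictionary<string, string>> rows,
        string tableTitle,
        Func<Dictionary<string, string>, T> map)
    {
        return rows.Where(r => r["_tableTitle"] == tableTitle).Select(map).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        // Hand-built rows in the shape the parser yields.
        var rows = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "id", "1" }, { "name", "Bob" }, { "address", "999 Main St" }, { "city", "Burbank" }, { "_tableTitle", "person" } },
            new Dictionary<string, string> { { "id", "2" }, { "name", "Sara" }, { "address", "829 South st" }, { "city", "Pasadena" }, { "_tableTitle", "person" } },
            new Dictionary<string, string> { { "id", "25" }, { "personid", "1" }, { "housetype", "House" }, { "Color", "Red" }, { "_tableTitle", "houses" } },
            new Dictionary<string, string> { { "id", "26" }, { "personid", "2" }, { "housetype", "condo" }, { "Color", "Green" }, { "_tableTitle", "houses" } }
        };

        var people = RowMapper.Map<Person>(rows, "person", RowMapper.ToPerson);
        var houses = RowMapper.Map<House>(rows, "houses", RowMapper.ToHouse);

        // Once the rows are typed objects, ordinary LINQ queries work.
        var owned = from p in people
                    join h in houses on p.Id equals h.PersonId
                    select p.Name + " owns a " + h.Color + " " + h.HouseType;

        foreach (var line in owned)
        {
            Console.WriteLine(line);
        }
    }
}
```

One thing to watch: `Parse` is lazy and reads from a `TextReader`, so if you need to walk the rows more than once (as the two `Map` calls do here), call `ToList()` on the result of `Parse` first.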

Below is a test harness that uses your sample data. To read from a file, pass in a StreamReader instead of a StringReader.

const string data = @"%T\tperson 
%F\tid\tname\taddress\tcity 
%R\t1\tBob\t999 Main St\tBurbank 
%R\t2\tSara\t829 South st\tPasadena 
%T\thouses 
%F\tid\tpersonid\thousetype\tColor 
%R\t25\t1\tHouse\tRed 
%R\t26\t2\tcondo\tGreen"; 

var reader = new StringReader(data.Replace("\\t","\t")); 

var rows = Parse(reader); 
foreach (var row in rows) 
{ 
    foreach (var entry in row) 
    { 
     Console.Write(entry.Key); 
     Console.Write('\t'); 
     Console.Write('='); 
     Console.Write('\t'); 
     Console.Write(entry.Value); 
     Console.WriteLine(); 
    } 
    Console.WriteLine(); 
} 

Output:

id = 1 
name = Bob 
address = 999 Main St 
city = Burbank 
_tableTitle = person 

id = 2 
name = Sara 
address = 829 South st 
city = Pasadena 
_tableTitle = person 

id = 25 
personid = 1 
housetype = House 
Color = Red 
_tableTitle = houses 

id = 26 
personid = 2 
housetype = condo 
Color = Green 
_tableTitle = houses 
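
Since the question asks for DataTables, here is one possible way to turn the dictionaries into a `DataSet` with one `DataTable` per `%T` section. This is only a sketch: `TableBuilder`, `ToDataSet`, and `DataSetDemo` are names I made up, every column is stored as a string, and it assumes all rows that share a `_tableTitle` also share the same field names. The rows are built by hand here so the snippet runs without the parser.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper, not part of the parser above.
public static class TableBuilder
{
    private const string TitleKey = "_tableTitle";

    public static DataSet ToDataSet(IEnumerable<Dictionary<string, string>> rows)
    {
        var dataSet = new DataSet();
        foreach (var row in rows)
        {
            var title = row[TitleKey];

            // The name indexer returns null when no table has that name yet.
            var table = dataSet.Tables[title];
            if (table == null)
            {
                table = dataSet.Tables.Add(title);
                // Columns come from the first row seen for this table.
                foreach (var key in row.Keys.Where(k => k != TitleKey))
                {
                    table.Columns.Add(key, typeof(string));
                }
            }

            var dataRow = table.NewRow();
            foreach (var entry in row.Where(e => e.Key != TitleKey))
            {
                dataRow[entry.Key] = entry.Value;
            }
            table.Rows.Add(dataRow);
        }
        return dataSet;
    }
}

public static class DataSetDemo
{
    public static void Main()
    {
        // Hand-built rows in the shape the parser yields.
        var rows = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "id", "1" }, { "name", "Bob" }, { "address", "999 Main St" }, { "city", "Burbank" }, { "_tableTitle", "person" } },
            new Dictionary<string, string> { { "id", "2" }, { "name", "Sara" }, { "address", "829 South st" }, { "city", "Pasadena" }, { "_tableTitle", "person" } },
            new Dictionary<string, string> { { "id", "25" }, { "personid", "1" }, { "housetype", "House" }, { "Color", "Red" }, { "_tableTitle", "houses" } }
        };

        var dataSet = TableBuilder.ToDataSet(rows);

        foreach (DataRow person in dataSet.Tables["person"].Rows)
        {
            Console.WriteLine(person["name"] + " lives in " + person["city"]);
        }
    }
}
```

Because `ToDataSet` walks the rows exactly once, it can consume the lazy result of `Parse` directly without a `ToList()` call.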