2012-10-15

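How to read Excel XML (C#)

I wonder if there is an easy way to read Excel 2010 XML?
This XML has a different structure from what I am used to reading.

In particular, the ss:Index attribute (ss:Index="7") makes things a bit more complicated.

EDIT: To explain better:

  • I have a file with an XML extension that opens easily in Excel
  • I am looking for a way to read this sheet programmatically (e.g. copy it into a DataTable)
  • I noticed this is not the kind of XML I usually work with
  • The XML defines the fields at the beginning and then uses ROW, CELL and DATA tags
  • What surprised me: when there are, say, 3 fields (cells) but the 2nd field has no value, that field is "skipped" in the XML, and the 3rd field gets an additional "index" attribute, e.g. ss:Index="3" (which indicates that even though it sits in 2nd position, its real index is 3)

Fragment of the example XML: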

 <Row ss:AutoFitHeight="0">
     <Cell><Data ss:Type="String">Johny</Data></Cell>
     <Cell ss:Index="3"><Data ss:Type="String">NY</Data></Cell>
 </Row>
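
The skipped-cell behavior described above can be illustrated with a minimal sketch. It assumes the row lives in the `urn:schemas-microsoft-com:office:spreadsheet` namespace (the one Excel binds to the `ss` prefix) and uses LINQ to XML; the class and method names (`SparseRowSketch`, `ExpandRow`) are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class SparseRowSketch
{
    // Namespace URI that the "ss" prefix is bound to in XML Spreadsheet files.
    static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Returns the row's values by zero-based column position; skipped cells become null.
    public static List<string> ExpandRow(XElement row)
    {
        var values = new List<string>();
        foreach (XElement cell in row.Elements(Ss + "Cell"))
        {
            // ss:Index is 1-based and, when present, overrides the running position.
            XAttribute index = cell.Attribute(Ss + "Index");
            int position = index != null ? (int)index - 1 : values.Count;

            // Pad the gap left by the cells that were not written to the file.
            while (values.Count < position)
                values.Add(null);

            XElement data = cell.Element(Ss + "Data");
            values.Add(data != null ? data.Value : null);
        }
        return values;
    }
}
```

For the fragment above (with the namespace declared on an enclosing element), `ExpandRow` would return three entries: "Johny", null, "NY".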

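Well, I can read that in your post. What exactly are you looking for? – RBarryYoung


Looking for a simple way to, let's say, load an Excel file saved as XML into a DataTable. I know how to deal with XML, but I have a problem with the ss:Index attribute (which is used when a cell is empty) – Maciej


It might help to explain *that* problem. – RBarryYoung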





using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Xml;

public static class XMLtoDataTable {
    private static ColumnType getDefaultType() {
        return new ColumnType(typeof(String));
    }

    struct ColumnType {
        public Type type;
        private string name;
        public ColumnType(Type type) { this.type = type; this.name = type.ToString().ToLower(); }
        // Converts cell text to the column's CLR type; empty text becomes DBNull.
        public object ParseString(string input) {
            if (String.IsNullOrEmpty(input))
                return DBNull.Value;
            switch (name) { // lower-cased type name, so the lower-case labels can match
                case "system.datetime":
                    return DateTime.Parse(input);
                case "system.decimal":
                    return decimal.Parse(input);
                case "system.boolean":
                    return bool.Parse(input);
                default:
                    return input;
            }
        }
    }

    // Maps the ss:Type attribute of a Data element to a column type.
    private static ColumnType getType(XmlNode data) {
        string type = "";
        // data is null when the Cell has no Data child
        if (data != null && data.Attributes["ss:Type"] != null && data.Attributes["ss:Type"].Value != null)
            type = data.Attributes["ss:Type"].Value;

        switch (type) {
            case "DateTime":
                return new ColumnType(typeof(DateTime));
            case "Boolean":
                return new ColumnType(typeof(Boolean));
            case "Number":
                return new ColumnType(typeof(Decimal));
            case "":
                decimal test2;
                if (data == null || String.IsNullOrEmpty(data.InnerText) || decimal.TryParse(data.InnerText, out test2)) {
                    return new ColumnType(typeof(Decimal));
                } else {
                    return new ColumnType(typeof(String));
                }
            default: // "String"
                return new ColumnType(typeof(String));
        }
    }

    // Appends default string columns until the table holds `count` columns.
    private static void padColumns(DataTable dt, List<ColumnType> columns, int count) {
        for (int ii = dt.Columns.Count; ii < count; ii++) {
            dt.Columns.Add("Column" + ii.ToString(), typeof(string));
            columns.Add(getDefaultType());
        }
    }

    public static DataSet ImportExcelXML(string fileName, bool hasHeaders, bool autoDetectColumnType) {
        // Release the file handle once the document has been loaded
        using (Stream st = File.OpenRead(fileName)) {
            return ImportExcelXML(st, hasHeaders, autoDetectColumnType);
        }
    }

    private static DataSet ImportExcelXML(Stream inputFileStream, bool hasHeaders, bool autoDetectColumnType) {
        XmlDocument doc = new XmlDocument();
        doc.Load(new XmlTextReader(inputFileStream));
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);

        nsmgr.AddNamespace("o", "urn:schemas-microsoft-com:office:office");
        nsmgr.AddNamespace("x", "urn:schemas-microsoft-com:office:excel");
        nsmgr.AddNamespace("ss", "urn:schemas-microsoft-com:office:spreadsheet");

        DataSet ds = new DataSet();

        foreach (XmlNode node in
            doc.DocumentElement.SelectNodes("//ss:Worksheet", nsmgr)) {
            DataTable dt = new DataTable(node.Attributes["ss:Name"].Value);
            ds.Tables.Add(dt);
            XmlNodeList rows = node.SelectNodes("ss:Table/ss:Row", nsmgr);
            if (rows.Count > 0) {

                // Add columns to the table from the header row
                List<ColumnType> columns = new List<ColumnType>();
                int startIndex = 0;
                if (hasHeaders) {
                    foreach (XmlNode data in rows[0].SelectNodes("ss:Cell/ss:Data", nsmgr)) {
                        columns.Add(new ColumnType(typeof(string))); // default to text
                        dt.Columns.Add(data.InnerText, typeof(string));
                    }
                    startIndex++;
                }

                // Update the data types of the columns if auto-detecting
                if (autoDetectColumnType && rows.Count > startIndex) {
                    XmlNodeList cells = rows[startIndex].SelectNodes("ss:Cell", nsmgr);
                    int actualCellIndex = 0;
                    for (int cellIndex = 0; cellIndex < cells.Count; cellIndex++) {
                        XmlNode cell = cells[cellIndex];
                        // ss:Index is 1-based and marks a jump over skipped (empty) cells
                        if (cell.Attributes["ss:Index"] != null)
                            actualCellIndex =
                                int.Parse(cell.Attributes["ss:Index"].Value) - 1;

                        ColumnType autoDetectType =
                            getType(cell.SelectSingleNode("ss:Data", nsmgr));

                        if (actualCellIndex >= dt.Columns.Count) {
                            padColumns(dt, columns, actualCellIndex);
                            dt.Columns.Add("Column" +
                                actualCellIndex.ToString(), autoDetectType.type);
                            columns.Add(autoDetectType);
                        } else {
                            dt.Columns[actualCellIndex].DataType = autoDetectType.type;
                            columns[actualCellIndex] = autoDetectType;
                        }
                        actualCellIndex++;
                    }
                }

                // Load data
                for (int i = startIndex; i < rows.Count; i++) {
                    DataRow row = dt.NewRow();
                    XmlNodeList cells = rows[i].SelectNodes("ss:Cell", nsmgr);
                    int actualCellIndex = 0;
                    for (int cellIndex = 0; cellIndex < cells.Count; cellIndex++) {
                        XmlNode cell = cells[cellIndex];
                        if (cell.Attributes["ss:Index"] != null)
                            actualCellIndex = int.Parse(cell.Attributes["ss:Index"].Value) - 1;

                        XmlNode data = cell.SelectSingleNode("ss:Data", nsmgr);

                        // Create any columns this row reaches that do not exist yet
                        padColumns(dt, columns, actualCellIndex + 1);

                        if (data != null)
                            row[actualCellIndex] = data.InnerText;

                        actualCellIndex++;
                    }
                    dt.Rows.Add(row);
                }
            }
        }
        return ds;
    }
}


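Just wanted to say thanks! This worked flawlessly, once I changed the double prefix (ss:) to the single one (s:), on XML generated by the CarlosAg XML Excel Writer Library. It saved me hours of work! Thanks for sharing! – Umo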
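
The prefix change mentioned in this comment matters because looking an attribute up by its qualified name ("ss:Index") depends on the prefix the producing tool happened to choose. A minimal sketch of a prefix-independent lookup follows, assuming the attribute lives in the `urn:schemas-microsoft-com:office:spreadsheet` namespace; the names `PrefixIndependentSketch` and `GetColumnPosition` are hypothetical.

```csharp
using System;
using System.Xml;

public static class PrefixIndependentSketch
{
    const string SpreadsheetNs = "urn:schemas-microsoft-com:office:spreadsheet";

    // Returns the zero-based column position taken from the Index attribute,
    // or the supplied fallback when the cell carries no Index attribute.
    public static int GetColumnPosition(XmlNode cell, int fallback)
    {
        // The (localName, namespaceURI) indexer matches ss:Index, s:Index, or any
        // other prefix, as long as the prefix is bound to the spreadsheet namespace.
        XmlAttribute index = cell.Attributes["Index", SpreadsheetNs];
        return index != null ? int.Parse(index.Value) - 1 : fallback;
    }
}
```

XPath queries that go through an XmlNamespaceManager are already prefix-independent, since the manager maps the prefix used in the query to the namespace URI; only the qualified-name attribute lookups are affected.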


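This is a good solution for all intents and purposes, but beware of duplicate column names: this code does not account for them and will need to be modified if you run into them. –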
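
One possible way to handle the duplicate names this comment warns about is to route every column creation through a helper that picks an unused name. The sketch below is only an illustration: `AddUniqueColumn` and the "_2", "_3" suffix scheme are assumptions, not something the answer's author proposed.

```csharp
using System;
using System.Data;

public static class UniqueColumnSketch
{
    // Adds a column, appending _2, _3, ... when the requested name is blank or taken.
    public static DataColumn AddUniqueColumn(DataTable table, string name, Type type)
    {
        string baseName = String.IsNullOrEmpty(name)
            ? "Column" + (table.Columns.Count + 1)
            : name;

        string candidate = baseName;
        // DataColumnCollection.Contains compares names case-insensitively,
        // which matches how DataTable rejects duplicate column names.
        for (int suffix = 2; table.Columns.Contains(candidate); suffix++)
            candidate = baseName + "_" + suffix;

        return table.Columns.Add(candidate, type);
    }
}
```

Calling it twice with the name "City" would create the columns "City" and "City_2" instead of throwing a DuplicateNameException.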


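Try the official API first (the Microsoft Open XML SDK).


Does it read .XLSX? I am looking for a way to read a file with an XML extension. I tried this: http://msdn.microsoft.com/en-us/library/hh298534.aspx, but it says "File contains corrupted data." – Maciej


Yes, it is meant for reading OOXML containers. But then you have to do extra work to extract the raw XML and process it, so why not avoid that and work with the container directly? –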
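
The exchange above turns on the difference between a zip-based OOXML container (.xlsx) and a plain XML spreadsheet file. A quick way to tell the two apart before choosing an API is to look at the first two bytes: zip archives start with the signature "PK" (0x50 0x4B). This is a minimal sketch; `FormatSniffSketch` and `LooksLikeZipPackage` are hypothetical names.

```csharp
using System;
using System.IO;

public static class FormatSniffSketch
{
    // True when the stream starts with the zip signature "PK" (0x50 0x4B),
    // as .xlsx packages do. Consumes the first two bytes of the stream.
    public static bool LooksLikeZipPackage(Stream stream)
    {
        int first = stream.ReadByte();
        int second = stream.ReadByte();
        return first == 0x50 && second == 0x4B;
    }
}
```

A file for which this returns false is not a package, which would be consistent with the "File contains corrupted data" error reported when the package-based API was pointed at a plain XML file.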



using System;
using System.Xml;

class Program {
    static void Main() {
        using (var textReader = new XmlTextReader("...\\YourFile.xml")) {
            // Read until end of file
            while (textReader.Read()) {
                XmlNodeType nType = textReader.NodeType;
                // If node type is a declaration
                if (nType == XmlNodeType.XmlDeclaration)
                    Console.WriteLine("Declaration:" + textReader.Name);
                // If node type is a comment
                if (nType == XmlNodeType.Comment)
                    Console.WriteLine("Comment:" + textReader.Name);
                // If node type is an attribute (Read() itself never stops on
                // attribute nodes; they are reached from the current element)
                if (nType == XmlNodeType.Attribute)
                    Console.WriteLine("Attribute:" + textReader.Name);
                // If node type is an element
                if (nType == XmlNodeType.Element)
                    Console.WriteLine("Element:" + textReader.Name);
                // If node type is an entity
                if (nType == XmlNodeType.Entity)
                    Console.WriteLine("Entity:" + textReader.Name);
                // If node type is a processing instruction
                if (nType == XmlNodeType.ProcessingInstruction)
                    Console.WriteLine("ProcessingInstruction:" + textReader.Name);
                // If node type is a document type declaration
                if (nType == XmlNodeType.DocumentType)
                    Console.WriteLine("Document:" + textReader.Name);
                // If node type is white space
                if (nType == XmlNodeType.Whitespace)
                    Console.WriteLine("WhiteSpace:" + textReader.Name);
            }
        }
    }
}

How do I get the "ss:Index" attribute value with this? – Maciej


With the Value property in this block: if (nType == XmlNodeType.Attribute) { var result = textReader.Value; } –


Not really... when I do this: while (textReader.Read()) Console.WriteLine(textReader.NodeType + " -> " + textReader.Value); no attributes are listed – Maciej
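
The observation in the last comment matches how XmlReader works: Read() advances over elements, text and similar nodes, but never stops on attribute nodes, so attributes have to be requested from the element the reader is currently positioned on. A minimal sketch, assuming the cells live in the `urn:schemas-microsoft-com:office:spreadsheet` namespace; `ReaderAttributeSketch` and `ReadCellIndexes` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderAttributeSketch
{
    const string SpreadsheetNs = "urn:schemas-microsoft-com:office:spreadsheet";

    // Collects the Index attribute value (or null when absent) of every
    // Cell element, in document order.
    public static List<string> ReadCellIndexes(TextReader input)
    {
        var result = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                // Attributes are read from the element node itself;
                // GetAttribute returns null when the attribute is missing.
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Cell")
                    result.Add(reader.GetAttribute("Index", SpreadsheetNs));
            }
        }
        return result;
    }
}
```

To walk all attributes of an element instead of fetching one by name, XmlReader also offers MoveToFirstAttribute and MoveToNextAttribute; while positioned on an attribute, NodeType reports XmlNodeType.Attribute and Value holds its text.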


It works fine even without the column type. Thank you!!!


