2009-07-13 119 views
0

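How do I read comments (annotations) from a Microsoft Word document? How can I read the comments in a Word document using Apache POI?

Please provide some sample code if possible...

Thank you...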

+0

Word documents come in several formats. Can you clarify which file type you want to read? Word 97/2003 .doc, Word 2007 XML, etc. – 2009-07-13 14:59:03

+0

I want to read the comments in 97/2003/XP and 2007 Word files... – Garudadwajan 2009-07-14 03:47:44
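Since the question covers both the binary 97/2003/XP format and the 2007 format, a program may first need to tell the two apart before choosing a reader. The sketch below is one possible way to do that without POI: it checks the leading bytes of the file against the OLE2 compound-file signature (binary .doc) and the ZIP signature (.docx packages). `WordFormatSniffer` and `sniff` are hypothetical names invented for this illustration. The check only identifies the container, so other OLE2 or ZIP-based files will match as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper (not part of POI): guesses the container format from the first bytes. */
final class WordFormatSniffer {
    // OLE2 compound file signature, the container used by Word 97/2003/XP .doc files
    private static final byte[] OLE2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                                        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
    // ZIP local file header signature, the container used by Word 2007 .docx files
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    /** Returns "OLE2", "OOXML" or "UNKNOWN" for the given leading bytes of a file. */
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) return "OLE2";
        if (startsWith(head, ZIP)) return "OOXML";
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to inspect
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] head = new byte[8];
            int n = in.read(head);
            System.out.println(sniff(Arrays.copyOf(head, Math.max(n, 0))));
        }
    }
}
```

With this kind of check, an "OLE2" result would point to the HWPF classes in POI and an "OOXML" result to the OpenXML-based ones.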

Answers

2

Finally, I found the answer.

Here is the code snippet...

    // Requires java.io.FileInputStream, org.apache.poi.hwpf.HWPFDocument
    // and org.apache.poi.hwpf.usermodel.Range
    // try-with-resources closes the stream even if parsing fails
    try (FileInputStream fis = new FileInputStream(fileName)) {
        HWPFDocument document = new HWPFDocument(fis);
        // Range covering the text of the comments in the .doc file
        Range commentRange = document.getCommentsRange();
        int numComments = commentRange.numParagraphs();
        for (int i = 0; i < numComments; i++) {
            String comments = commentRange.getParagraph(i).text();
            // Strip the line terminators, then any remaining surrounding whitespace
            comments = comments.replaceAll("\\cM?\r?\n", "").trim();
            if (!comments.isEmpty()) {
                System.out.println("comment :- " + comments);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
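The cleanup step in the loop is easy to misread, so here is a small standalone sketch of it. In a Java regular expression `\cM` is the control character Ctrl-M, that is, a carriage return, so the pattern removes line feeds together with any carriage returns directly in front of them; `trim()` then drops whatever whitespace is left at the ends, including a lone trailing carriage return. `CommentTextCleaner` and `clean` are hypothetical names used only for this illustration, and the sample strings are made up.

```java
/** Hypothetical helper isolating the cleanup step from the snippet above. */
final class CommentTextCleaner {
    /** Removes (CR)(CR)LF line terminators, then trims leading and trailing whitespace. */
    static String clean(String raw) {
        return raw.replaceAll("\\cM?\r?\n", "").trim();
    }

    public static void main(String[] args) {
        // Made-up paragraph texts with different terminators
        System.out.println("[" + clean("First comment\r\n") + "]"); // prints [First comment]
        System.out.println("[" + clean("Second comment\r") + "]");  // prints [Second comment]
        System.out.println("[" + clean("\r") + "]");                // prints []
    }
}
```

The last case is why the snippet checks for an empty string before printing: a paragraph that holds nothing but its terminator cleans down to nothing.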

I am using poi-3.5-beta7-20090719.jar and poi-scratchpad-3.5-beta7-20090717.jar. The other archives, poi-ooxml-3.5-beta7-20090717.jar and poi-dependencies-3.5-beta7-20090717.zip, are needed only if you want to work with the OpenXML-based file formats.

I appreciate the help of Mark B, who actually found this solution....
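The snippet above covers only the binary .doc format, because HWPFDocument does not read Word 2007 files. For the 2007 half of the question, the following is a rough sketch that does not use POI at all. It relies on the fact that a .docx file is a ZIP package, and it assumes the comments are stored in a part named `word/comments.xml`, which is the conventional location but is not verified here against the package relationships. `DocxCommentReader` and `readComments` are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper (not part of POI): reads comments straight from the .docx package. */
final class DocxCommentReader {
    // WordprocessingML main namespace (the "w:" prefix in the XML)
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns one "author: text" string per w:comment element found. */
    static List<String> readComments(InputStream docx) throws Exception {
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: comments live in the conventional part name
                if (!"word/comments.xml".equals(entry.getName())) continue;
                // Copy the entry so the XML parser cannot close the ZIP stream
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) buf.write(chunk, 0, n);
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document xml = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(buf.toByteArray()));
                NodeList comments = xml.getElementsByTagNameNS(W_NS, "comment");
                for (int i = 0; i < comments.getLength(); i++) {
                    Element comment = (Element) comments.item(i);
                    // Concatenate the text runs (w:t) inside this comment
                    StringBuilder text = new StringBuilder();
                    NodeList runs = comment.getElementsByTagNameNS(W_NS, "t");
                    for (int j = 0; j < runs.getLength(); j++) {
                        text.append(runs.item(j).getTextContent());
                    }
                    result.add(comment.getAttributeNS(W_NS, "author") + ": " + text);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of a .docx file
        try (InputStream in = new FileInputStream(args[0])) {
            for (String comment : readComments(in)) {
                System.out.println("comment :- " + comment);
            }
        }
    }
}
```

Reading the XML directly keeps the example free of extra jars. With the poi-ooxml archive on the classpath, the XWPF classes in POI would be the more natural route.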

0

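Get the HWPFDocument object (by passing in a Word document as an input stream, say).

You can then get the summary through getSummaryInformation(), which gives you a SummaryInformation object via getSummary()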

+0

Thanks a lot, Brian... – Garudadwajan 2009-07-15 04:09:56

0

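I am also new to Apache POI. Here is my program, which works fine: it extracts the text of a Word-format file into a text file. I hope it helps you. Before you run it, add the corresponding lib files to your classpath.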

import java.io.FileInputStream;
import java.io.FileWriter;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

/**
 * Extracts the text of a Word .doc file into a plain-text file.
 *
 * @author ChandraMouil V
 */
public class RtfDocTextExtract {
    // Output file for the extracted text
    static final String OUTPUT_PATH = "E:/resume-template.txt";

    // This function is for .DOC files: it writes every paragraph of filePath to OUTPUT_PATH
    public static void meth(String filePath) {
        // try-with-resources closes both files even if extraction fails
        try (FileInputStream fis = new FileInputStream(filePath);
             FileWriter fw = new FileWriter(OUTPUT_PATH)) {
            POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
            // The cast assumes the input really is a Word .doc file
            WordExtractor oleTextExtractor = (WordExtractor) ExtractorFactory.createExtractor(fileSystem);
            for (String paragraph : oleTextExtractor.getParagraphText()) {
                fw.write(paragraph);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        meth("D:/DummyRichTextFormat.doc");
    }
}