2013-08-22

How to store and compare annotations (with a gold standard) in GATE

I am very comfortable with UIMA, but my new job requires me to use GATE.

So I have started learning GATE. My question is about how to measure the performance of my Java-based tagging engine.

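With UIMA, I usually dump all system annotations to an XMI file and then use Java code to compare them against the human (gold standard) annotations to compute precision, recall, and F-score.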
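The comparison step described above can be sketched in plain Java, independent of both UIMA and GATE. This is only an illustration under stated assumptions: the `Span` class and the `score` method are hypothetical names, not part of either framework, and a system annotation counts as correct only when its start offset, end offset, and type all match a gold annotation exactly (strict matching).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Strict-match precision/recall/F1 over annotation spans (illustrative; not GATE or UIMA API). */
final class SpanEval {

    /** Hypothetical annotation record: character offsets plus a type label. */
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Span)) return false;
            Span s = (Span) o;
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    /** Returns {precision, recall, f1}; any zero denominator yields 0.0. */
    static double[] score(Set<Span> gold, Set<Span> system) {
        // True positives: system spans that also appear in the gold set.
        Set<Span> truePositives = new HashSet<>(system);
        truePositives.retainAll(gold);
        int tp = truePositives.size();

        double precision = system.isEmpty() ? 0.0 : (double) tp / system.size();
        double recall = gold.isEmpty() ? 0.0 : (double) tp / gold.size();
        double f1 = (precision + recall == 0.0)
            ? 0.0
            : 2 * precision * recall / (precision + recall);
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        Set<Span> gold = new HashSet<>(Arrays.asList(
            new Span(0, 5, "Person"),
            new Span(10, 16, "Location"),
            new Span(20, 24, "Date")));
        Set<Span> system = new HashSet<>(Arrays.asList(
            new Span(0, 5, "Person"),
            new Span(10, 15, "Location"))); // end offset differs, so no strict match
        double[] prf = score(gold, system);
        System.out.printf("P=%.2f R=%.2f F1=%.2f%n", prf[0], prf[1], prf[2]);
    }
}
```

In the sample data only one of the two system spans matches a gold span, so precision is 1/2, recall is 1/3, and F1 is 0.4. A lenient variant that accepts overlapping spans would need a pairwise comparison instead of set intersection.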

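However, I am still struggling to find something similar in GATE. After going through the GATE Annotation Diff documentation and the other information on that page, I get the sense that there must be a simple way to do this in Java, but I cannot figure out how. I thought I would ask here, since someone may already have worked this out.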

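  1. How can I programmatically store the system annotations in an XMI file, or a file in any other format?
  2. How can I create the gold standard data (i.e. the human-annotated data) once, for use in performance calculation?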

Let me know if you need more specifics or details.

Answer

This code appears to help with writing annotations to an XML file: http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/BatchProcessApp.java

    // Fragment of BatchProcessApp.java: doc, annotTypesToWrite, docFile and
    // encoding are defined earlier in that file
    String docXMLString = null;
    // if we want to just write out specific annotation types, we must
    // extract the annotations into a Set
    if(annotTypesToWrite != null) {
      // Create a temporary Set to hold the annotations we wish to write out
      Set<Annotation> annotationsToWrite = new HashSet<Annotation>();

      // we only extract annotations from the default (unnamed) AnnotationSet
      // in this example
      AnnotationSet defaultAnnots = doc.getAnnotations();
      Iterator annotTypesIt = annotTypesToWrite.iterator();
      while(annotTypesIt.hasNext()) {
        // extract all the annotations of each requested type and add them to
        // the temporary set
        AnnotationSet annotsOfThisType =
            defaultAnnots.get((String)annotTypesIt.next());
        if(annotsOfThisType != null) {
          annotationsToWrite.addAll(annotsOfThisType);
        }
      }

      // create the XML string using these annotations
      docXMLString = doc.toXml(annotationsToWrite);
    }
    // otherwise, just write out the whole document as GateXML
    else {
      docXMLString = doc.toXml();
    }

    // Release the document, as it is no longer needed
    Factory.deleteResource(doc);

    // output the XML to <inputFile>.out.xml
    String outputFileName = docFile.getName() + ".out.xml";
    File outputFile = new File(docFile.getParentFile(), outputFileName);

    // Write output files using the same encoding as the original
    FileOutputStream fos = new FileOutputStream(outputFile);
    BufferedOutputStream bos = new BufferedOutputStream(fos);
    OutputStreamWriter out;
    if(encoding == null) {
      out = new OutputStreamWriter(bos);
    }
    else {
      out = new OutputStreamWriter(bos, encoding);
    }

    out.write(docXMLString);

    out.close();
    System.out.println("done");