iText: How to retrieve a list of PDF fonts that are not embedded


I want to check whether a PDF has all of its fonts embedded. I followed the code from "How to check that all used fonts are embedded in PDF with Java iText?", but I still cannot get a correct list of the fonts used.

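See my sample PDF: https://www.dropbox.com/s/anvm49vh87d8yqs/000024944.pdf?dl=0. For this file the code returns no fonts at all, whereas the document properties in Acrobat list Helvetica, Verdana (embedded subset) and Verdana-Bold (embedded subset). For other PDFs I do get "Verdana (embedded subset)"; it is only for this type of PDF that I get no font list.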

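Because we have to process a large number of PDFs from internal as well as external sources, we need to be able to embed fonts so that the files can be printed. Since embedding every font is practically impossible, we only want to embed the common fonts; for exotic fonts we will ignore the print request.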
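The policy described above (embed common fonts, skip the print request for exotic ones) could be sketched roughly as follows. This is only an illustration: the class name `PrintPolicy`, the `EMBEDDABLE` whitelist and its contents are assumptions, not part of iText, and the input set is expected to hold the names of non-embedded fonts produced by some font-listing step.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class PrintPolicy {
    // Hypothetical whitelist: fonts we own copies of and are willing to embed.
    static final Set<String> EMBEDDABLE =
            new HashSet<>(Arrays.asList("Helvetica", "Arial", "Verdana", "Verdana-Bold"));

    /** Returns the non-embedded fonts that are not on the whitelist. */
    static Set<String> unsupported(Set<String> notEmbedded) {
        Set<String> result = new TreeSet<>(notEmbedded);
        result.removeAll(EMBEDDABLE);
        return result;
    }

    /** A file is printable if every non-embedded font can be embedded by us. */
    static boolean canPrint(Set<String> notEmbedded) {
        return unsupported(notEmbedded).isEmpty();
    }

    public static void main(String[] args) {
        Set<String> missing = new HashSet<>(Arrays.asList("Helvetica", "SomeExoticFont"));
        System.out.println("printable: " + canPrint(missing));
        System.out.println("blocking fonts: " + unsupported(missing));
    }
}
```

Keeping the whitelist in one place makes it easy to log which fonts caused a print request to be skipped.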

Can anyone help me with this? Thanks.


2015-08-24 Jan Naessens

Correct link to the PDF: https://www.dropbox.com/s/anvm49vh87d8yqs/000024944.pdf?dl=0

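I checked your file with callas pdfToolbox (caution: I am affiliated with this tool). It reports that Verdana and Verdana Bold are embedded (and subset), but that Helvetica is not embedded; this matches what Adobe Acrobat reports.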

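One more slightly off-topic remark: are you aware that embedding standard fonts is a risky thing to do? There is no guarantee that your copy of a font is identical to the one used by the creator of the original PDF, so embedding it may give you different glyph widths or encoding problems.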
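Helvetica, the non-embedded font in the sample file, is one of the 14 standard Type 1 fonts named in the PDF specification, which is the kind of font the remark above warns about. A small helper to flag such names before deciding whether to embed a local copy might look like this; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class StandardFonts {
    // The 14 standard Type 1 font names from the PDF specification.
    static final Set<String> STANDARD_14 = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
            "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
            "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
            "Symbol", "ZapfDingbats")));

    /** Accepts names with or without the leading '/' of a PDF name object. */
    static boolean isStandard14(String name) {
        if (name == null) {
            return false;
        }
        String plain = name.startsWith("/") ? name.substring(1) : name;
        return STANDARD_14.contains(plain);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"/Helvetica", "Verdana"}) {
            System.out.println(name + " standard: " + isStandard14(name));
        }
    }
}
```

Fonts flagged this way could be logged or reviewed separately instead of being embedded automatically.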

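I managed to get some results by combining the code from "How to check that all used fonts are embedded in PDF with Java iText?" with http://itextpdf.com/examples/iia.php?id=288. Initially it did not work, because font.getAsName(PdfName.BASEFONT).toString(); did not work in my case, but after a few small changes I got some results.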

Here is my code:

/** 
* Adds the names of the fonts in a PDF that do not appear to be embedded to the given set. 
* @param reader the PdfReader for the PDF file 
* @param set receives the names of the non-embedded fonts 
* @throws IOException 
*/ 
public void listFonts(PdfReader reader, Set<String> set) throws IOException { 

    int n = reader.getXrefSize(); 
    PdfObject object; 
    PdfDictionary font; 

    for (int i = 0; i < n; i++) { 
     object = reader.getPdfObject(i); 
     if (object == null || !object.isDictionary()) { 
      continue; 
     } 

     font = (PdfDictionary)object; 

     if (font.get(PdfName.FONTNAME) != null) { 

      System.out.println("fontname " + font.get(PdfName.FONTNAME)); 
      processFont(font,set); 

     } 
    } 
} 

/** 
* Finds out if the font is an embedded subset font 
* @param name the font name as returned by PdfName.toString(), including the leading '/' 
* @return true if the name denotes an embedded subset font 
*/ 
private boolean isEmbeddedSubset(String name) { 
    //name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7)); 
    return name != null && name.length() > 8 && name.charAt(7) == '+'; 
} 

private void processFont(PdfDictionary font, Set<String> set) { 

    String name = font.get(PdfName.FONTNAME).toString(); 

    if(isEmbeddedSubset(name)) { 
     return; 
    } 

    PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR); 

    //nofontdescriptor 
    if (desc == null) { 
     System.out.println("desc null "); 
     PdfArray descendant = font.getAsArray(PdfName.DESCENDANTFONTS); 

     if (descendant == null) { 
      System.out.println("descendant null "); 
      set.add(name.substring(1));    
     } 
     else { 
      System.out.println("descendant not null "); 
      for (int i = 0; i < descendant.size(); i++) { 
       PdfDictionary dic = descendant.getAsDict(i); 
       processFont(dic, set);      
      }    
     }    
    } 
    /** 
     * (Type 1) embedded 
    */ 
    else if (desc.get(PdfName.FONTFILE) != null) { 
     System.out.println("(Type 1) embedded "); 
    } 

    /** 
     * (TrueType) embedded 
    */ 
    else if (desc.get(PdfName.FONTFILE2) != null) { 
     System.out.println("(FONTFILE2) embedded "); 
    } 

    /** 
    * " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded" 
    */  
    else if (desc.get(PdfName.FONTFILE3) != null) { 
     System.out.println("(FONTFILE3) "); 
    } 

    else { 
     set.add(name.substring(1));   
    } 
}

}

So instead of String name = font.getAsName(PdfName.BASEFONT).toString(); I changed it to String name = font.get(PdfName.FONTNAME).toString();

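This definitely gives better results, because it now reports the different fonts. However, I never get any results for the font descriptor or the descendant fonts. Either they simply are not present in my PDF, or I never reach that code because of my change. Can I assume that a font is embedded when its name carries a subset prefix, and that it is not embedded when the name has no subset prefix?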
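On the subset prefix: the PDF specification describes the tag as six uppercase letters followed by a plus sign, placed before the real font name. The check in isEmbeddedSubset above only tests for a '+' at index 7, so a stricter version could look like the sketch below (the class name `SubsetFontName` and its methods are hypothetical). Note that a missing tag on its own does not show that a font is not embedded, since fully embedded fonts carry no tag; the FontFile, FontFile2 and FontFile3 entries of the font descriptor are what indicate embedding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubsetFontName {
    // Optional leading '/', six uppercase letters, '+', then the real font name.
    private static final Pattern SUBSET = Pattern.compile("^/?([A-Z]{6})\\+(.+)$");

    /** True if the name starts with a subset tag such as "ABCDEF+". */
    static boolean isSubset(String name) {
        return name != null && SUBSET.matcher(name).matches();
    }

    /** Returns the font name without the leading '/' and without any subset tag. */
    static String baseName(String name) {
        if (name == null) {
            return null;
        }
        Matcher m = SUBSET.matcher(name);
        if (m.matches()) {
            return m.group(2);
        }
        return name.startsWith("/") ? name.substring(1) : name;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"/ABCDEF+Verdana", "/Helvetica", "/Myfont+Bold"}) {
            System.out.println(name + " -> subset=" + isSubset(name) + ", base=" + baseName(name));
        }
    }
}
```

A name like "/Myfont+Bold" has a '+' at index 7 and would pass the simpler check, but it is rejected here because the six characters before the '+' are not all uppercase.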


2015-08-25 10:59:12

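This is not an answer, is it? You can use the **edit** button to change your original post. I am not sure whether the *new* question at the bottom follows logically from the question above, or whether it would be better posted as a completely new question. – usr2564301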

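It is a partial answer, because I got somewhat better results. But I still have some doubts, mainly because I do not fully understand subsets and font types.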

I got it working after all by referencing BASEFONT instead of FONTNAME:

/** 
* Adds the names of the fonts in a PDF that do not appear to be embedded to the given set. 
* @param reader the PdfReader for the PDF file 
* @param set receives the names of the non-embedded fonts 
* @throws IOException 
*/ 
public void listFonts(PdfReader reader, Set<String> set) throws IOException { 

    try { 

     int n = reader.getXrefSize(); 
     PdfObject object; 
     PdfDictionary font; 

     for (int i = 0; i < n; i++) { 
      object = reader.getPdfObject(i); 
      if (object == null || !object.isDictionary()) { 
       continue; 
      } 

      font = (PdfDictionary)object; 

      if (font.get(PdfName.BASEFONT) != null) { 
       System.out.println("fontname " + font.getAsName(PdfName.BASEFONT).toString()); 
       processFont(font,set); 

      } 

     } 


    } catch (Exception e) { 
     System.out.println("error " + e.getMessage()); 
    } 


} 

/** 
* Finds out if the font is an embedded subset font 
* @param name the font name as returned by PdfName.toString(), including the leading '/' 
* @return true if the name denotes an embedded subset font 
*/ 
private boolean isEmbeddedSubset(String name) { 
    //name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7)); 
    return name != null && name.length() > 8 && name.charAt(7) == '+'; 
} 

private void processFont(PdfDictionary font, Set<String> set) { 

     String name = font.getAsName(PdfName.BASEFONT).toString(); 

     if(isEmbeddedSubset(name)) { 
      return; 
     } 

     PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR); 

     //nofontdescriptor 
     if (desc == null) { 
      System.out.println("desc null "); 
      PdfArray descendant = font.getAsArray(PdfName.DESCENDANTFONTS); 

      if (descendant == null) { 
       System.out.println("descendant null "); 
       set.add(name.substring(1));    
      } 
      else { 
       System.out.println("descendant not null "); 
       for (int i = 0; i < descendant.size(); i++) { 
        PdfDictionary dic = descendant.getAsDict(i); 
        processFont(dic, set);      
        }    
      }    
     } 
     /** 
     * (Type 1) embedded 
     */ 
     else if (desc.get(PdfName.FONTFILE) != null) { 
      System.out.println("(Type 1) embedded "); 
     } 

     /** 
     * (TrueType) embedded 
     */ 
     else if (desc.get(PdfName.FONTFILE2) != null) { 
      System.out.println("(FONTFILE2) embedded "); 
     } 

     /** 
     * " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded" 
     */  
     else if (desc.get(PdfName.FONTFILE3) != null) { 
      System.out.println("(FONTFILE3) "); 
     } 

     else { 
      set.add(name.substring(1));   
     } 


}

This gives me the same result as the font list under Properties in Acrobat Reader.


2015-08-25 12:39:09
