2013-12-17

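pyPdf IndirectObject in /Rotate

We have a simple script that reads incoming PDF files. If a page is landscape, the script rotates it to portrait so that another program can use it later. Everything ran well with pyPdf until I hit a file that has an IndirectObject as the value of the /Rotate key on the page. The object is resolvable, so I can tell what the /Rotate value is, but calling rotateClockwise or rotateCounterClockwise produces a traceback because pyPdf does not expect an IndirectObject in /Rotate. I have spent quite a bit of time trying to overwrite the IndirectObject in the file with its value, but I have not gotten anywhere. I even tried passing the same IndirectObject to rotateCounterClockwise, and it raises the same kind of TypeError, one line earlier in pdf.pyc.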

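My simple question is: is there a patch for pyPdf or PyPDF2 that keeps it from choking on this kind of setup, or a different way to rotate the page, or a different library that I have not seen or considered yet? I tried PyPDF2 and it has the same problem. I looked at PDFMiner as a replacement, but it seems geared more toward getting information out of PDF files than toward manipulating them. Below is the output from my experiments with the file in IPython using pyPdf. The output with PyPDF2 is very similar, though some of the information is formatted slightly differently: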

In [1]: from pyPdf import PdfFileReader 

In [2]: mypdf = PdfFileReader(open("RP121613.pdf","rb")) 

In [3]: mypdf.getNumPages() 
Out[3]: 1 

In [4]: mypdf.resolvedObjects 
Out[4]: 
{0: {1: {'/Pages': IndirectObject(2, 0), '/Type': '/Catalog'}, 
    2: {'/Count': 1, '/Kids': [IndirectObject(4, 0)], '/Type': '/Pages'}, 
    4: {'/Count': 1, 
    '/Kids': [IndirectObject(5, 0)], 
    '/Parent': IndirectObject(2, 0), 
    '/Type': '/Pages'}, 
    5: {'/Contents': IndirectObject(6, 0), 
    '/MediaBox': [0, 0, 612, 792], 
    '/Parent': IndirectObject(4, 0), 
    '/Resources': IndirectObject(7, 0), 
    '/Rotate': IndirectObject(8, 0), 
    '/Type': '/Page'}}} 

In [5]: mypage = mypdf.getPage(0) 

In [6]: myrotation = mypage.get("/Rotate") 

In [7]: myrotation 
Out[7]: IndirectObject(8, 0) 

In [8]: mypdf.getObject(myrotation) 
Out[8]: 0 

In [9]: mypage.rotateCounterClockwise(90) 
--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 

/root/<ipython console> in <module>() 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle) 
    1049  def rotateCounterClockwise(self, angle): 
    1050   assert angle % 90 == 0 
-> 1051   self._rotate(-angle) 
    1052   return self 
    1053 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle) 
    1054  def _rotate(self, angle): 
    1055   currentAngle = self.get("/Rotate", 0) 
-> 1056   self[NameObject("/Rotate")] = NumberObject(currentAngle + angle) 
    1057 
    1058  def _mergeResources(res1, res2, resource): 

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int' 

In [10]: mypage.rotateClockwise(90)  
--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 

/root/<ipython console> in <module>() 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateClockwise(self, angle) 
    1039  def rotateClockwise(self, angle): 
    1040   assert angle % 90 == 0 
-> 1041   self._rotate(angle) 
    1042   return self 
    1043 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle) 
    1054  def _rotate(self, angle): 
    1055   currentAngle = self.get("/Rotate", 0) 
-> 1056   self[NameObject("/Rotate")] = NumberObject(currentAngle + angle) 
    1057 
    1058  def _mergeResources(res1, res2, resource): 

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int' 

In [11]: mypage.rotateCounterClockwise(myrotation) 
--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 

/root/<ipython console> in <module>() 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle) 
    1048  # @param angle Angle to rotate the page. Must be an increment of 90 deg. 

    1049  def rotateCounterClockwise(self, angle): 
-> 1050   assert angle % 90 == 0 
    1051   self._rotate(-angle) 
    1052   return self 

TypeError: unsupported operand type(s) for %: 'IndirectObject' and 'int' 

I would be happy to provide my file if anyone wants to dig into it.

Answers

You need to call getObject on the IndirectObject instance itself, so in your case it would be:

myrotation.getObject() 
0
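
To make that concrete, here is a minimal sketch of resolving the rotation before doing arithmetic on it, which is the step `_rotate` skips. `resolve_rotation` and `StandInIndirect` are hypothetical names used only for illustration. The stand-in class just mimics the `getObject()` method seen above, so the snippet runs without pyPdf installed.

```python
def resolve_rotation(value):
    """Return a plain int for a /Rotate value that may be an indirect reference."""
    # Objects that expose getObject() are resolved first; anything else
    # (for example a plain number) is used as-is.
    if hasattr(value, "getObject"):
        value = value.getObject()
    return int(value)


class StandInIndirect(object):
    """Hypothetical stand-in for an IndirectObject; not part of pyPdf."""

    def __init__(self, target):
        self._target = target

    def getObject(self):
        return self._target


# Mirrors the session above, where the indirect /Rotate resolved to 0.
current = resolve_rotation(StandInIndirect(0))
print(current + 90)  # the addition that raised TypeError now works
```

With the real library you would pass `mypage.get("/Rotate", 0)` to the helper instead of the stand-in.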

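I realize this is an old question, but while searching for a way to solve the same problem I found this post before I found my solution. From what I understand, this is a bug.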

https://github.com/mstamy2/PyPDF2/pull/338/files

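In short, I edited the PyPDF2 source directly to implement the fix. Find PyPDF2/pdf.py and search for the line def _rotate(self, angle):. Replace the method with the following: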

def _rotate(self, angle): 
    rotateObj = self.get("/Rotate", 0) 
    currentAngle = rotateObj if isinstance(rotateObj, int) else rotateObj.getObject() 
    self[NameObject("/Rotate")] = NumberObject(currentAngle + angle) 

It now works like a charm.
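
If you would rather not edit the installed package, the same idea can probably be applied from your own script: overwrite the page's /Rotate entry with the resolved number before calling rotateClockwise. The sketch below is only that, a sketch. `normalize_rotate` and `StandInIndirect` are hypothetical names, and the wrapper classes are passed in as parameters so the example runs without PyPDF2. With PyPDF2 1.x I would expect to pass `NameObject` and `NumberObject` from `PyPDF2.generic`, the same classes the patched method uses.

```python
def normalize_rotate(page, name_cls, number_cls):
    """Overwrite an indirect /Rotate entry with its resolved numeric value.

    page       -- dict-like page object
    name_cls   -- wrapper for the key (assumed: PyPDF2.generic.NameObject)
    number_cls -- wrapper for the value (assumed: PyPDF2.generic.NumberObject)
    Returns the resolved angle.
    """
    raw = page.get("/Rotate", 0)
    # Indirect references expose getObject(); plain numbers pass through.
    angle = raw.getObject() if hasattr(raw, "getObject") else raw
    page[name_cls("/Rotate")] = number_cls(angle)
    return angle


class StandInIndirect(object):
    """Hypothetical stand-in for an IndirectObject; not part of PyPDF2."""

    def __init__(self, target):
        self._target = target

    def getObject(self):
        return self._target


# A plain dict stands in for the page; str and int stand in for the wrappers.
page = {"/Rotate": StandInIndirect(90)}
normalize_rotate(page, str, int)
print(page["/Rotate"])
```

After a call like `normalize_rotate(mypage, NameObject, NumberObject)`, the entry is a direct number, so the unpatched `_rotate` should no longer hit the TypeError. I have not verified this against every pyPdf and PyPDF2 release.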