2016-03-17 84 views
1

How to send a multipart PDF request to OneNote using Python

I'm trying to upload a PDF to OneNote using Python. According to the OneNote API, I need to post a request like this:

Content-Type:multipart/form-data; boundary=MyAppPartBoundary 
Authorization:bearer tokenString 

--MyAppPartBoundary 
Content-Disposition:form-data; name="Presentation" 
Content-type:text/html 

<!DOCTYPE html> 
<html> 
    <head> 
    <title>A page with an embedded and displayed PDF file</title> 
    </head> 
    <body> 
     <p>Attached is the lease agreement for the expanded offices!</p> 
     <object 
     data-attachment="OfficeLease.pdf" 
     data="name:OfficeLeasePartName" 
     type="application/pdf" /> 
     <p>Here's the contents of our new lease.</p> 
     <img data-render-src="name:OfficeLeasePartName" width="900"/> 
    </body> 
</html> 

--MyAppPartBoundary 
Content-Disposition:form-data; name="OfficeLeasePartName" 
Content-type:application/pdf 

... PDF binary data ... 

--MyAppPartBoundary-- 

However, I don't know how to make a multipart request in Python. I can make a basic text/html request just fine:

url = ROOT_URL+"pages" 

headers = {"Content-Type":"text/html", 
      "Authorization" : "bearer " + access_token} 

# Format html (title & text) 

html = "<html><head><title>" + title + "</title></head>" 
html += "<body><p>" + text + "</p></body></html>" 

# Send request 

session = requests.Session() 
request = requests.Request(method="POST", headers=headers, 
          url=url, data=html) 
prepped = request.prepare() 
response = session.send(prepped) 

How do I modify this Python code to send a multipart request?

[########### UPDATE ############]

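Based on jayongg's suggestion, I tried the following. When I do this, the error I get changes from "Page create requests require the content to be multipart, with a presentation part" to "Multipart payload was malformed." I think this is because I'm not actually attaching the PDF file anywhere? I'm also unsure of the difference between OfficeLease.pdf and OfficeLeasePartName in the OneNote API example.

Here is my current code: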

url = ROOT_URL+"pages" 

path = os.path.join(pdfFolder, pdfName + ".pdf") 

headers = {"Content-Type":"multipart/form-data; boundary=MyAppPartBoundary", 
      "Authorization" : "bearer " + access_token} 

f = open(path, "rb").read() 

txt = """--MyAppPartBoundary 
     Content-Disposition:form-data; name="Presentation" 
     Content-type:text/html 

     <!DOCTYPE html> 
     <html> 
      <head> 
      <title>A page with an embedded and displayed PDF file</title> 
      </head> 
      <body> 
       <p>Attached is the lease agreement for the expanded offices!</p> 
       <object 
       data-attachment="Sample5.pdf" 
       data="name:Sample5" 
       type="application/pdf" /> 
       <p>Here's the contents of our new lease.</p> 
       <img data-render-src="name:Sample5" width="900"/> 
      </body> 
     </html> 

     --MyAppPartBoundary 
     Content-Disposition:form-data; name="OfficeLeasePartName" 
     Content-type:application/pdf 
     """ + f + """ 
     --MyAppPartBoundary--""" 

session = requests.Session() 
request = requests.Request(method="POST", headers=headers, 
          url=url, data=txt) 
prepped = request.prepare() 
response = session.send(prepped) 

[########## UPDATE 2 ##############]

If I make the code simpler, it still results in the malformed error:

headers = {"Content-Type":"multipart/form-data; boundary=MyAppPartBoundary", 
      "Authorization" : "bearer " + access_token} 

txt = """--MyAppPartBoundary 
     Content-Disposition:form-data; name="Presentation" 
     Content-type:text/html 

     <!DOCTYPE html> 
     <html> 
      <head> 
      <title>One Note Text</title> 
      </head> 
      <body> 
       <p>Hello OneNote World</p> 
      </body> 
     </html> 

     --MyAppPartBoundary-- 
     """ 

session = requests.Session() 
request = requests.Request(method="POST", headers=headers, 
          url=url, data=txt) 

I also tried it this way, with the same result:

headers = {"Content-Type":"multipart/form-data; boundary=MyAppPartBoundary", 
      "Authorization" : "bearer " + access_token} 

txt = """<!DOCTYPE html> 
     <html> 
      <head> 
      <title>One Note Text</title> 
      </head> 
      <body> 
       <p>Hello OneNote World</p> 
      </body> 
     </html>"""  

files = {'file1': ('Presentation', txt, 'text/html')} 

session = requests.Session() 
request = requests.Request(method="POST", headers=headers, 
          url=url, files=files) 
prepped = request.prepare() 
response = session.send(prepped) 
+0

Try adding a newline after the final --MyAppPartBoundary-- – jayongg

+0

@jayongg I added a newline after it, but it still gives the same error. Even when I make the code simpler, it gives the malformed error (see Update 2 above). – Elliptica

Answers

0

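You need to construct the complete HTTP request body, not just send the HTML.

For the body, try building the full body that you posted in your question: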

--MyAppPartBoundary 
Content-Disposition:form-data; name="Presentation" 
Content-type:text/html 

<!DOCTYPE html> 
<html> 
    // truncated 
</html> 

--MyAppPartBoundary 
Content-Disposition:form-data; name="OfficeLeasePartName" 
Content-type:application/pdf 

... PDF binary data ... 

--MyAppPartBoundary-- 

Make sure you set the Content-Type header correctly:

Content-Type:multipart/form-data; boundary=MyAppPartBoundary 
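
A possible alternative to hand-building the body is to let requests generate it through its `files=` argument. The `files=` attempt in the question used the field name 'file1' and also set its own Content-Type header, so the declared boundary could not match the one requests generated. The sketch below is only an assumption about what OneNote will accept, not something confirmed in this thread: the helper names `build_files` and `post_page` are hypothetical, each dictionary key is the part name, and the Content-Type header is left for requests to fill in.

```python
def build_files(html, part_name, pdf_filename, pdf_bytes):
    """Return a mapping suitable for the files= argument of requests.

    Each key becomes the part's name in Content-Disposition; each value is a
    (filename, content, content_type) tuple. A filename of None omits the
    filename attribute for that part.
    """
    return {
        "Presentation": (None, html, "text/html"),
        part_name: (pdf_filename, pdf_bytes, "application/pdf"),
    }


def post_page(url, access_token, files):
    """POST the parts; requests writes the boundary and the Content-Type header."""
    import requests  # third-party; imported lazily so build_files has no dependencies

    # Deliberately no Content-Type here: a hand-written boundary would not
    # match the boundary requests generates for the body.
    headers = {"Authorization": "bearer " + access_token}
    return requests.post(url, headers=headers, files=files)


# Example parts; the PDF bytes are a stand-in for open(path, "rb").read().
files = build_files(
    '<html><body><object data-attachment="OfficeLease.pdf" '
    'data="name:OfficeLeasePartName" type="application/pdf" /></body></html>',
    "OfficeLeasePartName",
    "OfficeLease.pdf",
    b"%PDF-1.4 placeholder",
)
```

Calling `post_page(ROOT_URL + "pages", access_token, files)` would then send the request. If the service rejects the generated body, the hand-built approach remains the fallback.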
+0

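Thanks, I tried your suggestion, and now the error is "Multipart payload was malformed" instead of "Page create requests require the content to be multipart, with a presentation part." Could you look at my updated example and see where I'm still going wrong? – Elliptica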

2

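It turns out the answer is that Python encodes line breaks as "\n", but OneNote requires "\r\n". It also requires a blank line ("\r\n") after the final boundary. Finally, the Content-Type and Content-Disposition lines in the body can't have any leading whitespace (no indentation). (There should also be a blank line after the headers of each part, which here means after each Content-Disposition line.)

For example, if this is the body: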

"""--MyBoundary 
Content-Type: text/html 
Content-Disposition: form-data; name="Presentation" 

Some random text 
--MyBoundary 
Content-Type: text/text 
Content-Disposition: form-data; name="more"; filename="more.txt" 

More text 
--MyBoundary-- 
""" 

it should be represented by the string

'--MyBoundary\r\nContent-Type: text/html\r\nContent-Disposition: form-data; name="Presentation"\r\n\r\nSome random text\r\n--MyBoundary\r\nContent-Type: text/text\r\nContent-Disposition: form-data; name="more"; filename="more.txt"\r\n\r\nMore text\r\n--MyBoundary--\r\n' 

You can get this by typing the text inside triple quotes (""") as shown, which puts a "\n" at the end of every line, and then replacing each "\n" with "\r\n":

body = body.replace("\n", "\r\n") 
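
The replace step assumes the triple-quoted literal is written flush against the left margin. If the literal is indented to match the surrounding code, as in the attempts in the question, the indentation ends up in front of the header lines. A minimal sketch of one way to handle that, assuming every line of the literal shares the same indentation (the function name `normalize_multipart_text` is hypothetical):

```python
import textwrap


def normalize_multipart_text(indented: str) -> str:
    """Remove shared indentation, then convert line endings to CRLF."""
    dedented = textwrap.dedent(indented)
    # Collapse any existing CRLF first so it is not turned into "\r\r\n".
    return dedented.replace("\r\n", "\n").replace("\n", "\r\n")


# The backslash after the opening quotes keeps the first line on the same
# indentation level as the rest, so dedent() can strip it from every line.
raw = """\
    --MyBoundary
    Content-Type: text/html
    Content-Disposition: form-data; name="Presentation"

    <p>Hello OneNote World</p>
    --MyBoundary--
    """

body = normalize_multipart_text(raw)
```

This only suits text parts. Running a blanket replace over a body that already contains PDF bytes would alter any "\n" bytes inside the file.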

The headers are:

headers = {"Content-Type":"multipart/form-data; boundary=MyBoundary", 
      "Authorization" : "bearer " + access_token} 

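Finally, you POST the request like this:

import requests

# Send the hand-built multipart body; url, headers and body are defined above.
session = requests.Session()
request = requests.Request(method="POST", headers=headers,
                           url=url, data=body)
prepped = request.prepare()
response = session.send(prepped)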
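
The example above only contains text parts. For the PDF case in the question, the body has to be bytes, and the file contents should be inserted untouched. The following is a rough sketch under those assumptions, not a confirmed OneNote recipe: `build_onenote_body` is a hypothetical helper, and the part name passed in must match the `name:` reference in the HTML (the Update in the question uses `name:Sample5` in the HTML but `OfficeLeasePartName` in the part header).

```python
def build_onenote_body(boundary: str, html: str, part_name: str,
                       pdf_bytes: bytes) -> bytes:
    """Assemble a two-part multipart/form-data body with CRLF line endings."""
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")

    # Text part: normalise the HTML's own line endings to CRLF as well.
    html_bytes = html.replace("\r\n", "\n").replace("\n", "\r\n").encode("utf-8")
    presentation = crlf.join([
        delimiter,
        b'Content-Disposition: form-data; name="Presentation"',
        b"Content-Type: text/html",
        b"",            # blank line separating headers from content
        html_bytes,
    ])

    # Binary part: the PDF bytes are appended as-is, with no replace().
    attachment = crlf.join([
        delimiter,
        b'Content-Disposition: form-data; name="' + part_name.encode("ascii") + b'"',
        b"Content-Type: application/pdf",
        b"",
        pdf_bytes,
    ])

    closing = delimiter + b"--"
    # The trailing empty element yields the final CRLF after the last boundary.
    return crlf.join([presentation, attachment, closing, b""])


# Example with stand-in bytes; real code would use open(path, "rb").read().
example_body = build_onenote_body(
    "MyBoundary",
    '<html><body><img data-render-src="name:OfficeLeasePartName"/></body></html>',
    "OfficeLeasePartName",
    b"%PDF-1.4\n placeholder bytes",
)
```

The result can be passed as `data=body` together with the headers shown earlier, with the boundary in the Content-Type header set to the same string given to the helper.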
+0

Glad it's solved. So I take it this one is solved as well, right? http://stackoverflow.com/questions/36069553/python-multipart-post-malformed – jayongg

+0

By the way, "\r\n" is part of the HTTP standard. – jayongg

+0

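Yes :) I posted them as separate questions because one was about how to format a multipart POST in general, while the other was about a specific error. However, the solution answered both of them. Thank you. This is my first time exploring the HTTP standard, and I guess I missed the part about "\r\n"! I really appreciate your help. – Elliptica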
