Unable to fetch a PDF file as binary data

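I am trying to fetch a PDF file from:

URL: https://domain_name/xyz/_id/download/

This does not point directly to a PDF file; each unique file is downloaded according to a particular field in the URL.

When I put this link in the browser's address bar, the PDF file is downloaded immediately. When I try to fetch it through HttpsURLConnection, however, its Content-Type comes back as "text/html", whereas it should be "application/pdf".

Before connecting I also tried calling setRequestProperty with "application/pdf", but the file is still always downloaded as "text/html".

The method I am using is "GET".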

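1) Do I need to use HttpClient instead of HttpsURLConnection?

2) Are these kinds of links used to improve security?

3) Please point out my mistakes.

4) How can I find out the name of the file as it exists on the server?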

I am pasting below the main code that I have implemented:

    URL url = new URL(sb.toString());

    //created new connection 
    HttpsURLConnection urlConnection = (HttpsURLConnection) url.openConnection(); 

    //have set the request method and property 
    urlConnection.setRequestMethod("GET"); 
    urlConnection.setDoOutput(true); 
    urlConnection.setRequestProperty("Content-Type", "application/pdf"); 

    Log.e("Content Type--->", urlConnection.getContentType()+" "+ urlConnection.getResponseCode()+" "+ urlConnection.getResponseMessage()+"    "+urlConnection.getHeaderField("Content-Type")); 

    //and connecting! 
    urlConnection.connect(); 

    //setting the path where we want to save the file 
    //in this case, going to save it on the root directory of the 
    //sd card. 
    File SDCardRoot = Environment.getExternalStorageDirectory(); 

    //created a new file, specifying the path, and the filename 

    File file = new File(SDCardRoot,"example.pdf"); 

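    //external storage must be mounted read-write before we can create the file
    if(!Environment.MEDIA_MOUNTED.equals(Environment.getExternalStorageState()))
        throw new IOException("External storage is not writable");

    //writing the downloaded data into the file we created
    FileOutputStream fileOutput = new FileOutputStream(file);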

    //this will be used in reading the data from the internet 
    InputStream inputStream = urlConnection.getInputStream(); 

    //this is the total size of the file 
    int totalSize = urlConnection.getContentLength(); 

    //variable to store total downloaded bytes 
    Log.e("Total File Size ---->", ""+totalSize); 
    int downloadedSize = 0; 

    //create a buffer... 
    byte[] buffer = new byte[1024]; 
    int bufferLength = 0; //used to store a temporary size of the buffer 

    //Reading through the input buffer and write the contents to the file 
    while ((bufferLength = inputStream.read(buffer)) > 0) { 

     //add the data in the buffer to the file in the file output stream (the file on the sd card)
     fileOutput.write(buffer, 0, bufferLength); 


     //adding up the size 
     downloadedSize += bufferLength; 

     //reporting the progress: 
     Log.e("This much downloaded---->",""+ downloadedSize); 

    } 
    //closed the output stream 
    fileOutput.close();

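I have searched a lot and could not get a result. If possible, please explain my mistake in detail, since this is the first time I am implementing this kind of thing.

**I tried reading direct PDF links such as http://labs.google.com/papers/bigtable-osdi06.pdf; they download without trouble, and their Content-Type is also 'application/pdf'.**

Thanks.

Source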

2011-03-10 iabhi

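Have you checked the MIME type of the server's response? – 2011-03-10 08:05:41

This thread led me to the solution to my problem! When you try to download a streamed PDF from a WebView and you use HttpURLConnection, you also need to pass along the cookies from the WebView.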

String cookie = CookieManager.getInstance().getCookie(url.toString()); 
if (cookie != null) connection.setRequestProperty("cookie", cookie);
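Building on the snippet above, here is a minimal plain-Java sketch of how the request from the question might be prepared. It assumes the server decides what to send based on the session cookie, and possibly the `Accept` header, which nothing in this thread confirms. The class name `PdfRequest`, the method `prepare`, and the `cookieHeader` parameter are hypothetical names used only for illustration. It also shows two things the question's code does differently: `Content-Type` on a request describes a request body (a GET has none), and `setDoOutput(true)` is meant for requests that send a body.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class PdfRequest {
    /**
     * Prepares (but does not send) a GET request for a PDF.
     * cookieHeader is a hypothetical parameter: for example the string
     * obtained from the WebView's CookieManager. It may be null.
     */
    static HttpURLConnection prepare(String url, String cookieHeader) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("GET");
            // "Accept" states which response type the client would like.
            connection.setRequestProperty("Accept", "application/pdf");
            if (cookieHeader != null) {
                // Forward the session cookie so the server sees the same session.
                connection.setRequestProperty("Cookie", cookieHeader);
            }
            // Deliberately no setDoOutput(true): no request body is sent.
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `getResponseCode()` or `getInputStream()` on the returned connection is what actually sends the request. Whether the server then answers with `application/pdf` still depends on the server.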

Source

2012-11-23 14:00:16 Predders

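Theory 1: The server is responding with an incorrect content type. If you wrote and deployed the server code yourself, check that code.

Theory 2: The URL returns an HTML page containing some JavaScript that redirects the page to the URL of the actual PDF file.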
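One way to tell the two theories apart is to look at the first bytes of what was actually downloaded, regardless of the Content-Type header. A PDF file normally begins with the ASCII signature `%PDF-`, while an HTML page will not. The sketch below is only an illustration; the class name `PdfSniffer` and the method `looksLikePdf` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class PdfSniffer {
    // ASCII signature found at the start of a typical PDF file.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes start with the signature "%PDF-". */
    static boolean looksLikePdf(byte[] body) {
        if (body == null || body.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (body[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If the check passes on the downloaded bytes, the body is most likely a real PDF and only the header is wrong (Theory 1). If it fails and the bytes start with markup, the server returned a page instead (Theory 2).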

Source

2011-03-10 08:18:50 Nishan

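The URL I am trying to open has some inline PDF rendering, which shows the PDF embedded in a web page. Do you think that could be the problem? When I use Firefox, the PDF is rendered inside the web page, but when I open this link in Chrome, it downloads the file. So, is there anything I can do to get the PDF directly as binary instead of receiving 'text/html', or does a change need to be made on the server side? I have not deployed the server code. – iabhi 2011-03-10 14:39:02

@al-sutton @nishan I have checked with Firebug, and it shows it as an application/pdf object. So, do I need to change something to access the PDF embedded in the web page? – iabhi 2011-03-11 05:50:16

Also, I am able to download the PDF with its exact file size, but as 'text/html' instead of receiving it as 'application/pdf', so it shows "Cannot open text/html file type". – iabhi 2011-03-11 05:58:02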
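On question 4 (how to learn the file name used on the server): servers that offer downloads often send a `Content-Disposition` response header such as `attachment; filename="report.pdf"`. Whether this particular server does so is an assumption; nothing in the thread confirms it. The following sketch extracts the plain `filename` parameter from such a header value. The class name `DownloadNames` and the method `fromContentDisposition` are hypothetical, and the extended `filename*=` form is not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DownloadNames {
    // Matches filename="report.pdf" (quoted) or filename=report.pdf (bare token).
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the file name from a Content-Disposition value, or null if none is present. */
    static String fromContentDisposition(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(headerValue);
        if (!m.find()) {
            return null;
        }
        // Group 1 is the quoted form, group 2 the bare token.
        return m.group(1) != null ? m.group(1) : m.group(2);
    }
}
```

With the question's code, the header value could be read with `urlConnection.getHeaderField("Content-Disposition")` after the response arrives; a `null` result means the server did not send the header and the name has to come from somewhere else.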
