How can I avoid redundant double encoding in java.net.URI?

I am writing a URL-to-URI conversion, needed only for HTTP methods that take a java.net.URI as a parameter.

My implementation looks like this:

new URI(url.getProtocol(), url.getAuthority(), url.getPath(), url.getQuery(), null);

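This way it does not break on URLs that contain spaces (strictly speaking, such URLs are invalid). However, when encoding the following URL: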

http://www.****.ca/en-ca/Catalog/Gallery.aspx?ID=Mass%20Spectrometry%20[GC/MS%20and%20ICP-MS]&PID=Gas%20Chromatography%20Mass%20Spectrometry%20Consumables

it converts every %20 to %2520, which results in an invalid address.
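The double encoding can be reproduced in isolation. The sketch below is only an illustration: the host example.com, the class name and the helper names are made up for the demo. It relies on the documented behavior that the multi-argument URI constructors always quote the percent character, while the single-string form keeps existing escapes.

```java
import java.net.URI;
import java.net.URISyntaxException;

class DoubleEncodingDemo {
    // Multi-argument constructor: '%' is always quoted, so "%20" becomes "%2520".
    static String viaComponents(String path, String query) {
        try {
            return new URI("http", "example.com", path, query, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Single-string parsing: existing escapes such as "%20" are left untouched.
    static String viaString(String s) {
        return URI.create(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaComponents("/a", "ID=Mass%20Spec"));
        System.out.println(viaString("http://example.com/a?ID=Mass%20Spec"));
    }
}
```

So the component constructor is safe for raw, unescaped input, but not for input that is already partly escaped.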

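Is there a way in Java to parse all kinds of URLs correctly, including those with %20 and those with literal spaces, the way a browser or the wget command does?

Source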

2015-05-16 tribbloid

This is my own solution. It works so far, but I don't know whether it will break on some other odd URI string:

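import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public static URI uri(String s) throws URISyntaxException {
    try {
        // Already a valid URI: existing escapes such as %20 are kept as they are.
        return new URI(s);
    } catch (URISyntaxException e) {
        try {
            // Absolute URL with illegal characters (e.g. spaces): rebuild it so they get quoted.
            URL url = new URL(s);
            return new URI(url.getProtocol(), url.getAuthority(), url.getPath(), url.getQuery(), null);
        } catch (MalformedURLException ee) {
            // No usable scheme: resolve against dummyURL (a base URL assumed to be defined elsewhere).
            URL url;
            try {
                url = new URL(dummyURL, s);
            } catch (MalformedURLException eee) {
                throw new RuntimeException(eee);
            }
            // This generates a relative URI when the string itself is relative.
            return new URI(null, null, url.getPath(), url.getQuery(), null);
        }
    }
}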
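One case the fallback above would still double-encode is a string that mixes literal spaces with existing %20 escapes, because the multi-argument constructor always quotes '%'. A possible alternative, shown here only as a rough sketch and not as a tested replacement, is to escape just the characters that are illegal and leave well-formed %XX sequences alone. The class name LenientUri, the method escapeIllegal and the LEGAL character set are assumptions made for this example; square brackets are escaped too, so IPv6 literal hosts are not handled.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

class LenientUri {
    // Characters passed through unchanged (hypothetical choice; '[' and ']' are escaped on purpose).
    private static final String LEGAL =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
        + "_-!.~'()*,;:$&+=?/@#";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Percent-encodes illegal characters as UTF-8 bytes; keeps valid %XX escapes untouched.
    static String escapeIllegal(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            boolean validEscape = cp == '%' && i + 2 < s.length()
                && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2));
            if (validEscape || LEGAL.indexOf(cp) >= 0) {
                out.appendCodePoint(cp);
            } else {
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Throws IllegalArgumentException if the escaped string is still not a valid URI.
    static URI parse(String s) {
        return URI.create(escapeIllegal(s));
    }

    public static void main(String[] args) {
        System.out.println(parse("http://example.com/a b?ID=Mass%20Spec [GC]"));
    }
}
```

A stray '%' that is not followed by two hex digits is escaped as %25, so input that is already fully escaped passes through unchanged.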

Source

2015-05-16 18:45:21 tribbloid
