2014-07-18

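Groovy HttpBuilder and cookies

I need to write a web client that accesses a legacy web app, logs in to it, pulls some information from the /widget page, and does some work based on that page's HTML. I have chosen a Groovy/HttpBuilder solution, for reasons outside the scope of this question.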

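The only shortcoming (as far as I can tell) is that HttpBuilder doesn't support retaining cookies between requests. This is a major problem because the (Java) web app uses JSESSIONID cookies to determine whether the user is logged in, has permissions, and so on.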

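So first, if my assertion above is incorrect and HttpBuilder does support retaining cookies across requests, please correct me; perhaps the answer here is a solution that shows me how to tap into that part of HttpBuilder. In that case, all of my code below is moot.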

Assuming I'm correct and HttpBuilder doesn't handle this, I found this excellent solution, which I can't get to work for some reason, hence my question.

My adaptation of that code (see the link above) is as follows:

TaskAutomator.groovy 
==================== 
package com.me.myapp.tasker 

import groovyx.net.http.ContentType 
import groovyx.net.http.Method 

class TaskAutomator { 
    static void main(String[] args) { 
     TaskAutomator tasker = new TaskAutomator() 
     String result = tasker.doWork("http://myapp.example.com") 

     println result 
    } 

    String doWork(String baseUrl) { 
     CookieRetainingHttpBuilder cookiedBuilder = new CookieRetainingHttpBuilder(baseUrl) 
     Map logins = [username: 'user', password: '12345'] 

     // Go to the main page where we will get back the HTML for a login screen.
     // We don't really care about the response here, so long as it's HTTP 200.
     cookiedBuilder.request(Method.GET, ContentType.HTML, "", null) 

     // Log in to the app, where, on success, we will get back the HTML for the
     // "Main Menu" screen users see when they log in. We don't really care about
     // the response here, so long as it's HTTP 200.
     cookiedBuilder.request(Method.POST, ContentType.HTML, "/auth", logins) 

     // Finally, now that our JSESSIONID cookie is authenticated, go to the widget page
     // which is what we actually care about interacting with.
     def response = cookiedBuilder.request(Method.GET, ContentType.HTML, "/widget", null) 

     // Test to make sure the response is what I think it is. 
     print response 

     String result 

     // TODO: Now actually do work based off the response. 

     result 
    } 
} 

CookieRetainingHttpBuilder 
========================== 
package com.me.myapp.tasker 

import groovyx.net.http.ContentType 
import groovyx.net.http.HTTPBuilder 
import groovyx.net.http.HttpResponseDecorator 
import groovyx.net.http.Method 

class CookieRetainingHttpBuilder { 
    private String baseUrl 
    private HTTPBuilder httpBuilder 
    private List<String> cookies 

    CookieRetainingHttpBuilder(String baseUrl) { 
     this.baseUrl = baseUrl 
     this.httpBuilder = initializeHttpBuilder() 
     this.cookies = [] 
    } 

    public def request(Method method, ContentType contentType, String url, Map<String, Serializable> params) { 
     httpBuilder.request(method, contentType) { request -> 
      uri.path = url 
      uri.query = params 
      headers['Cookie'] = cookies.join(';') 
     } 
    } 

    private HTTPBuilder initializeHttpBuilder() { 
     def httpBuilder = new HTTPBuilder(baseUrl) 

     httpBuilder.handler.success = { HttpResponseDecorator resp, reader -> 
      resp.getHeaders('Set-Cookie').each { 
       String cookie = it.value.split(';')[0] 
       cookies.add(cookie) 
      } 

      reader 
     } 

     httpBuilder 
    } 
} 

When I run this code, I get the following stack trace (I've culled the uninteresting parts, since it's pretty large):

Exception in thread "main" groovyx.net.http.HttpResponseException: Not Found 
    at groovyx.net.http.HTTPBuilder.defaultFailureHandler(HTTPBuilder.java:642) 
     ... (lines omitted for brevity) 
    at groovyx.net.http.HTTPBuilder$1.handleResponse(HTTPBuilder.java:494) 
     ... (lines omitted for brevity) 
    at groovyx.net.http.HTTPBuilder.doRequest(HTTPBuilder.java:506) 
    at groovyx.net.http.HTTPBuilder.doRequest(HTTPBuilder.java:425) 
    at groovyx.net.http.HTTPBuilder.request(HTTPBuilder.java:374) 
    at groovyx.net.http.HTTPBuilder$request.call(Unknown Source) 
    at com.me.myapp.tasker.CookieRetainingHttpBuilder.request(CookieRetainingHttpBuilder.groovy:20) 
     ... (lines omitted for brevity) 
    at com.me.myapp.tasker.TaskAutomator.doWork(TaskAutomator.groovy:23) 
     ... (lines omitted for brevity) 
    at com.me.myapp.tasker.TaskAutomator.main(TaskAutomator.groovy:13) 

CookieRetainingHttpBuilder:20 is this line from request:

httpBuilder.request(method, contentType) { request -> 

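Can anyone see why I'm getting this? Additionally, I wanted to confirm my approach/strategy in the TaskAutomator#doWork(...) method. Is my use of CookieRetainingHttpBuilder "correct", in the sense that I am:

  1. Going to the main/login page
  2. POSTing the login creds and logging in
  3. Going to the widget page

Or is there a different way to use HttpBuilder that is better/more efficient here (keeping in mind that CookieRetainingHttpBuilder is, after all, just a wrapper around HttpBuilder)?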
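
For reference, the cookie bookkeeping that CookieRetainingHttpBuilder attempts can be sketched in isolation, without any HTTP calls. This is only an illustration: the sample header values and the names cookiePair, jar and cookieHeader are made up and are not part of HttpBuilder. It also keys cookies by name, which is an assumption about the desired behavior; a plain list, as in the code above, would keep sending an old JSESSIONID alongside a re-issued one.

```groovy
// Hypothetical Set-Cookie values, as a server might send them
List<String> setCookieValues = [
    'JSESSIONID=abc123; Path=/; HttpOnly',
    'theme=dark; Path=/'
]

// Keep only the leading name=value pair of a Set-Cookie value
def cookiePair = { String setCookie -> setCookie.split(';')[0].trim() }

// Store pairs by cookie name so a re-issued cookie replaces the old value
Map<String, String> jar = [:]
setCookieValues.each { String value ->
    String pair = cookiePair(value)
    jar[pair.split('=', 2)[0]] = pair
}

// A later Set-Cookie for the same name (e.g. after login) overwrites the stored pair
String renewed = cookiePair('JSESSIONID=def456; Path=/')
jar[renewed.split('=', 2)[0]] = renewed

// Value to send back in the Cookie request header
String cookieHeader = jar.values().join('; ')
println cookieHeader
```

The joined string is what would go into the Cookie header of the next request.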

Based on the code at https://github.com/jgritman/httpbuilder/blob/master/src/main/java/groovyx/net/http/HTTPBuilder.java, your HTTP request failed in some way. I'm not sure what exactly was "Not Found", possibly the host. You can catch https://github.com/jgritman/httpbuilder/blob/master/src/main/java/groovyx/net/http/HttpResponseException.java and inspect what it carries. – Vartlok

Even though you have decided to use HttpBuilder, I suggest having a look at [Fluent HttpClient](http://hc.apache.org/httpcomponents-client-ga/tutorial/html/fluent.html) - for cookie usage, see http://java.dzone.com/tips/fluency-and-control-httpclient – ChrLipp

Answers

I believe the error may be due to a missing import, or possibly an old version of HttpBuilder. Looking at HttpBuilder.class, I see this, which informs my suggestion:

protected java.lang.Object parseResponse(org.apache.http.HttpResponse resp, java.lang.Object contentType) throws groovyx.net.http.HttpResponseException { /* compiled code */ } 

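I'm fairly sure you can pass the cookie along via headers.'Cookie' in your httpBuilder setup (Set-Cookie is the header the server sends in its response; the client returns the value in the Cookie request header). The syntax differs from what you have, but the change is small and simple, and this is the basic approach I use when working with HttpBuilder.

@Grab(group = 'org.codehaus.groovy.modules.http-builder', module = 'http-builder', version = '0.7')
import groovyx.net.http.HTTPBuilder
import org.apache.http.HttpException
import static groovyx.net.http.ContentType.TEXT
import static groovyx.net.http.Method.GET

// Placeholder values; replace with your own
def urlToHit = 'http://myapp.example.com'
def userAgent = 'Mozilla/5.0'
def myCookie = 'JSESSIONID=placeholder'
def html

def http = new HTTPBuilder(urlToHit)
http.request(urlToHit, GET, TEXT) { req ->

    headers.'User-Agent' = "${userAgent}"
    headers.'Cookie' = "${myCookie}"

    response.success = { resp, reader ->
     html = reader.getText()
    }

    response.failure = { resp, reader ->
     System.err.println "Failure response: ${resp.status}"
     throw new HttpException()
    }
}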

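Another thing to note is that you have no failure handling. I don't know whether that is what raises the exception, but it may be worth looking into.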

EDIT: As suggested, I've merged my answers (thanks for letting me know... I wasn't sure what the proper etiquette was).

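Here is what I came up with. I reused as much of the code you posted as I could and commented it as best I can. Let me know if you have any questions.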

@Grab(group = 'org.codehaus.groovy.modules.http-builder', module = 'http-builder', version = '0.7') 
import static groovyx.net.http.ContentType.HTML 
import static groovyx.net.http.Method.POST 
import static groovyx.net.http.Method.GET 
import groovyx.net.http.ContentType 
import groovyx.net.http.HTTPBuilder 
import groovyx.net.http.URIBuilder 
import groovyx.net.http.Method 
import org.apache.http.HttpException 

/** 
* This class defines the methods used for getting and using cookies 
* @param baseUrl The URL we will use to make HTTP requests. In this example, it is https://www.pinterest.com 
*/ 

class CookieRetainingHttpBuilder {

    String baseUrl

    // Kept as a field so cookies collected by one request are still there for the next one
    List<String> cookies = []

    /**
    * This method makes an http request and adds cookies to the array list for later use
    * @param method The method used to make the http request. In this example, we use GET and POST
    * @param contentType The content type we are requesting. In this example, we are getting HTML
    * @param url The URI path for the appropriate page. For example, /login/ is for the login page
    * @param params The URI query used for setting parameters. In this example, we are using login credentials
    */

    public request (Method method, ContentType contentType, String url, Map<String, Serializable> params) {

     def http = new HTTPBuilder(baseUrl)

     http.request(method, contentType) { req ->

      // Set path and query on the request's own URI; a separate URIBuilder would never be applied
      uri.path = url
      if (params) {
       uri.query = params
      }

      headers.'Accept' = HTML
      headers.'User-Agent' = "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2049.0 Safari/537.36"
      // Cookies go back to the server in the Cookie request header; Set-Cookie is a response header
      if (cookies) {
       headers.'Cookie' = cookies.join("; ")
      }

      response.success = { resp, reader ->

       resp.getHeaders('Set-Cookie').each {
        // Keep only the leading name=value pair, dropping attributes such as Path
        def cookie = it.value.split(";")[0]
        cookies.add(cookie)
       }

       return reader

      }

      response.failure = { resp, reader ->
       System.err.println "Failure response: ${resp.status}"
       throw new HttpException()
      }

     }

    }

}

/** 
* This class contains the method to make HTTP requests in the proper sequence 
* @param base The base URL 
* @param user The username of the site being logged in to 
* @param pass The password for the username 
*/ 

class TaskAutomator { 

    private static String base = "http://myapp.example.com" 
    private static String user = "thisIsMyUser" 
    private static String pass = "thisIsMyPassword" 

    /** 
    * This method contains the functions in proper order to set cookies and login to a site 
    * @return response Returns the HTML from the final GET request 
    */ 

    static String doWork() { 

     CookieHandler.setDefault(new CookieManager()); 

     CookieRetainingHttpBuilder cookiedBuilder = new CookieRetainingHttpBuilder(baseUrl: base) 
     Map logins = [username: user, password: pass] 

     // Go to the main page where we will get back the HTML for a login screen.
     // We don't really care about the response here, so long as it's HTTP 200.
     cookiedBuilder.request(GET, HTML, "", null) 

     // Log in to the app, where, on success, we will get back the HTML for the
     // "Main Menu" screen users see when they log in. We don't really care about
     // the response here, so long as it's HTTP 200.
     cookiedBuilder.request(POST, HTML, "/login/", logins) 

     // Finally, now that our JSESSIONID cookie is authenticated, go to the widget page
     // which is what we actually care about interacting with.
     def response = cookiedBuilder.request(GET, HTML, "/", null) 

     // Test to make sure the response is what I think it is. 
     return response 

     // TODO: Now actually do work based off the response. 

    } 

} 

TaskAutomator tasker = new TaskAutomator() 
String result = tasker.doWork() 
println result 
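
A note on the CookieHandler.setDefault(new CookieManager()) line in doWork(): as far as I know, java.net.CookieManager is the cookie store consulted by java.net.HttpURLConnection, whereas HTTPBuilder sends its requests through Apache HttpClient, so installing it as the default handler may have no effect on the requests above. What CookieManager itself does with a Set-Cookie header can be sketched with the JDK alone; the URLs and the cookie value here are hypothetical:

```groovy
// Hypothetical URLs and cookie value, for illustration only
URI loginPage = new URI('http://myapp.example.com/')
URI widgetPage = new URI('http://myapp.example.com/widget')

CookieManager manager = new CookieManager()

// Simulate the response headers that came back from the first request
manager.put(loginPage, ['Set-Cookie': ['JSESSIONID=abc123; Path=/; HttpOnly']])

// Ask which cookie headers should accompany a later request to the same host
Map<String, List<String>> requestHeaders = manager.get(widgetPage, [:])
String cookieHeader = requestHeaders['Cookie'].join('; ')
println cookieHeader
```

The manager only stores and matches cookies; something still has to copy the returned Cookie value onto the outgoing request, which is the job CookieRetainingHttpBuilder does by hand.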

I just realized I missed an apostrophe on the @Grab, so it should be @Grab(group = 'org.codehaus.groovy.modules.http-builder', module = 'http-builder', version = '0.7') – paranoid

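Thanks @paranoid (+1) - I'm very interested in your solution; any chance you could update it to show its usage (specifically for my use case)? How would I use it to: (1) hit a login page, (2) save the JSESSIONID from that page, (3) log in, and (4) go to another (authenticated) page on the same site, using the cookie? Thanks again! – IAmYourFaja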