2014-02-25 62 views
0

Unable to log in to WSJ using HtmlUnit/HttpClient

I am a paid WSJ subscriber. I want to log in to WSJ with HtmlUnit, but I can't get it to work. Here is my code:

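    import com.gargoylesoftware.htmlunit.BrowserVersion;
    import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.*;

    // Browser-like client: JavaScript on, CSS off, redirects and cookies enabled.
    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setCssEnabled(false);
    webClient.getOptions().setRedirectEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    webClient.getCookieManager().setCookiesEnabled(true);

    // Load the standalone login page and take its first form.
    // getPage is an instance method, so it is called on webClient, not on the WebClient class.
    final HtmlPage page1 = webClient.getPage("https://id.wsj.com/access/50f57264bd7fb2d2f6629af6/latest/login_standalone.html");
    final HtmlForm form = page1.getForms().get(0);

    // Fill in the credentials.
    final HtmlTextInput textField = form.getInputByName("username");
    final HtmlPasswordInput pwd = form.getInputByName("password");
    textField.setValueAttribute("xxxxx");
    pwd.setValueAttribute("xxxx");

    // Click the "Log In" button; page2 is whatever page the click leads to.
    final HtmlSubmitInput button = (HtmlSubmitInput) form.getInputsByValue("Log In").get(0);
    final HtmlPage page2 = button.click();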

I don't know what I'm missing. Earlier I tried Apache HttpClient, but had no success with that either.

The HttpClient code:

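    import java.io.FileWriter;
    import java.io.StringWriter;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.commons.io.IOUtils;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.CookieStore;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.client.protocol.ClientContext;
    import org.apache.http.impl.client.BasicCookieStore;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClientBuilder;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.BasicHttpContext;
    import org.apache.http.protocol.HttpContext;

    // One cookie store, shared through the context, so both requests see the same cookies.
    CloseableHttpClient httpclient = HttpClientBuilder.create().build();
    CookieStore cookieStore = new BasicCookieStore();
    HttpContext httpContext = new BasicHttpContext();
    httpContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);

    // Login request (a POST; the variable was misleadingly named httpGet).
    HttpPost httpPost = new HttpPost("https://id.wsj.com/access/50f57264bd7fb2d2f6629af6/latest/login_standalone.html");
    // NOTE: the body set below is URL-encoded form data, so this JSON content type does not match it.
    httpPost.setHeader("Content-type", "application/json");
    httpPost.setHeader("Accept-Encoding", "gzip, deflate");
    httpPost.setHeader("Host", "id.wsj.com");
    httpPost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0");
    httpPost.setHeader("X-HTTP-Method-Override", "POST");
    httpPost.setHeader("X-Requested-With", "XMLHttpRequest");

    List<NameValuePair> urlParameters = new ArrayList<NameValuePair>();
    // NOTE: this value is already percent-encoded; UrlEncodedFormEntity will encode it a second time.
    urlParameters.add(new BasicNameValuePair("landing_page", "http%3A%2F%2Findia.wsj.com%2F"));
    urlParameters.add(new BasicNameValuePair("realm", "default"));
    urlParameters.add(new BasicNameValuePair("template", "default"));
    urlParameters.add(new BasicNameValuePair("username", "xxxx"));
    urlParameters.add(new BasicNameValuePair("password", "xxxx"));
    urlParameters.add(new BasicNameValuePair("savelogin", "true"));
    httpPost.setEntity(new UrlEncodedFormEntity(urlParameters));

    HttpResponse response1 = httpclient.execute(httpPost, httpContext);
    System.out.println(response1.getStatusLine().getStatusCode());

    // Fetch a subscriber-only article with the same context and save the HTML to disk.
    HttpGet getRequest = new HttpGet("http://online.wsj.com/news/articles/SB10001424052702304834704579404391984581058?mod=WSJ_LatestHeadlines&mg=reno64-wsj");
    response1 = httpclient.execute(getRequest, httpContext);
    StringWriter writer = new StringWriter();
    IOUtils.copy(response1.getEntity().getContent(), writer, "UTF-8");
    String theString = writer.toString();
    FileWriter fileWriter = new FileWriter("C:/Users/xxxsx/Desktop/xx.html");
    fileWriter.write(theString);
    fileWriter.close();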
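
One detail in the HttpClient attempt that may be worth checking is how the form values are encoded. The `landing_page` value is passed in already percent-encoded, and `UrlEncodedFormEntity` percent-encodes values itself, so the server would probably receive a doubly encoded URL. The sketch below reproduces that effect using only `java.net.URLEncoder`; `FormEncodingSketch` and `encodeForm` are hypothetical names made up for this illustration and are not part of HttpClient. Whether this is what actually blocks the WSJ login is an assumption, not something confirmed here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only: builds an
// application/x-www-form-urlencoded body from parameter values.
final class FormEncodingSketch {

    // Percent-encode one key or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Join the encoded pairs as key=value&key=value, in insertion order.
    static String encodeForm(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Raw value: encoded exactly once.
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("landing_page", "http://india.wsj.com/");
        System.out.println(encodeForm(raw));
        // prints: landing_page=http%3A%2F%2Findia.wsj.com%2F

        // Pre-encoded value, as in the question: every '%' becomes "%25".
        Map<String, String> preEncoded = new LinkedHashMap<>();
        preEncoded.put("landing_page", "http%3A%2F%2Findia.wsj.com%2F");
        System.out.println(encodeForm(preEncoded));
        // prints: landing_page=http%253A%252F%252Findia.wsj.com%252F
    }
}
```

If the server expects a form post, passing the raw URL (`http://india.wsj.com/`) and letting the entity supply its own `application/x-www-form-urlencoded` content type, rather than forcing `application/json`, would keep the header and body consistent.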

Can anyone help, please?

Guys, I finally managed to log in using Selenium!

+0

Did you get any exception when using HtmlUnit? Or can you paste the HTML for the username field, the password field, and the login button? – Kick

+1

No sir, no exception was thrown. And no, I can't make my username/password public. –

+0

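I didn't ask for your credentials; please read again, I asked for the HTML. One question: when I enter a dummy username/password and click the button, nothing happens. How does the page work? In that case it should show an invalid username/password message. – Kick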

Answers

0

Happily, I was able to log in successfully using Selenium!