2017-02-12
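PhantomJS can't render a specific page from source

I have been successfully taking raw HTML (already retrieved with another product) and having PhantomJS render the entire page from it, including running any/all JavaScript. I recently ran into a page whose JavaScript was not being rendered.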

This is how I run it...

phantomjs myscript.js > OUTPUT.txt 2>&1 

Below is a myscript.js file that illustrates the problem...

var page = require('webpage').create();
var system = require('system'); 
var address = 'http://cloud.firebrandtech.com/#!/login'; 
var rawHtml = '<!DOCTYPE html>\ 
<html>\ 
<head>\ 
    <meta charset="utf-8">\ 
<meta http-equiv="X-UA-Compatible" content="IE=edge">\ 
<meta name="viewport" content="width=device-width, initial-scale=1.0">\ 
<meta name="description" content="Web Portal for managing Cloud Products, Assets, and Distributions">\ 
<meta name="author" content="Firebrand Technologies">\ 
<title>Firebrand Cloud</title>\ 
<link rel="stylesheet" href="/widgets/css/widgets.css">\ 
<link rel="stylesheet" href="/css/portal.css">\ 
</head>\ 
<body ng-app="portal" fc-app="cloud" fc-direct="true" class="fc">\ 
    <div>\ 
     <div data-ng-if="user.isLoaded" data-ng-controller="PortalCtrl">\ 
      <div data-ng-include="getView()"></div>\ 
      <div class="container">\ 
       <div data-ui-view></div>\ 
      </div>\ 
     </div>\ 
    </div>\ 
    <script src="/widgets/js/widgets.js"></script>\ 
<script src="/js/vendor.js"></script>\ 
<script src="/js/portal.js"></script>\ 
</body>\ 
</html>'; 

page.settings.resourceTimeout = 5000; 
page.settings.loadImages = false; 
page.setContent(rawHtml, address); 
window.setTimeout(function() { 
    if(page.content.indexOf('Sign In') > -1) 
     console.log('YAY!!! Javascript Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!') 
    else 
     console.log('BOO!!! Javascript NOT Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!')  

    phantom.exit(); 
}, 5000); 

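It seems this page needs some kind of authentication/CORS to work. If PhantomJS makes an actual request (using page.open) to fetch the source, I can get it to work. However, that solution doesn't work for me. PhantomJS has to use the source as in the example above (which, as I mentioned, has been working for all other sites).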

var page = require('webpage').create();
var system = require('system');
var address = 'http://cloud.firebrandtech.com/#!/login';

page.open(address, function(status) { 
    setTimeout(function(){ 
     if(page.content.indexOf('Sign In') > -1) 
      console.log('YAY!!! Javascript Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!') 
     else 
      console.log('BOO!!! Javascript NOT Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!')  

     phantom.exit(); 
    }, 5000) 
}); 

I have already tried using flags like the ones below, but they don't seem to have any effect...

phantomjs --web-security=false --ignore-ssl-errors=true thefilebelow.js > OUTPUT.txt 2>&1 

Answers

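Finally got this working...

Because I was using another product (not PhantomJS) to retrieve the page source, I needed to persist the cookies sent back with that request. Then I had to pass those cookies to PhantomJS using addCookie, like this...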

var page = require('webpage').create();
var system = require('system'); 
var address = 'http://cloud.firebrandtech.com/#!/login'; 
var rawHtml = 'same raw html as above...'; 

//THE NEXT 3 LINES ARE WHAT CHANGED 
// Cookies captured from the initial (non-PhantomJS) request; values here are placeholders.
var cookiesFromInitialRequest = [{name: 'aaa', value: 'bbb', domain: 'ccc'} /* , etc... */];
for(var i = 0; i < cookiesFromInitialRequest.length; i++) 
    phantom.addCookie(cookiesFromInitialRequest[i]);

page.settings.resourceTimeout = 5000; 
page.settings.loadImages = false; 
page.setContent(rawHtml, address); 
window.setTimeout(function() { 
    if(page.content.indexOf('Sign In') > -1) 
     console.log('YAY!!! Javascript Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!') 
    else 
     console.log('BOO!!! Javascript NOT Rendered!!!!!!!!!!!!!!!!!!!!!!!!!!')  

    phantom.exit(); 
}, 5000); 
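
If the other product only hands back raw Set-Cookie header strings, they would need converting into the object shape that phantom.addCookie expects. The following is a minimal sketch of one way to do that, not what the answer actually used. The parseSetCookie helper, the sample header values, and the default domain are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of PhantomJS): turn one raw Set-Cookie
// header value into an object with the keys phantom.addCookie() accepts.
function parseSetCookie(header, defaultDomain) {
    var parts = header.split(';');
    var first = parts[0];
    var eq = first.indexOf('=');
    var cookie = {
        name: first.slice(0, eq).trim(),
        value: first.slice(eq + 1).trim(),
        domain: defaultDomain, // used when the header has no Domain attribute
        path: '/'
    };
    // Remaining parts are attributes such as Path, Domain, HttpOnly, Secure.
    for (var i = 1; i < parts.length; i++) {
        var attr = parts[i].trim();
        var sep = attr.indexOf('=');
        var key = (sep === -1 ? attr : attr.slice(0, sep)).toLowerCase();
        var val = sep === -1 ? '' : attr.slice(sep + 1).trim();
        if (key === 'domain' && val) cookie.domain = val;
        else if (key === 'path' && val) cookie.path = val;
        else if (key === 'httponly') cookie.httponly = true;
        else if (key === 'secure') cookie.secure = true;
    }
    return cookie;
}

// Placeholder headers standing in for whatever the initial request returned.
var sampleHeaders = [
    'sid=abc123; Path=/; HttpOnly',
    'theme=dark; Domain=.example.com; Secure'
];
var cookies = sampleHeaders.map(function (h) {
    return parseSetCookie(h, 'cloud.firebrandtech.com');
});

// Only register the cookies when actually running inside PhantomJS.
if (typeof phantom !== 'undefined') {
    cookies.forEach(function (c) { phantom.addCookie(c); });
}
```

In a real script this would run before page.setContent, so the cookies are already in place when the page's scripts start making requests.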

So... is this the answer to your question? – Vaviloff

Yes, I just can't select it as the answer until tomorrow. – sjdirect