2013-07-04 45 views

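Download a web page's source to a file with JavaScript and Grunt

I have a problem with my task. I am writing an application with GruntJS, and I need to download the source code of a web page through Grunt.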

For example, I have a page: example.com/index.html

I want to give the Grunt task a URL, like this: src: "example.com/index.html"

Then I need to end up with that source in a file, e.g. source.txt

How can I do that?


Have a look at http://nodejs.org/api/http.html#http_http_request_options_callback.


I tried using this code, but nothing happens... `grunt.log.writeln(res); http.get("http://www.google.com/index.html", function (res) { grunt.log.writeln("Got response: " + res.statusCode); }).on('error', function (e) { grunt.log.writeln("Got error: " + e.message); });` – user2365163


Are you getting an error from `http.get`? – user568109

Answers


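There are a few ways to approach this.

The first is the raw http.get from the node.js API, as mentioned in the comments. This gives you the source exactly as the server delivers it on the initial page load. The problem arises when the site relies heavily on JavaScript to build more HTML after Ajax requests, because none of that generated markup is present in the initial response.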

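The second approach is to use an actual browser engine to load the site and execute any JavaScript and further HTML building that runs on page load. The most common engine is PhantomJS, which is wrapped in a Grunt library called grunt-lib-phantomjs.

Fortunately, someone has built another layer on top of that which does almost exactly what you are asking for: https://github.com/cburgdorf/grunt-html-snapshot

Example configuration from the link above: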

grunt.initConfig({
    htmlSnapshot: {
        all: {
            options: {
                //that's the path where the snapshots should be placed
                //it's empty by default which means they will go into the directory
                //where your Gruntfile.js is placed
                snapshotPath: 'snapshots/',
                //This should be either the base path to your index.html file
                //or your base URL. Currently the task does not use its own
                //webserver. So if your site needs a webserver to be fully
                //functional configure it here.
                sitePath: 'http://localhost:8888/my-website/',
                //you can choose a prefix for your snapshots
                //by default it's 'snapshot_'
                fileNamePrefix: 'sp_',
                //by default the task waits 500ms before fetching the html.
                //this is to give the page enough time to assemble itself.
                //if your page needs more time, tweak here.
                msWaitForPages: 1000,
                //if you would rather not keep the script tags in the html snapshots
                //set `removeScripts` to true. It's false by default
                removeScripts: true,
                //here goes the list of all urls that should be fetched
                urls: [
                    '',
                    '#!/en-gb/showcase'
                ]
            }
        }
    }
});