2017-01-19 145 views

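PHP + cURL website login returns '403 Forbidden'

I have been working through many tutorials and questions here, but I still cannot work out why I get a '403 Forbidden' when logging in to a website with cURL and PHP. The login page in question is: https://science.swansea.ac.uk/intranet/accounts/login/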

The initial request works (code 200) and saves the cookies to a file. I then strip the token out of this cookie file and add it to the POST form as required.

I should also add that I am running this PHP script on a localhost WAMP server, in case that could be an issue.

I would appreciate it if anyone could point me in the right direction, as I have been working on this for a while without any result.

PHP + cURL code:

<?php 

    $base_url = 'https://science.swansea.ac.uk/intranet/accounts/login/?next=/intranet/'; 
    $login_url = 'https://science.swansea.ac.uk/intranet/accounts/login/'; 
    $user_agent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"; 
    $username = '*******'; 
    $password = '*******'; 
    $cookie = 'cookie.txt'; 

    $ch = curl_init(); 
    curl_setopt($ch, CURLOPT_URL, $base_url); 
    curl_setopt($ch, CURLOPT_USERAGENT,$user_agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION,true); 
    curl_setopt($ch, CURLOPT_AUTOREFERER, 1); 
    curl_setopt($ch, CURLOPT_HEADER, 1); 
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate, br'); 
    curl_setopt($ch, CURLOPT_COOKIEJAR, realpath($cookie)); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, realpath($cookie)); 
    curl_setopt($ch, CURLOPT_TIMEOUT,30); 
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0); 
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); 

    curl_setopt($ch, CURLOPT_VERBOSE, 1); 
    curl_setopt($ch, CURLOPT_STDERR, fopen(realpath("verbose.txt"), 'w')); 

    $resp = curl_exec($ch); 
    var_dump($resp); 


    $headers = array(
     'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 
     'Connection: keep-alive', 
     'Cache-Control: max-age=0', 
     'Origin: https://science.swansea.ac.uk', 
     'Upgrade-Insecure-Requests: 1', 
     'Referer: https://science.swansea.ac.uk/intranet/accounts/login/?next=/intranet/', 
     'Accept-Language: en-US,en;q=0.8' 
    ); 


    // Strip cookie to get token 
    $csrfmiddlewaretoken = explode('csrftoken', file_get_contents(realpath($cookie))); 
    $csrfmiddlewaretoken = trim($csrfmiddlewaretoken[1]); 
    $csrfmiddlewaretoken = substr($csrfmiddlewaretoken, 0, strpos($csrfmiddlewaretoken, "#")); 

    $post = array(
     'csrfmiddlewaretoken' => $csrfmiddlewaretoken, 
     'username' => $username, 
     'password' => $password, 
     'next' => "/intranet/" 
    ); 

    curl_setopt($ch, CURLOPT_URL, $login_url); 
    curl_setopt($ch, CURLOPT_POST, 1); 
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post)); 

    // Add headers 
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 


     //Set headers out for debug 
    // curl_setopt($ch, CURLINFO_HEADER_OUT, true); 

    $exec = curl_exec($ch); 
    echo($exec); 

    $info = curl_getinfo($ch); 
    $hinfo = curl_getinfo($ch, CURLINFO_HEADER_OUT); 

    if ($info['http_code'] != 200) { 
     echo "Login failed! HTTP code {$info['http_code']}<br>\n"; 
     var_dump($exec); 

     // Echo post params 
     $params= http_build_query($post); 
     $params = str_replace("%0D%0A", '', $params); 
     echo("$params <br>\n"); 
     echo($hinfo); 
     exit; 
    } 

    echo "Login successful!<br>\n"; 

    // you are now logged in, use $ch to request pages as the logged in user 

    $url = $base_url; 

    curl_setopt($ch, CURLOPT_URL, $url); 
    curl_setopt($ch, CURLOPT_POST, 0); 

    $account = curl_exec($ch); 

?> 

Verbose output:

* Trying 137.44.2.221... 
* Connected to science.swansea.ac.uk (137.44.2.221) port 443 (#0) 
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH 
* NPN, negotiated HTTP1.1 
* SSL connection using TLSv1.2/ECDHE-RSA-AES256-GCM-SHA384 
* Server certificate: 
* subject: C=GB; ST=West Glamorgan; L=SWANSEA; O=Swansea University; OU=College of Science; CN=science.swansea.ac.uk 
* start date: Apr 29 11:54:39 2016 GMT 
* expire date: Apr 29 11:54:36 2019 GMT 
* issuer: C=BM; O=QuoVadis Limited; CN=QuoVadis Global SSL ICA G2 
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway. 
> GET /intranet/accounts/login/?next=/intranet/ HTTP/1.1 
Host: science.swansea.ac.uk 
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36 
Accept: */* 
Accept-Encoding: gzip, deflate, br 
Cookie: csrftoken=BNmrRv29juCijlFX63mpMkzkL4pO2x67; sessionid=unanto4vhu3k4s3cz3ngyjfq5zloihjr 

< HTTP/1.1 200 OK 
< Date: Thu, 19 Jan 2017 21:24:10 GMT 
< Content-Type: text/html; charset=utf-8 
< Transfer-Encoding: chunked 
< Connection: keep-alive 
< Server: gunicorn/0.17.2 
< Last-Modified: Thu, 19 Jan 2017 21:24:10 GMT 
< Expires: Thu, 19 Jan 2017 21:24:10 GMT 
< Vary: Cookie 
< Cache-Control: max-age=0 
* Replaced cookie csrftoken="BNmrRv29juCijlFX63mpMkzkL4pO2x67" for domain science.swansea.ac.uk, path /intranet/, expire 1516310650 
< Set-Cookie: csrftoken=BNmrRv29juCijlFX63mpMkzkL4pO2x67; expires=Thu, 18-Jan-2018 21:24:10 GMT; Max-Age=31449600; Path=/intranet/; secure 
* Replaced cookie sessionid="unanto4vhu3k4s3cz3ngyjfq5zloihjr" for domain science.swansea.ac.uk, path /intranet/, expire 1485033850 
< Set-Cookie: sessionid=unanto4vhu3k4s3cz3ngyjfq5zloihjr; expires=Sat, 21-Jan-2017 21:24:10 GMT; httponly; Max-Age=172800; Path=/intranet/; secure 
< Content-Encoding: gzip 
< 
* Connection #0 to host science.swansea.ac.uk left intact 
* Found bundle for host science.swansea.ac.uk: 0x264f6c800d0 [can pipeline] 
* Re-using existing connection! (#0) with host science.swansea.ac.uk 
* Connected to science.swansea.ac.uk (137.44.2.221) port 443 (#0) 
> POST /intranet/accounts/login/ HTTP/1.1 
Host: science.swansea.ac.uk 
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36 
Accept-Encoding: gzip, deflate, br 
Cookie: csrftoken=BNmrRv29juCijlFX63mpMkzkL4pO2x67; sessionid=unanto4vhu3k4s3cz3ngyjfq5zloihjr 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 
Connection: keep-alive 
Cache-Control: max-age=0 
Origin: https://science.swansea.ac.uk 
Upgrade-Insecure-Requests: 1 
Referer: https://science.swansea.ac.uk/intranet/accounts/login/ 
Accept-Language: en-US,en;q=0.8 
Content-Length: 140 
Content-Type: application/x-www-form-urlencoded 

* upload completely sent off: 140 out of 140 bytes 
< HTTP/1.1 403 FORBIDDEN 
< Date: Thu, 19 Jan 2017 21:24:10 GMT 
< Content-Type: text/html; charset=utf-8 
< Transfer-Encoding: chunked 
< Connection: keep-alive 
< Server: gunicorn/0.17.2 
< Vary: Cookie 
* Replaced cookie sessionid="unanto4vhu3k4s3cz3ngyjfq5zloihjr" for domain science.swansea.ac.uk, path /intranet/, expire 1485033850 
< Set-Cookie: sessionid=unanto4vhu3k4s3cz3ngyjfq5zloihjr; expires=Sat, 21-Jan-2017 21:24:10 GMT; httponly; Max-Age=172800; Path=/intranet/; secure 
< Content-Encoding: gzip 
< 
* Connection #0 to host science.swansea.ac.uk left intact 

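I have never seen a curl build with br (Brotli) encoding support. Are you sure your curl is compiled with br? If not, you will run into trouble the first time the server actually decides to use br encoding.. – hanshenrik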


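Sorry for the late reply, @hanshenrik. I only added the Brotli encoding while banging my head against the wall trying to get this to work. Thanks for the tip, though! –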

Answer


Your problem is in these lines:

$csrfmiddlewaretoken = explode('csrftoken', file_get_contents(realpath($cookie))); 
$csrfmiddlewaretoken = trim($csrfmiddlewaretoken[1]); 
$csrfmiddlewaretoken = substr($csrfmiddlewaretoken, 0, strpos($csrfmiddlewaretoken, "#")); 

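If you put an echo ']'.$csrfmiddlewaretoken.'['; after the line $csrfmiddlewaretoken = substr($csrfmiddlewaretoken, 0, strpos($csrfmiddlewaretoken, "#")); you can see that there is an extra space (see the update) at the end of the $csrfmiddlewaretoken string. The token you post therefore differs from the one the server expects, which is why you get the 403 response.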

So just change the last two lines of the block above to this:

$csrfmiddlewaretoken = $csrfmiddlewaretoken[1]; 
$csrfmiddlewaretoken = trim(substr($csrfmiddlewaretoken, 0, strpos($csrfmiddlewaretoken, "#"))); 

and you will get a < HTTP/1.1 200 OK response.
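The difference between the two orderings can be reproduced offline. The following is only a sketch: the sample text is modelled on the cookie.txt shown in this thread, and it assumes the fields are tab-separated and the lines end in "\r\n", as they typically would on a Windows (WAMP) machine.

```php
<?php
// Sample cookie jar text modelled on the cookie.txt from this thread.
// Assumption: tab-separated fields and "\r\n" line endings.
$jar = "science.swansea.ac.uk\tFALSE\t/intranet/\tTRUE\t1516357443\tcsrftoken\ts5mbN2Fa5tty4UAkjjSix4cxlBLygsHg\r\n"
     . "#HttpOnly_science.swansea.ac.uk\tFALSE\t/intranet/\tTRUE\t1485080643\tsessionid\txvy7rikn6d3iv5xq0g6yisdrv00yjj0z\r\n";

$parts = explode('csrftoken', $jar);

// Original order: trim first, then cut at "#".
$broken = trim($parts[1]);
$broken = substr($broken, 0, strpos($broken, '#'));

// Fixed order: cut at "#" first, then trim the result.
$fixed = trim(substr($parts[1], 0, strpos($parts[1], '#')));

// bin2hex makes the invisible trailing bytes visible (0d0a is CR LF).
echo strlen($broken), ' ', bin2hex(substr($broken, -2)), "\n";
echo strlen($fixed), "\n";
```

With this sample the first line prints 34 0d0a and the second prints 32: the token itself is 32 characters long, and the original ordering leaves the two line-break bytes attached to it.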

Update

The extra space is actually %0D%0A, which is CR (carriage return, \r) + LF (line feed, \n), i.e. ASCII characters 13 and 10, or in simple words, Enter (a line break).

If you look at the cookie.txt file, you have

science.swansea.ac.uk FALSE /intranet/ TRUE 1516357443 csrftoken s5mbN2Fa5tty4UAkjjSix4cxlBLygsHg 
#HttpOnly_science.swansea.ac.uk FALSE /intranet/ TRUE 1485080643 sessionid xvy7rikn6d3iv5xq0g6yisdrv00yjj0z 

This means you have csrftoken + token on one line, and a # at the beginning of the next line. And because of this line:

$csrfmiddlewaretoken = substr($csrfmiddlewaretoken, 0, strpos($csrfmiddlewaretoken, "#")); 

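you keep everything before the # on the next line, which still includes the \r\n (Enter) that sits right before it. The earlier trim() ran before this cut, when the line break was still in the middle of the string, so it could not remove it. You have to remove it from the end of your string after the substr() call.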
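As an alternative to cutting the string at "#", the jar can be read line by line and split into fields. This is a hedged sketch rather than part of the original answer: find_cookie_value() is a hypothetical helper, it assumes PHP 7.1 or later, and it assumes the Netscape cookie file layout that cURL writes (seven tab-separated fields, with HttpOnly cookies prefixed by "#HttpOnly_").

```php
<?php
/**
 * Look up a cookie value by name in Netscape-format cookie jar text.
 * Hypothetical helper for illustration, not part of the original script.
 */
function find_cookie_value(string $jarText, string $name): ?string
{
    // Split on any line ending so "\r\n" never leaks into a value.
    foreach (preg_split('/\r\n|\r|\n/', $jarText) as $line) {
        if (strpos($line, '#HttpOnly_') === 0) {
            // HttpOnly cookies carry this prefix; strip it and keep the line.
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            // Skip blank lines and ordinary comment lines.
            continue;
        }
        // Fields: domain, subdomain flag, path, secure, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) === 7 && $fields[5] === $name) {
            return trim($fields[6]);
        }
    }
    return null; // No cookie with that name in the jar.
}

// Sample text modelled on the cookie.txt shown above (tabs and CR LF assumed).
$jar = "# Netscape HTTP Cookie File\r\n"
     . "science.swansea.ac.uk\tFALSE\t/intranet/\tTRUE\t1516357443\tcsrftoken\ts5mbN2Fa5tty4UAkjjSix4cxlBLygsHg\r\n"
     . "#HttpOnly_science.swansea.ac.uk\tFALSE\t/intranet/\tTRUE\t1485080643\tsessionid\txvy7rikn6d3iv5xq0g6yisdrv00yjj0z\r\n";

var_dump(find_cookie_value($jar, 'csrftoken'));
```

In the script from the question, this would be called as find_cookie_value(file_get_contents($cookie), 'csrftoken'). Matching on the name field also avoids depending on the order in which the cookies happen to be written to the file.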


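Thank you very much! This is something I have been stuck on for a long time. echo($params) printed "csrfmiddlewaretoken=BNmrRv29juCijlFX63mpMkzkL4pO2x67&username=", which led me to believe my token was fine (no visible whitespace). Is there a reason for that? Just so I know for future reference. –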


@TheEnglishMan_ :) You're welcome. I just added an update to my answer that explains more about the origin of your problem... – EhsanT


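I can see that you have this line in your code: '$params = str_replace("%0D%0A", '', $params);' and that you tried to remove these "%0D%0A" from the string, but you do it after executing this line: 'curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));'. So technically these two characters are still in what you send to the 'science.swansea.ac.uk' server, and your debug echo only shows the cleaned copy. If you had done the replacement before that line, you would not have faced this problem – EhsanT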
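The point about ordering can be checked without any network access. The sketch below assumes a token that still carries the trailing CR LF, and 'user' is only a placeholder value. It compares the body http_build_query() produces from the raw value with the one produced after cleaning the value first.

```php
<?php
// Token with the trailing CR LF that the original extraction left behind.
$token = "BNmrRv29juCijlFX63mpMkzkL4pO2x67\r\n";

// http_build_query percent-encodes the line break as %0D%0A,
// so this is what would be sent if the value is not cleaned first.
$dirty = http_build_query(['csrfmiddlewaretoken' => $token, 'username' => 'user']);

// Cleaning the value before building the body keeps it out of the request.
$clean = http_build_query(['csrfmiddlewaretoken' => trim($token), 'username' => 'user']);

echo $dirty, "\n";
echo $clean, "\n";
```

The first line printed contains %0D%0A right after the token and the second does not. Whichever string is passed to CURLOPT_POSTFIELDS is what the server receives, so any cleanup has to happen before that call.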