2015-09-12 21 views
1

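How to send multiple POST requests simultaneously in PHP via cURL? (more than 2000 requests)

I need to send a large number of HTTP POST requests (over 2500) and collect the results via curl (from PHP or the command line), but I don't know the best way to send them.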

Each request I send must also include a "javax.faces.ViewState" parameter of roughly 150,000 characters (in case that matters).

Here is some sample code to show how it works:

<?php 
// I need to send specific data via POST;
// the data is a combination of an internal id and a year (from 2005 to the current year, 2015)
$ids   = array("988","992","6","989","993","14","26","948","949","950","951","952","27","34","386","953","954","955","956","47","51","53","927","928","929","930","931","933","932","88","94","103","660","1045","1046","1047","1049","1050","1051","1052","1053","1054","1055","1056","1057","1058","1059","1060","1061","1062","1048","114","119","1063","1064","1065","1067","1068","1069","1070","1097","1151","1150","1071","127","132","140","957","959","960","961","962","963","964","965","966","967","968","958","150","151","152","1072","1073","1074","1093","157","158","159","160","188","189","190","195","385","1075","1076","1077","1078","1079","1080","1081","1082","1083","1094","193","1152","1325","1326","206","207","209","214","216","934","935","936","937","938","939","940","941","942","943","944","946","947","223","225","226","227","1084","1085","1086","1087","1088","1095","1251","240","241","244","245","659","662","1089","1090","1091","1092","1096","1328","1013","248","249","250","990","994","996","257","258","991","995","1220","1221","1222","1223","1224","1225","1226","1227","1228","1232","1233","1235","1244","1245","1246","1247","1248","1250","1321","1229","1230","1231","1234","1236","1237","1238","1239","1240","1249","1320","1322","1323","1355"); 
$startYear  = 2005; 
$currentYear = intval(date("Y")); 
// this is the "javax.faces.ViewState" param, between 50,000 and 150,000 characters long
$javaxFacesViewState = "H4sIAAAAAAAAAOy9CWAcSXkv3josWT5l+dxb7OXdxZYszSFpvSyMRrI09ujYmZGMvYBozbQ0bc1Mj3t6dJh/uF5CDkKAAAkECBAWSAjkgBCW5WaBQMJNSEhCwhneIy8hLwTCJnmw//qqr+qjprur24t3nw3b6qnjV1Vffd9XX3119Dv+hdvWkLm7JHl1gL/Ab8YvNFYG+Hq9IhZ5RZRqAwVZEPKK3CwqTVmYlUrCC1/6r69+eOWW7bs4brN+ieM6Oe4WS+6iVK1LNaGmDKRQ0KIobOQkSeF6..... ... ..."; 

// I have only one server, so I plan to use a proxy list of over 50 IPs
$proxyList  = array(
    "xxx.xx.x.x:8080", 
    "xxx.xx.x.x:2353", 
    "xxx.xx.x.x:80", 
    "xxx.xx.x.x:434", 
    //... 
    //... 
); 
echo "<ul>"; 
$index = 1; 
for ($i = 0; $i < count($ids); $i++) { 
    echo "<li>"; 
     echo "<strong>ID: <em>".$ids[$i]."</em></strong>"; 
     echo "<ol>"; 
     for ($y = $startYear; $y <= $currentYear; $y++) { 
      echo "<li value='$index'>Year: $y; curl command:<br /><pre><code>curl --proxy ".$proxyList[array_rand($proxyList)]." http://example.com/information.jsp --data \"id=".$ids[$i]."&year=$y&javax.faces.ViewState=$javaxFacesViewState...\"</code></pre></li>"; 
      $index++; 
     } 
     echo "</ol>"; 
    echo "</li>"; 
} 
echo "</ul>"; 
echo "<h1>Total requests: ".number_format($index,0)."</h1>"; 
?> 

The output looks like this:

  • ID: 988
    1. Year: 2005; curl command: curl --proxy xxx.xx.x.x:455 http://example.com/information.jsp --data "id=12&year=2005&..."
    2. Year: 2005; curl command: curl --proxy xxx.xx.x.x:80 http://example.com/information.jsp --data "id=23&year=2005&..."
    3. Year: 2005; curl command: curl --proxy xxx.xx.x.x:8080 http://example.com/information.jsp --data "id=4556&year=2005&..."
    4. Year: 2005; curl command: curl --proxy xxx.xx.x.x:235 http://example.com/information.jsp --data "id=34&year=2005&..."
    5. ...
    6. ...

Total requests: 2135

So I need to send many POST requests in the shortest possible time. What is the best way to do that?
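For reference, the request set described above is the cross product of ids and years. A minimal sketch of building one URL-encoded POST body per pair is shown here; `build_post_bodies()` is a hypothetical helper name and the short ViewState string is a made-up placeholder, neither comes from the original code.

```php
<?php
// Hypothetical helper (not from the original post): builds one POST body
// per (id, year) pair so the requests can later be sent in batches.
function build_post_bodies(array $ids, $startYear, $endYear, $viewState)
{
    $bodies = array();
    foreach ($ids as $id) {
        for ($year = $startYear; $year <= $endYear; $year++) {
            // http_build_query() URL-encodes the values, which matters for
            // a base64-like ViewState string containing "+", "/" or "=".
            $bodies[] = http_build_query(array(
                'id'                    => $id,
                'year'                  => $year,
                'javax.faces.ViewState' => $viewState,
            ), '', '&');
        }
    }
    return $bodies;
}

// Small made-up sample: 3 ids x 3 years = 9 request bodies.
$bodies = build_post_bodies(array("988", "992", "6"), 2005, 2007, "H4sI+/=");
echo count($bodies), " request bodies\n";
echo $bodies[0], "\n";
```

Building the bodies up front makes it easy to split the full list into batches with `array_chunk()`, rather than batching by id alone.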

My server is a (MT) - DV LVL 1:

  • 2 GB RAM
  • 2 TB bandwidth
  • CentOS 6

less /proc/cpuinfo

[root ~]# less /proc/cpuinfo 
processor  : 0 
vendor_id  : GenuineIntel 
cpu family  : 6 
model   : 62 
model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 
stepping  : 4 
microcode  : 1064 
cpu MHz   : 2094.833 
cache size  : 15360 KB 
physical id  : 0 
siblings  : 12 
core id   : 0 
cpu cores  : 6 
apicid   : 0 
initial apicid : 0 
fpu    : yes 
fpu_exception : yes 
cpuid level  : 13 
wp    : yes 
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf cpuid_faulting pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt 
bogomips  : 4189.66 
clflush size : 64 
cache_alignment : 64 
address sizes : 46 bits physical, 48 bits virtual 
power management: 

processor  : 1 
vendor_id  : GenuineIntel 
cpu family  : 6 
model   : 62 
model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 
stepping  : 4 
  • PHP: 5.4.13
  • curl: 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
  • Apache: 2.2.15 (Unix)
  • iptables: 1.4.7

Thanks

Answers

1

Use curl multi requests. The idea:

$ids   = array("988","992","6","989","993","14","26","948","949","950","951","952","27","34","386","953","954","955","956","47","51","53","927","928","929","930","931","933","932","88","94","103","660","1045","1046","1047","1049","1050","1051","1052","1053","1054","1055","1056","1057","1058","1059","1060","1061","1062","1048","114","119","1063","1064","1065","1067","1068","1069","1070","1097","1151","1150","1071","127","132","140","957","959","960","961","962","963","964","965","966","967","968","958","150","151","152","1072","1073","1074","1093","157","158","159","160","188","189","190","195","385","1075","1076","1077","1078","1079","1080","1081","1082","1083","1094","193","1152","1325","1326","206","207","209","214","216","934","935","936","937","938","939","940","941","942","943","944","946","947","223","225","226","227","1084","1085","1086","1087","1088","1095","1251","240","241","244","245","659","662","1089","1090","1091","1092","1096","1328","1013","248","249","250","990","994","996","257","258","991","995","1220","1221","1222","1223","1224","1225","1226","1227","1228","1232","1233","1235","1244","1245","1246","1247","1248","1250","1321","1229","1230","1231","1234","1236","1237","1238","1239","1240","1249","1320","1322","1323","1355"); 

// this is the "javax.faces.ViewState" param, between 50,000 and 150,000 characters long
$javaxFacesViewState = "H4sIAAAAAAAAAOy9CWAcSXkv3josWT5l+dxb7OXdxZYszSFpvSyMRrI09ujYmZGMvYBozbQ0bc1Mj3t6dJh/uF5CDkKAAAkECBAWSAjkgBCW5WaBQMJNSEhCwhneIy8hLwTCJnmw//qqr+qjprur24t3nw3b6qnjV1Vffd9XX3119Dv+hdvWkLm7JHl1gL/Ab8YvNFYG+Hq9IhZ5RZRqAwVZEPKK3CwqTVmYlUrCC1/6r69+eOWW7bs4brN+ieM6Oe4WS+6iVK1LNaGmDKRQ0KIobOQkSeF6..... ... ..."; 

// I have only one server, so I plan to use a proxy list of over 50 IPs
$proxyList  = array(
    "xxx.xx.x.x:8080", 
    "xxx.xx.x.x:2353", 
    "xxx.xx.x.x:80", 
    "xxx.xx.x.x:434", 
    //... 
    //... 
); 

// Processing... 

$total_requests = 0; 

$step_urls_count = 10; // 20... 
$curl_batch_urls = array_chunk($ids, $step_urls_count); 

// send the requests in batches of 10 with curl_multi_exec (e.g. ids 1..10) 
foreach ($curl_batch_urls as $batch) { 

    $master = curl_multi_init(); 

    // common curl options (the keys must be the CURLOPT_* constants, not strings) 
    $curl_options = array(
        CURLOPT_URL            => "http://example.com/information.jsp", 
        CURLOPT_RETURNTRANSFER => true, 
        CURLOPT_POST           => true, 
        /* ..... 
         other curl options 
         ..... 
        */ 
    ); 

    // generate curl instances 
    foreach ($batch as $url_id) { 
        $ch = curl_init(); 
        $options = $curl_options; 

        // unique POST body for each ID (add the year parameter here as well) 
        $options[CURLOPT_POSTFIELDS] = http_build_query(array(
            "id"                    => $url_id, 
            "javax.faces.ViewState" => $javaxFacesViewState, 
        ), '', '&'); 

        // random proxy 
        $rand_key = array_rand($proxyList); 
        $options[CURLOPT_PROXY] = $proxyList[$rand_key]; 

        curl_setopt_array($ch, $options); 
        curl_multi_add_handle($master, $ch); 
        $total_requests++; 
    } 

    $running = null; 
    do { 

        // performing curl-handle batch 
        while (($execrun = curl_multi_exec($master, $running)) == CURLM_CALL_MULTI_PERFORM); 
        if ($execrun != CURLM_OK) 
            break; 

        // checking each finished response 
        while ($response = curl_multi_info_read($master)) { 
            $handle = $response['handle']; 
            $info = curl_getinfo($handle); 
            if ($response['result'] == CURLE_OK && $info['http_code'] == 200) { 

                // $output is the response body for this request 
                $output = curl_multi_getcontent($handle); 
                var_dump($output); 
            } else { 
                // Error! 
            } 

            // release the finished handle 
            curl_multi_remove_handle($master, $handle); 
            curl_close($handle); 
        } 

        // wait for network activity instead of busy-looping 
        if ($running && curl_multi_select($master, 1.0) == -1) { 
            usleep(100000); 
        } 
    } while ($running); 

    curl_multi_close($master); 

} // go on to the next loop step (ids 11..20) 

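Note: you will need to experiment with $step_urls_count (the number of requests per loop step). Also, enlarge the proxy list (100-500 entries) and check each proxy's availability before setting the option. Hope it works for you.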
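The availability check mentioned in the note could look roughly like this. It is only a sketch: `proxy_is_reachable()` and `filter_proxies()` are hypothetical names, and a successful TCP connect only shows that the port is open, not that the proxy actually forwards requests. The demo at the end uses a stub checker and made-up addresses so it runs without network access.

```php
<?php
// Hypothetical helper: true if a TCP connection to "host:port" opens
// within $timeout seconds.
function proxy_is_reachable($proxy, $timeout = 2.0)
{
    $parts = explode(':', $proxy, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[1])) {
        return false; // malformed entry
    }
    $socket = @fsockopen($parts[0], (int) $parts[1], $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Keeps only the proxies accepted by $check. The checker is injectable
// so the filter can be exercised without touching the network.
function filter_proxies(array $proxies, $check = 'proxy_is_reachable')
{
    return array_values(array_filter($proxies, $check));
}

// Demo with a stub checker: keep only the entries on port 8080.
$alive = filter_proxies(
    array("10.0.0.1:8080", "10.0.0.2:80", "10.0.0.3:8080"),
    function ($proxy) { return substr($proxy, -5) === ':8080'; }
);
print_r($alive);
```

In the real script the call would be something like `$proxyList = filter_proxies($proxyList);` before the batches start, so that `array_rand()` only picks from proxies that answered.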

+0

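Well, besides the download speed of the target server and the Internet Protocol version it ultimately uses (IPv4 in my case), this script also requires adjusting PHP's 'memory_limit' and 'max_execution_time' settings. All of this helps improve the code and get a better average number of queries per second, depending on the factors above.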
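A minimal sketch of raising those two settings at the top of the script follows. The values are assumptions for illustration, not recommendations; they depend on how many large responses are held in memory at once and on how long the whole run takes.

```php
<?php
// Assumed values for illustration; tune them to the actual workload.
ini_set('memory_limit', '512M');      // room for many large responses at once
ini_set('max_execution_time', '0');   // 0 removes the execution time limit

echo ini_get('memory_limit'), "\n";
echo ini_get('max_execution_time'), "\n";
```

The same settings can also be changed in php.ini if the host allows it.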