搜索最常见OCCURENCES

我有一个文本日志文件，包含由分隔数据的线“|”搜索最常见OCCURENCES

例如

date | time | ip | geo-location (city) | page viewed ......

我需要找到10个最发生的历史“页面视图”在文本文件中....

页面视图的每个日志被列为：

//pageurl

为TH Ë日志是单独的行我假设我会

// [url name] \r\n

之间搜索的网页网址如何将我的代码搜索，列出前10个网址，并列出他们到一个数组....

例如：

$url[0] <<this would be the most occuring url 
$url[1] <<thos would be the second most occuring url

等等.....直到我可以列出它们备份到：

$url[9] <<which would be the 10th most common url

我不肯定我怎么会在“//”和“\ r \ n”个之间

在此先感谢您的帮助搜索

然后转换前10位最常见的OCCURENCES到一个数组....：）

编辑：这里是我的日志的2倍线，只是为了帮助更多的，如果我能

sunday, january 22, 2012 | 16:14:36 | 82.**.***.*** | bolton | //error 
sunday, january 22, 2012 | 17:12:52 | 82.**.***.*** | bolton | //videos

感谢

来源

2012-01-22 DJ-P.I.M.P

你想要什么语言或工具OMN什么平台解决工作？或者你只想伪代码 – rene

我使用的是Windows下的Apache服务器上的PHP编码，感谢 –

$数据=“$时间| $ IP | $城市| $定位”。 “$结束”; <<<<<这是我使用以将数据写入到文本文件中的代码...... $端=“\ r \ n”个; <<<<<即变量$结束表示写入新的生产线，也许这将定义搜索的终点帮助.....我认为这是\ n，而是忘了，我不得不改变它用\ r \ n所以它实际上创造新的生产线 –

根据所给出的信息，这里是一个相当原始的方法：

/* get the contents of the log file */ 
$log_file = file_get_contents(__DIR__.'/log.txt'); 

/* split the log into an array of lines */ 
$log_lines = explode(PHP_EOL, $log_file); 

/* we don't need the log file anymore, so free up some memory */ 
unset($log_file); 

/* loop through each line */ 
$page_views = array(); 
foreach ($log_lines as $line) { 
    /* get the text after the last pipe character (the page view), minus the ' //' */ 
    $page_views[] = ltrim(array_pop(explode('|', $line)), ' /'); 
} 

/* we don't need the array of lines either, so free up that memory */ 
unset($log_lines); 

/* count the frequency of each unique occurrence */ 
$urls = array_count_values($page_views); 

/* sort highest to lowest (may be redundant, I think array_count_values does this) */ 
arsort($urls, SORT_NUMERIC); 

print_r($urls); 
/* [page_url] => num page views, ... */ 

/* that gives you occurrences, but you want a numerical 
    indexed array for a top ten, so... */ 

$top_ten = array(); 
$i = 0; 
/* loop through the array, and store the keys in a new one until we have 10 of them */ 
foreach ($urls as $url => $views) { 
    if ($i >= 10) break; 
    $top_ten[] = $url; 
    $i++; 
} 

print_r($top_ten); 
/* [0] => page url, ... */

**脚本输出：**

Array 
(
    [videos] => 1 
    [error ] => 1 
) 
Array 
(
    [0] => videos 
    [1] => error 
)

这不是最优化的解决方案，以及更大的日志文件，时间越长，将采取。为此，您最好登录到数据库并从中查询。

来源

2012-01-22 17:38:46 mrlee

感谢现在试试吧:) –

我创造了这个新的PHP文件，只是加载空白页（我得到这个代码时出错）我也曾尝试回声插入“你好“;在开始时看看是否有输出，什么都没有？这段代码对我来说太复杂了，以至于错误检查lol道歉 –

自从修复之后，我发现了一个小的语法错误。你会想添加'ini_set（'display_errors'，1）; error_reporting（E_ALL）;'开始，显示确实发生的错误。 – mrlee

搜索最常见OCCURENCES

回答

相关问题