2008-11-28 37 views
2

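Best way to parse a dynamic text list in PHP

Below is a block of text from a popular online game called EVE Online; it is essentially what gets mailed to you when you kill someone in-game. I am building a tool in PHP to parse these mails and extract all the relevant information. I need every piece of information shown, and I am writing classes to break it down neatly into encapsulated data.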

2008.06.19 20:53:00 

Victim: Massi 
Corp: Cygnus Alpha Syndicate 
Alliance: NONE 
Faction: NONE 
Destroyed: Raven 
System: Jan 
Security: 0.4 
Damage Taken: 48436 

Involved parties: 

Name: Kale Kold 
Security: -10.0 
Corp: Vicious Little Killers 
Alliance: NONE 
Faction: NONE 
Ship: Drake 
Weapon: Hobgoblin II 
Damage Done: 22093 

Name: Harulth (laid the final blow) 
Security: -10.0 
Corp: Vicious Little Killers 
Alliance: NONE 
Faction: NONE 
Ship: Drake 
Weapon: Caldari Navy Scourge Heavy Missile 
Damage Done: 16687 

Name: Gistatis Tribuni/Angel Cartel 
Damage Done: 9656 

Destroyed items: 

Capacitor Power Relay II, Qty: 2 
Paradise Cruise Missile, Qty: 23 
Cataclysm Cruise Missile, Qty: 12 
Small Tractor Beam I 
Alloyed Tritanium Bar, Qty: 2 (Cargo) 
Paradise Cruise Missile, Qty: 1874 (Cargo) 
Contaminated Nanite Compound (Cargo) 
Capacitor Control Circuit I, Qty: 3 
Ballistic Deflection Field I 
'Malkuth' Cruise Launcher I, Qty: 3 
Angel Electrum Tag, Qty: 2 (Cargo) 

Dropped items: 

Ballistic Control System I 
Shield Boost Amplifier I, Qty: 2 
Charred Micro Circuit, Qty: 4 (Cargo) 
Capacitor Power Relay II, Qty: 2 
Paradise Cruise Missile, Qty: 10 
Cataclysm Cruise Missile, Qty: 21 
X-Large Shield Booster II 
Cataclysm Cruise Missile, Qty: 3220 (Cargo) 
Fried Interface Circuit (Cargo) 
F-S15 Braced Deflection Shield Matrix, Qty: 2 
Salvager I 
'Arbalest' Cruise Launcher I 
'Malkuth' Cruise Launcher I, Qty: 2 

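I was thinking of using regular expressions to parse the data, but how would you do it? Would you collapse the mail into a single-line string, or parse each line from an array? The trouble is that there are a few anomalies to account for.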

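First, the "Involved parties:" section is dynamic and can contain many people with a similar structure, but if computer-controlled enemies also shoot the victim, their entry is shortened to just the "Name" and "Damage Done" fields, as shown above (Gistatis Tribuni/Angel Cartel).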

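Second, the "Destroyed items" and "Dropped items" lists are dynamic and differ in length from mail to mail. I also need the quantity of each item and whether it was in the cargo hold.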

Ideas on an approach are welcome.

Answers

3

If you want something flexible, use the state machine approach.

If you want something quick and dirty, use regular expressions.
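As a rough illustration of the quick-and-dirty route, here is a minimal sketch that pulls the main fields out of the whole mail with regular expressions. The function name `quickParseKillmail` and the shape of the returned array are assumptions made for this example, not something from the question; the patterns are based only on the sample mail shown above.

```php
<?php
// Hypothetical helper: quick-and-dirty extraction with regular expressions.
// Assumes the layout of the sample mail above; other mails may differ.
function quickParseKillmail(string $mail): array
{
    $result = ['victim' => null, 'attackers' => [], 'items' => []];

    // The victim is the value of the "Victim:" line.
    if (preg_match('/^Victim: (.+?)\h*\r?$/m', $mail, $m)) {
        $result['victim'] = $m[1];
    }

    // Every involved party starts with a "Name:" line; the killer's name
    // carries a "(laid the final blow)" suffix.
    preg_match_all('/^Name: (.+?)( \(laid the final blow\))?\h*\r?$/m',
                   $mail, $names, PREG_SET_ORDER);
    foreach ($names as $n) {
        $result['attackers'][] = ['name' => $n[1], 'final_blow' => !empty($n[2])];
    }

    // Split on the "Destroyed items:" / "Dropped items:" headers, keeping
    // the header word so each chunk of item lines can be labelled.
    $parts = preg_split('/^(Destroyed|Dropped) items:\h*\r?$/m', $mail, -1,
                        PREG_SPLIT_DELIM_CAPTURE);
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        // Item line: name, optional ", Qty: N", optional " (Cargo)".
        preg_match_all('/^(\S.*?)(?:, Qty: (\d+))?( \(Cargo\))?\h*\r?$/m',
                       $parts[$i + 1], $items, PREG_SET_ORDER);
        foreach ($items as $it) {
            $result['items'][$parts[$i]][] = [
                'name'  => $it[1],
                'qty'   => !empty($it[2]) ? (int) $it[2] : 1, // no Qty means 1
                'cargo' => !empty($it[3]),
            ];
        }
    }
    return $result;
}
```

Called as `quickParseKillmail(file_get_contents('test.txt'))`, this works on the mail as one string. It ignores the per-party fields (ship, weapon, damage), which is where the state machine approach starts to pay off.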

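For the first solution, you could use a dedicated parsing library, since writing a parser is not a trivial task. But as the format is fairly simple, you can hack together a naive parser, for example: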

<?php 

class Parser 
{ 
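    /* Enclosing the parser in a class is not mandatory but it's clean */ 

    /* declared up front: PHP 8.2+ deprecates dynamic properties */ 
    public $date, $parties, $victim, $items, $file; 
    public $states, $state, $item_parsing_state, $partie_parsing_state, $parse_tools; 

    /* PHP 4-style constructors (function Parser()) are no longer called in PHP 8 */ 
    function __construct() 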
    { 

     /* data holder */ 
     $this->date = ''; 
     $this->parties = array(); 
     $this->victim = array(); 
     $this->items = array("Destroyed" => array(), 
              "Dropped" => array()); 

     /* Map your states to actions. Sub-states can be necessary (and sub-parsers too :-) */     
     $this->states = array('Victim' => 'victim_parsing', 
              'Involved' => 'parties_parsing' , 
              'items:' => "item_parsing"); 


     $this->state = 'start';      
     $this->item_parsing_state = 'Destroyed';  
     $this->partie_parsing_state = '';   
     $this->parse_tools = array('start' => 'start_parsing', 
              'parties_parsing' =>'parties_parsing', 
              'item_parsing' => 'item_parsing', 
              'victim_parsing' => 'victim_parsing'); 


    } 

    /* the magic job is done here */ 

    function checkLine($line) 
    { 
     foreach ($this->states as $keyword => $state) 
      if (strpos($line, $keyword) !== False) 
        $this->state = $this->states[$keyword]; 

     return trim($line); 
    } 

    function parse($file) 
    { 
     $this->file = new SplFileObject($file); 
     foreach ($this->file as $line) 
      if ($line = $this->checkLine($line)) 
       $this->{$this->parse_tools[$this->state]}($line); 
    } 


    /* then here you can define as many parsing rules as you want */ 

    function victim_parsing($line) 
    { 
     $victim_caract = explode(': ', $line); 
     $this->victim[$victim_caract[0]] = $victim_caract[1]; 
    } 

    function start_parsing($line) 
    { 
     $this->date = $line; 
    } 

    function item_parsing($line) 
    { 
     if (strpos($line, 'items:') !== false) 
     { 
      /* header line: "Destroyed items:" or "Dropped items:" */ 
      $item_state = explode(' ', $line); 
      $this->item_parsing_state = $item_state[0]; 
     } 
     elseif (preg_match('/^(.+?)(?:, Qty: (\d+))?( \(Cargo\))?$/', $line, $m)) 
     { 
      /* item line: name, optional ", Qty: N", optional " (Cargo)". 
         Append instead of keying by name: the same item can be listed 
         twice in one section (fitted and in cargo). */ 
      $this->items[$this->item_parsing_state][] = array( 
       'name' => $m[1], 
       'qty' => !empty($m[2]) ? (int) $m[2] : 1, 
       'cargo' => !empty($m[3]) ? "True" : "False"); 
     } 
    } 

    function parties_parsing($line) 
    {   

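     /* limit 2 keeps any ': ' inside the value intact */ 
     $partie_caract = explode(': ', $line, 2); 

     /* the "Involved parties:" header carries no value: skip it */ 
     if (count($partie_caract) < 2) 
      return; 
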
     if ($partie_caract[0] == "Name") 
     { 
      $this->partie_parsing_state = $partie_caract[1]; 
      $this->parties[ $this->partie_parsing_state ] = array(); 
     } 
     else 
      $this->parties[ $this->partie_parsing_state ][$partie_caract[0]] = $partie_caract[1]; 

    } 

} 

/* a little test */ 

$parser = new Parser(); 
$parser->parse('test.txt'); 

echo "======== Fight report - ".$parser->date." ==========\n\n"; 
echo "Victim :\n\n"; 
print_r($parser->victim); 
echo "Parties :\n\n"; 
print_r($parser->parties); 
echo "Items: \n\n"; 
print_r($parser->items); 

?> 

We can get away with this because reliability and performance are not a concern here :-)

Happy gaming!

12

I would probably go with a state machine approach, reading each line in sequence and handling it according to the current state.

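Some lines, like "Dropped items:", change the state, causing you to interpret the following lines as items. While in the "reading involved parties" state, you add each line to an array of data about that person, and when you read a blank line, you know you have a complete record.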
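The idea above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `parseKillmailLines`, the state names and the result layout are made up for the example, and item lines are kept raw rather than split into name, quantity and cargo flag.

```php
<?php
// Hypothetical sketch of a line-by-line state machine for one killmail.
function parseKillmailLines(array $lines): array
{
    $result = ['date' => null, 'victim' => [], 'parties' => [],
               'items' => ['Destroyed' => [], 'Dropped' => []]];
    $state  = 'date';   // current state of the machine
    $record = [];       // involved party currently being collected

    // A sentinel blank line guarantees the last record gets flushed.
    foreach (array_merge($lines, ['']) as $line) {
        $line = trim($line);

        if ($line === '') {
            // A blank line completes the party record, if one is open.
            if ($record) {
                $result['parties'][] = $record;
                $record = [];
            }
            continue;
        }

        // Header lines switch the state.
        if (strpos($line, 'Victim: ') === 0) {
            $state = 'victim';            // this line is also data
        } elseif ($line === 'Involved parties:') {
            $state = 'parties';
            continue;
        } elseif (preg_match('/^(Destroyed|Dropped) items:$/', $line, $m)) {
            $state = $m[1];
            continue;
        }

        if ($state === 'date') {
            $result['date'] = $line;
        } elseif ($state === 'victim' || $state === 'parties') {
            $pair = explode(': ', $line, 2);
            if (count($pair) === 2) {
                if ($state === 'victim') {
                    $result['victim'][$pair[0]] = $pair[1];
                } else {
                    $record[$pair[0]] = $pair[1];
                }
            }
        } else {
            // 'Destroyed' or 'Dropped': raw item lines in this sketch.
            $result['items'][$state][] = $line;
        }
    }
    return $result;
}
```

It could be fed with `parseKillmailLines(file('test.txt', FILE_IGNORE_NEW_LINES))`. Because a party record is closed by the blank line rather than by a fixed field count, the shortened NPC entries (only "Name" and "Damage Done") need no special case.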

Here is a rough FSM I knocked up with GraphViz:

State machine http://i34.tinypic.com/4zvtc5.png

You will need some edge-triggered actions in your code, for example on reading a blank line.

+0

I can't compete with pictures; +1 for also breaking the div :) – Owen 2008-11-28 09:51:17

+0

Definitely the most professional answer. But I'm not sure how much it will help him. It's for EVE Online, not for an ADA output parser... – 2008-11-28 12:59:15
