Can't use string as a HASH ref..?

I'm trying to parse an HTML document for a web indexing program. To do this, I'm using HTML::TokeParser.

I'm getting an error on the last line of my first if statement:

if ($token->[1] eq 'a') { 
    #href attribute of tag A 
    my $suffix = $token->[2]{href};

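The error says: Can't use string ("</a>") as a HASH ref while "strict refs" in use at ./indexer.pl line 270, <PAGE_DIR> line 1.

Is my problem that something ($suffix, or "</a>"?) is a string and needs to be turned into a hash reference? I've looked at other posts with similar errors, but I'm still clueless about this one. Thanks for any help.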

sub parse_document { 

    #passed from input 
    my $html_filename = $_[0]; 

    #base url for links 
    my $base_url = $_[1]; 

    #created to hold tokens 
    my @tokens =(); 

    #created for doc links 
    my @links =(); 

    #creates parser 
    my $p = HTML::TokeParser->new($html_filename); 

    #loops through doc tags 
    while (my $token = $p->get_token()) { 
     #code for retrieving links 
     if ($token->[1] eq 'a') { 
      # href attribute of tag A 
      my $suffix = $token->[2]{href}; 

      #if href exists & isn't an email link 
      if (defined($suffix) && !($suffix =~ "^mailto:")) { 
       #make the url absolute 
       my $new_url = make_absolute_url $base_url, $suffix; 

       #make sure it's of the http:// scheme 
       if ($new_url =~ "^http://"){ 
        #normalize the url 
        my $new_normalized_url = normalize_url $new_url; 

        #add it to links array 
        push(@links, $new_normalized_url); 
       } 
      } 
     } 

     #code for text words 
     if ($token->[0] eq 'T') { 
      my $text = $token->[1]; 

      #add words to end of array 
      #(split by non-letter chars) 
      my @words = split(/\P{L}+/, $text); 
     } 
    } 

    return (\@tokens, \@links); 
}


2011-10-31 mdegges

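I would print out some debug statements to see exactly what it thinks the token is, using Data::Dumper (for example, print Dumper($token)), and also check what $token->[1] is. It might be a ' or something similar messing with the values. – scrappedcola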

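The get_token() method returns an array reference in which $token->[2] is a hash reference containing your href only if $token->[0] is S (i.e., a start tag). In this case, you are matching the end tag (where $token->[0] is E), and for an end tag $token->[2] is the tag's source text. See the perldoc for HTML::TokeParser for details.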

To fix it, add this at the top of your loop:

next if $token->[0] ne 'S';
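As a minimal sketch of the same idea: note that skipping every non-start token at the top of the loop would also skip the 'T' text tokens that the question's loop handles further down, so the version below folds the start-tag check into the condition on the tag name instead. The helper name extract_hrefs and the sample HTML are assumptions made up for illustration; the parser is fed an in-memory string by passing a scalar reference to HTML::TokeParser->new.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the original post): collect the href
# values of <a> start tags found in an HTML string.
sub extract_hrefs {
    my ($html) = @_;

    # Passing a scalar reference makes the parser read from the string.
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my @hrefs;
    while (my $token = $p->get_token) {
        my $type = $token->[0];

        if ($type eq 'S' && $token->[1] eq 'a') {
            # Start tag: $token->[2] is the attribute hash reference.
            my $href = $token->[2]{href};
            push @hrefs, $href if defined $href;
        }
        elsif ($type eq 'T') {
            # Text token: $token->[1] holds the text; word splitting
            # would go here, as in the question's second if block.
        }
        # End tags ('E') carry no attribute hash and are ignored here.
    }
    return @hrefs;
}

# Usage: only the first anchor has an href attribute.
my @found = extract_hrefs(
    '<p>See <a href="page.html">this page</a> and <a name="x">here</a>.</p>'
);
print "$_\n" for @found;
```

Checking the token type before touching $token->[2] keeps the hash dereference away from end tags, which is where the "Can't use string as a HASH ref" error came from.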


2011-10-31 19:37:06

Thanks! I thought I could leave out the check for the start tag because I didn't really understand what it was for... but I guess it's needed here. – mdegges

Apparently $token->[2] is resolving to a hash reference whose value is "</a>". Certainly not what you want!


2011-10-31 19:34:13 ennuikiller

Actually, $token->[2] is a string ("</a>"), and he is trying to *use* it as a hash reference. –

@Brian Yes, thanks for the correction! – ennuikiller

$token->[2] is a string, not a hash reference.

Do a print $token->[2] and you'll see that it contains the string </a>.
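One way to see this without guessing is to check what kind of value each token carries in its third slot. The sketch below is only an illustration: the helper name describe_tokens and the sample HTML are assumptions, and it uses Perl's built-in ref() to tell a hash reference from a plain string.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the original post): for each token,
# report its type and what kind of value sits in $token->[2].
sub describe_tokens {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my @out;
    while (my $token = $p->get_token) {
        my $third = $token->[2];
        my $kind  = ref($third)     ? ref($third)   # e.g. HASH
                  : defined($third) ? 'string'
                  :                   'undef';
        push @out, "$token->[0]:$kind";
    }
    return @out;
}

# Usage: the start tag reports a HASH, the end tag a plain string.
print "$_\n" for describe_tokens('<a href="x.html">link</a>');
```

For full detail on a single token, print Dumper($token) from Data::Dumper shows the whole array reference.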


2011-10-31 19:39:06
