Counting the letters of each word in a text with Perl

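I want to write a Perl program that returns the frequency of every word in a file and the length of each word (not the sum of all characters!), in order to produce a Zipf curve from a Spanish text (it's no big deal if you don't know what a Zipf curve is). Now my problem: I can do the first part and get the frequency of every word, but I don't know how to get the length of each word! :( I know the command $word_length = length($word), but after trying to change the code I really don't know where I should include it or how to count the length of each word.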

This is what my code looks like so far:

#!/usr/bin/perl 
use strict; 
use warnings; 

my %count_of; 
while (my $line = <>) { #read from file or STDIN 
    foreach my $word (split /\s+/gi, $line){ 
    $count_of{$word}++; 
    } 
} 
print "All words and their counts: \n"; 
for my $word (sort keys %count_of) { 
    print "$word: $count_of{$word}\n"; 
} 
__END__

I hope someone has a suggestion.

2011-05-31 El_Patrón

There is no need for the 'gi' flags: 'split /\s+/, $line' – toolic 2011-05-31 14:42:22
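
A related detail that may matter for the counts: splitting on a regex keeps a leading empty field when a line starts with whitespace, so the empty string can end up counted as a "word". The sketch below compares that with the special ' ' pattern, which skips leading whitespace. The sample line is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line with leading whitespace.
my $line = "  el gato  negro\n";

# A regex pattern keeps a leading empty field when the line starts with whitespace.
my @with_regex = split /\s+/, $line;

# The special pattern ' ' skips leading whitespace first (awk-style splitting).
my @with_space = split ' ', $line;

print scalar(@with_regex), " fields with /\\s+/\n";
print scalar(@with_space), " fields with ' '\n";
```

If the input may contain indented lines, 'split ' ', $line' is probably the safer choice for the loop in the question.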

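You might want to check this question: http://stackoverflow.com/questions/6170985/counting-individual-单词文本文件 When you split the way you do, you will end up with 'word', 'word.' and 'word,' all being treated as different words, which may not be what you want. – TLP 2011-05-31 17:22:03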
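
One possible way to address this, as a minimal sketch: instead of splitting on whitespace, extract runs of Unicode letters and lowercase them, so punctuation and capitalisation no longer create separate entries. The helper name words_in and the sample sentence are assumptions made for illustration, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # this source file contains UTF-8 string literals

# Hypothetical helper: return the lowercased words of a line,
# treating any run of Unicode letters as one word.
sub words_in {
    my ($line) = @_;
    return map { lc($_) } ($line =~ /\p{L}+/g);
}

my %count_of;
$count_of{$_}++ for words_in("Hola, hola. ¿Qué tal? ¡HOLA!");

binmode STDOUT, ':encoding(UTF-8)';
print "$_: $count_of{$_}\n" for sort keys %count_of;
```

When reading a Spanish text from a file or STDIN, the input would also need to be decoded (for example with 'use open qw(:std :encoding(UTF-8));', assuming the file is UTF-8), otherwise accented letters are seen as separate bytes and both the matching and length() give misleading results.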

If you want to store the length of each word, you can use a hash of hashes.

while (my $line = <>) { 
    foreach my $word (split /\s+/, $line) { 
     $count_of{$word}{word_count}++; 
     $count_of{$word}{word_length} = length($word); 
    } 
} 

print "All words and their counts and length: \n"; 
for my $word (sort keys %count_of) { 
    print "$word: $count_of{$word}{word_count} "; 
    print "Length of the word:$count_of{$word}{word_length}\n"; 
}

2011-05-31 17:34:08 Shalini

That's a good idea, thanks. – 2011-06-02 12:40:06

This will print the length next to the count:

print "$word: $count_of{$word} ", length($word), "\n";
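
Since the stated goal is a Zipf curve, the same print could be driven by a loop sorted by frequency rather than alphabetically. The following is only a sketch: the sample %count_of stands in for the hash built by the loop in the question, and the tab-separated rank/word/count/length layout is an assumption about what a plotting tool would want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the %count_of built from the input.
my %count_of = (de => 5, la => 3, casa => 1, que => 3);

# Sort by descending count, breaking ties alphabetically so the output is stable.
my @by_freq = sort { $count_of{$b} <=> $count_of{$a} or $a cmp $b } keys %count_of;

my $rank = 0;
print join("\t", qw(rank word count length)), "\n";
for my $word (@by_freq) {
    $rank++;
    print join("\t", $rank, $word, $count_of{$word}, length($word)), "\n";
}
```

Because length($word) can always be recomputed from the key, nothing extra has to be stored in the hash for this output.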

2011-05-31 14:37:44 toolic

Oh, thanks for the quick answer! It works fine. I wrote it like this: print $word, "\t", $count_of{$word}, "\t", length($word), "\n"; – 2011-05-31 17:36:29

Just for your information, another possibility besides

length($word)

could be:

$word =~ s/(\w)/$1/g

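This is not as clear a solution as toolic's, but it can give you another view of the problem (TIMTOWTDI :))

A short explanation:

\w and the g modifier match every letter in your $word

$1 prevents s/// from overwriting the original $word

s/// returns the number of letters (matched by \w) in $word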
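
To make the behaviour of the substitution trick concrete, here is a small sketch that captures its return value and compares it with a non-modifying match count and with length(). The sample word and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word = "palabra";

# s///g returns the number of substitutions made; replacing each
# character with itself ($1) leaves $word unchanged.
my $via_subst = ($word =~ s/(\w)/$1/g);

# A global match in list context, counted with the "= () =" idiom,
# gives the same number without modifying $word at all.
my $via_match = () = $word =~ /\w/g;

print "$word: $via_subst $via_match ", length($word), "\n";
```

Note that \w matches letters, digits and the underscore, but not punctuation, so for a token such as "it's" the \w count is 3 while length() returns 4. Which of the two is wanted depends on whether punctuation should count toward the word length.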

2011-05-31 16:30:04 czubatka

You mean '$count = $word =~ s/(\w)//g;' would get the number of letters. ;) – TLP 2011-05-31 17:18:54

OK, I'll try that too, thanks. – 2011-05-31 17:37:51

@TLP: Check this: 'my $word = "word"; print $word =~ s/(\w)/$1/g;' The output is: '4' Without $1, you will overwrite $word while counting the letters. – czubatka 2011-05-31 19:33:59
