2017-04-22



#!/usr/bin/env perl 
use 5.014; 
use warnings; 
use Digest::MD5; 
use Path::Tiny; 

# create some test-files in the tempdir 
my @filenames = qw(a b); 
my $testdir = Path::Tiny->tempdir; 
$testdir->child($_)->spew($_) for @filenames; #create 2 files 

dirmd5($testdir, @filenames); 

sub dirmd5 { 
    my($dir, @files) = @_; 

    my $dirctx = Digest::MD5->new; #the md5 for the whole directory 

    for my $fname (@files) {

        # calculate the MD5 for one file
        my $filectx = Digest::MD5->new;
        my $fd = $dir->child($fname)->openr_raw;
        $filectx->addfile($fd);
        close $fd;
        say "md5 for $fname : ", $filectx->clone->hexdigest;

        # I want to somehow "add" the file MD5 above to the directory MD5.
        # This doesn't work, even though $filectx isn't reset (note the "clone" above):
        # $dirctx->add($filectx);

        # Adding the file as below works, but it calculates the MD5 again,
        # i.e. each file is hashed twice: once for the file alone (above)
        # and a second time for the directory.
        # That is bad with many large files ;( especially if I want to
        # calculate the md5sum for whole directory trees.
        $fd = $dir->child($fname)->openr_raw;
        $dirctx->addfile($fd);
        close $fd;
    }
    say "md5 for dir: ", $dirctx->hexdigest;
}

This prints:


md5 for a : 0cc175b9c0f1b6a831c399e269772661 
md5 for b : 92eb5ffee6ae2fec3ad71c777531578f 
md5 for dir: 187ef4436122d1cc2f40dc2b92f0eba0 

This is correct, but unfortunately it is an inefficient way to do it (see the comments in the code).

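Reading the docs, I didn't find any way to reuse the MD5 that has already been calculated, e.g. something like the $dirctx->add($filectx); above. Probably it isn't possible.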


Edit: I'm trying to (somewhat) solve this question.



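No. There is nothing that relates MD5(initial data) and MD5(new data) to MD5(initial data + new data), because the position of the data in the stream matters as well as its value. Otherwise it wouldn't be a very useful error check, as aba, aab and baa would all have the same checksum.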
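The point about position can be checked directly. The following is a minimal sketch, not part of the original answer, that uses only the core Digest::MD5 module. It compares the digest of "ab" with the digest of "ba", and also shows that feeding the per-file hex digests into a new context is legal but hashes different input than the file data itself.

```perl
#!/usr/bin/env perl
use 5.014;
use warnings;
use Digest::MD5 qw(md5_hex);

# Same bytes, different order: the digests differ.
my $ab = md5_hex('a' . 'b');
my $ba = md5_hex('b' . 'a');
say "md5(ab) = $ab";
say "md5(ba) = $ba";
say 'order matters' if $ab ne $ba;

# Adding the per-file digests to a context hashes the digest strings,
# not the file contents, so the result is not md5(ab).
my $of_digests = Digest::MD5->new
    ->add(md5_hex('a'))
    ->add(md5_hex('b'))
    ->hexdigest;
say "md5(md5(a) . md5(b)) = $of_digests";
say 'not the same as md5(ab)' if $of_digests ne $ab;
```

A digest of the per-file digests can still serve as a directory fingerprint if that is all that is needed, but it is a different value from the MD5 of the concatenated file contents that the question computes.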


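If the files are small enough to fit in memory, you can at least avoid reading each file from disk twice by slurping it once and adding the same data to both digests:

#!/usr/bin/env perl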

use 5.014; 
use warnings 'all'; 

use Digest::MD5; 
use Path::Tiny; 

# create some test-files in the tempdir 
my @filenames = qw(a b); 
my $testdir = Path::Tiny->tempdir; 
$testdir->child($_)->spew($_) for @filenames; # create 2 files 

dirmd5($testdir, @filenames); 

sub dirmd5 { 
    my ($dir, @files) = @_; 

    my $dir_ctx = Digest::MD5->new; #the md5 for the whole directory 

    for my $fname (@files) {

        # read the file once
        my $data = $dir->child($fname)->slurp_raw;

        # calculate the md5 for one file
        my $file_md5 = Digest::MD5->new->add($data)->hexdigest;
        say "md5 for $fname : $file_md5";

        # add the same data to the directory digest
        $dir_ctx->add($data);
    }

    my $dir_md5 = $dir_ctx->hexdigest;
    say "md5 for dir: $dir_md5";
}


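If the files are too large to slurp, you can keep the file handle open and rewind it with seek before adding it to the directory digest:

#!/usr/bin/env perl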

use 5.014; 
use warnings 'all'; 

use Digest::MD5; 
use Path::Tiny; 
use Fcntl ':seek'; 

# create some test-files in the tempdir 
my @filenames = qw(a b); 
my $testdir = Path::Tiny->tempdir; 
$testdir->child($_)->spew($_) for @filenames; # create 2 files 

dirmd5($testdir, @filenames); 

sub dirmd5 { 
    my ($dir, @files) = @_; 

    my $dir_ctx = Digest::MD5->new; # The digest for the whole directory 

    for my $fname (@files) {

        my $fh = $dir->child($fname)->openr_raw;

        # The digest for just the current file
        my $file_md5 = Digest::MD5->new->addfile($fh)->hexdigest;
        say "md5 for $fname : $file_md5";

        # Rewind the handle and add the same file to the directory digest
        seek $fh, 0, SEEK_SET;
        $dir_ctx->addfile($fh);
        close $fh;
    }

    my $dir_md5 = $dir_ctx->hexdigest;
    say "md5 for dir: $dir_md5";
}

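Ah, I see. Then wanting to calculate the digest for a whole directory tree with many nested directories is pointless, because I would need to recalculate each file's digest again for every directory above it... uh... :( I need to figure out some other "logic" for the [duplicate directory trees](http://stackoverflow.com/q/43560796/869025) problem. Thanks. – cajwine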


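@cajwine: There's no need for that. Just keep a digest for the file and for every ancestor directory in the tree. The data from any file must be added to the digest of each of its ancestor directories. It is almost the same as working out the size of every directory in the tree, except that you can't just add up the values for the children at the end. – Borodin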
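The approach described in this comment can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: tree_md5 is a hypothetical helper, a directory digest is taken to be the MD5 of its files' contents concatenated in sorted traversal order (file names are not hashed), symlinks are skipped, and only core modules are used instead of Path::Tiny so that the sketch runs without extra installs.

```perl
#!/usr/bin/env perl
use 5.014;
use warnings;
use Digest::MD5;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: walk $dir once and record an MD5 hex digest for every
# file and directory in %$result. @ancestors holds the open digest contexts
# of all directories above $dir.
sub tree_md5 {
    my ($dir, $result, @ancestors) = @_;
    my $dir_ctx = Digest::MD5->new;

    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # Sort the entries so that the directory digest is reproducible.
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        next if -l $path;    # skip symlinks to avoid cycles

        if (-d $path) {
            tree_md5($path, $result, @ancestors, $dir_ctx);
        }
        elsif (-f $path) {
            my $file_ctx = Digest::MD5->new;
            open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
            # Each chunk is read once and fed to the file's own digest
            # and to the digest of every enclosing directory.
            while (read($fh, my $buf, 65536)) {
                $_->add($buf) for $file_ctx, $dir_ctx, @ancestors;
            }
            close $fh;
            $result->{$path} = $file_ctx->hexdigest;
        }
    }

    $result->{$dir} = $dir_ctx->hexdigest;
    return $result;
}

# Demo tree: <root>/a contains "a", <root>/sub/b contains "b".
my $root = tempdir(CLEANUP => 1);
my $sub  = File::Spec->catfile($root, 'sub');
make_path($sub);
for my $spec ([$root, 'a'], [$sub, 'b']) {
    my ($where, $name) = @$spec;
    open(my $out, '>:raw', File::Spec->catfile($where, $name))
        or die "Cannot write $name: $!";
    print {$out} $name;
    close $out;
}

my $digests = tree_md5($root, {});
say "$_ : $digests->{$_}" for sort keys %$digests;
```

Every file is read from disk once, but each chunk is still hashed once per ancestor level, so the hashing cost grows with the depth of the tree. In the demo, the root directory digest equals the MD5 of "ab" and the digest of sub equals the MD5 of "b", matching the values printed in the question.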