Can reference counting be relied upon to close a file in Python?

In my answer to the question "Generating an MD5 checksum of a file", I had this code:
import hashlib

def hashfile(afile, hasher, blocksize=65536):
    buf = afile.read(blocksize)
    while len(buf) > 0:
        hasher.update(buf)
        buf = afile.read(blocksize)
    return hasher.digest()

[(fname, hashfile(open(fname, 'rb'), hashlib.sha256())) for fname in fnamelst]
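I was criticized for opening a file inside a list comprehension, and one person opined that with a long enough list I would run out of open file handles. The suggested alternatives significantly reduced the flexibility of hashfile: they had hashfile take a filename argument and use with.

Were these changes necessary? Was I really doing something that wrong?

Testing out this code: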
#!/usr/bin/python3

import sys
from pprint import pprint  # Pretty printing

class HereAndGone(object):
    def __init__(self, i):
        print("%d %x -> coming into existence." % (i, id(self)),
              file=sys.stderr)
        self.i_ = i
    def __del__(self):
        print("%d %x <- going away now." % (self.i_, id(self)),
              file=sys.stderr)

def do_nothing(hag):
    return id(hag)

l = [(i, do_nothing(HereAndGone(i))) for i in range(0, 10)]

pprint(l)
results in this output:
0 7f0346decef0 -> coming into existence.
0 7f0346decef0 <- going away now.
1 7f0346decef0 -> coming into existence.
1 7f0346decef0 <- going away now.
2 7f0346decef0 -> coming into existence.
2 7f0346decef0 <- going away now.
3 7f0346decef0 -> coming into existence.
3 7f0346decef0 <- going away now.
4 7f0346decef0 -> coming into existence.
4 7f0346decef0 <- going away now.
5 7f0346decef0 -> coming into existence.
5 7f0346decef0 <- going away now.
6 7f0346decef0 -> coming into existence.
6 7f0346decef0 <- going away now.
7 7f0346decef0 -> coming into existence.
7 7f0346decef0 <- going away now.
8 7f0346decef0 -> coming into existence.
8 7f0346decef0 <- going away now.
9 7f0346decef0 -> coming into existence.
9 7f0346decef0 <- going away now.
[(0, 139652050636528),
(1, 139652050636528),
(2, 139652050636528),
(3, 139652050636528),
(4, 139652050636528),
(5, 139652050636528),
(6, 139652050636528),
(7, 139652050636528),
(8, 139652050636528),
(9, 139652050636528)]
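It's obvious that each HereAndGone object is created and destroyed as each element of the list comprehension is constructed. Python's reference counting frees the object as soon as there are no references to it, which happens immediately after the value for that list element is computed.

Of course, maybe some other Python implementations don't do this. Is a Python implementation required to do some form of reference counting? From the documentation of the gc module, it certainly seems as if reference counting is a core feature of the language.

And, if I did do something wrong, how would you suggest I rewrite it to retain the succinct clarity of a list comprehension and the flexibility of an interface that works with anything that can be read like a file?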
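"Is a Python implementation required to do some form of reference counting?" - No. – user2357112
"It certainly seems from the documentation of the gc module as if reference counting is a core feature of the language." - Most of the gc module, particularly the parts about turning it off, should be considered optional features. – user2357112
Modify 'hashfile' so that it takes a filename and handles opening and closing the file itself. In general, using the system's memory management to manage other resources is a terrible idea. – tfb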
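One possible middle ground, offered only as a sketch: leave hashfile exactly as it is, so it still accepts anything with a read() method, and put the open/close logic in a small wrapper that uses with. The names hash_path and hash_all are hypothetical helpers made up for this illustration; they do not come from the original answer or from any library.

```python
import hashlib


def hashfile(afile, hasher, blocksize=65536):
    # Unchanged interface: works with any object that has a read() method.
    buf = afile.read(blocksize)
    while len(buf) > 0:
        hasher.update(buf)
        buf = afile.read(blocksize)
    return hasher.digest()


def hash_path(fname, hasher_factory=hashlib.sha256, blocksize=65536):
    # Hypothetical helper: the with block closes the file when it exits,
    # whether or not the implementation uses reference counting.
    with open(fname, 'rb') as afile:
        return hashfile(afile, hasher_factory(), blocksize)


def hash_all(fnamelst):
    # Hypothetical helper: the list comprehension stays a one-liner.
    return [(fname, hash_path(fname)) for fname in fnamelst]
```

With this split, at most one file is open at a time no matter how long fnamelst is, and callers who already hold a file-like object (an io.BytesIO, a socket file, and so on) can still call hashfile directly.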