Wrap an open stream with io.TextIOWrapper

How can I wrap an open binary stream (a Python 2 file, a Python 3 io.BufferedReader, an io.BytesIO) in an io.TextIOWrapper?

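I'm trying to write code that will work unchanged:

- Running on Python 2.
- Running on Python 3.
- With binary streams generated by the standard library (i.e. I can't control what type they are).
- With binary streams made to be test doubles (i.e. no file handle, can't be re-opened).
- Producing an io.TextIOWrapper that wraps the specified stream.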

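The io.TextIOWrapper is needed because its API is expected by other parts of the standard library. Other file-like types exist, but they don't provide the right API.

Example

Wrapping the binary stream presented as the subprocess.Popen.stdout attribute: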

import subprocess 
import io 

gnupg_subprocess = subprocess.Popen(
     ["gpg", "--version"], stdout=subprocess.PIPE) 
gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8")

In unit tests, the stream is replaced with an io.BytesIO instance to control its content without touching any subprocesses or the filesystem.

gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))

That works fine on the streams created by Python 3's standard library. The same code, though, fails on streams generated by Python 2:

[Python 2] 
>>> type(gnupg_subprocess.stdout) 
<type 'file'> 
>>> gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8") 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
AttributeError: 'file' object has no attribute 'readable'

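Not a solution: special treatment for `file`

An obvious response is to have a branch in the code that tests whether the stream is actually a Python 2 file object, and handles it differently from io.* objects.

That's not an option for well-tested code, because it creates a branch that the unit tests can't exercise. To run as fast as possible, those tests must not create any real filesystem objects.

The unit tests will provide test doubles, not real file objects. So creating a branch that those test doubles never exercise defeats the test suite.

Not a solution: `io.open`

Some respondents suggest re-opening the underlying file handle (e.g. with io.open):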

gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")

That works on both Python 3 and Python 2:

[Python 3] 
>>> type(gnupg_subprocess.stdout) 
<class '_io.BufferedReader'> 
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8") 
>>> type(gnupg_stdout) 
<class '_io.TextIOWrapper'>

[Python 2] 
>>> type(gnupg_subprocess.stdout) 
<type 'file'> 
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8") 
>>> type(gnupg_stdout) 
<type '_io.TextIOWrapper'>

But of course it relies on re-opening a real file from its file handle. So it fails in unit tests when the test double is an io.BytesIO instance:

>>> gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8")) 
>>> type(gnupg_subprocess.stdout) 
<type '_io.BytesIO'> 
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8") 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
io.UnsupportedOperation: fileno
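
The failure above comes down to whether a stream is backed by an operating-system file descriptor at all. As a minimal sketch (the helper name `has_file_descriptor` is hypothetical, not part of any library), the difference between an in-memory test double and a real file can be probed like this:

```python
import io
import tempfile


def has_file_descriptor(stream):
    """Hypothetical helper: report whether a stream exposes an OS file descriptor."""
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True


# An in-memory test double has nothing that io.open could re-open.
fake_stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))
print(has_file_descriptor(fake_stdout))

# A real temporary file does have a descriptor.
with tempfile.TemporaryFile() as real_file:
    print(has_file_descriptor(real_file))
```

This is only a diagnostic; branching on its result in production code would run into the same untestable-branch objection raised earlier.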

Not a solution: `codecs.getreader`

The standard library also has the codecs module, which provides wrapper features:

import codecs 

gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout)

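That's good because it doesn't attempt to re-open the stream. But it fails to provide the io.TextIOWrapper API. Specifically, it doesn't inherit from io.IOBase and doesn't have the encoding attribute: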

>>> type(gnupg_subprocess.stdout) 
<type 'file'> 
>>> gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout) 
>>> type(gnupg_stdout) 
<type 'instance'> 
>>> isinstance(gnupg_stdout, io.IOBase) 
False 
>>> gnupg_stdout.encoding 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/lib/python2.7/codecs.py", line 643, in __getattr__ 
    return getattr(self.stream, name) 
AttributeError: '_io.BytesIO' object has no attribute 'encoding'

So codecs doesn't provide objects that can substitute for io.TextIOWrapper.
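
To make the API gap concrete, here is a small side-by-side sketch. It assumes Python 3 and uses io.BytesIO as the underlying stream for both wrappers; the variable names are illustrative only.

```python
import codecs
import io

payload = "Lorem ipsum".encode("utf-8")

# Same bytes, wrapped two different ways.
codecs_reader = codecs.getreader("utf-8")(io.BytesIO(payload))
io_wrapper = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")

for name, stream in [("codecs", codecs_reader), ("io", io_wrapper)]:
    # Report whether each object is an io.IOBase and whether it has .encoding.
    print(name, isinstance(stream, io.IOBase), hasattr(stream, "encoding"))
```

Both objects decode the bytes correctly when read; only the io.TextIOWrapper passes the two API checks.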

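What to do?

So how can I write code that works on both Python 2 and Python 3, with both the test doubles and the real objects, and that wraps an io.TextIOWrapper around an already-open byte stream?

Source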

2015-12-24 bignose

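Re: `io.open`: you could change the unit tests to use, you know, a `tempfile.TemporaryFile()`, for example; this is of course a hammer of a solution... –

That's a rather limiting set of constraints. Unit tests *can* open files, for example, if that is absolutely the only way to test something properly. So a wrapper function that special-cases `file` objects to get at the file descriptor *can* be tested with unittest. –

Based on multiple suggestions in various forums, and on experimenting with the standard library to meet the criteria, my current conclusion is that this can't be done with the library and types as we currently have them.

Source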

2015-12-30 07:47:35 bignose

Use codecs.getreader to produce a wrapper object:

text_stream = codecs.getreader("utf-8")(bytes_stream)

Works on Python 2 and Python 3.

Source

2015-12-29 13:11:55 jbg

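Thanks for the suggestion. That object doesn't provide enough of the `io.TextIOWrapper` API, though, so it's not a solution. – bignose

Ah, too bad. I guess you could put your test data in a file... :/ – jbg

Already addressed in the question: this also needs to work with test doubles that are not real files. – bignose

It turns out you just need to wrap your io.BytesIO in an io.BufferedReader, which exists on both Python 2 and Python 3: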

import io 

reader = io.BufferedReader(io.BytesIO("Lorem ipsum".encode("utf-8"))) 
wrapper = io.TextIOWrapper(reader) 
wrapper.read() # returns Lorem ipsum

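This answer originally suggested using os.pipe, but the read side of the pipe would have to be wrapped in an io.BufferedReader on Python 2 anyway, so this solution is simpler and avoids allocating a pipe.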
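
For use in a test suite, the same wrapping can be packaged as a small factory. This is a sketch only: the name `make_text_double` is hypothetical, and passing the encoding explicitly is an addition here so that the result does not depend on the locale default.

```python
import io


def make_text_double(text, encoding="utf-8"):
    """Hypothetical test helper: build a TextIOWrapper over in-memory bytes."""
    reader = io.BufferedReader(io.BytesIO(text.encode(encoding)))
    return io.TextIOWrapper(reader, encoding=encoding)


stream = make_text_double(u"Lorem ipsum")
print(stream.encoding)
print(stream.read())
```

Note that this only covers the test-double side; it does not change how a Python 2 `file` object behaves when handed to io.TextIOWrapper.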

Source

2015-12-30 08:38:12 jbg

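A Python 2 `file` object (as created by many standard library functions) doesn't work when passed to the `io.BufferedReader` constructor: `AttributeError: 'file' object has no attribute 'readable'`. – bignose

Right, I reread the question and see what you're getting at now. As you determined in your own answer, I don't think you can do this for both Py2 and Py3 without some testing of the object type and branching. – jbg

OK, this seems to be a complete solution for all the cases mentioned in the question, tested with Python 2.7 and Python 3.5. The general solution ends up re-opening the file descriptor, but instead of io.BytesIO you need to use a pipe for your test double so that you have a file descriptor.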

import io 
import subprocess 
import os 

# Example function, re-opens a file descriptor for UTF-8 decoding, 
# reads until EOF and prints what is read. 
def read_as_utf8(fileno): 
    fp = io.open(fileno, mode="r", encoding="utf-8", closefd=False) 
    print(fp.read()) 
    fp.close() 

# Subprocess 
gpg = subprocess.Popen(["gpg", "--version"], stdout=subprocess.PIPE) 
read_as_utf8(gpg.stdout.fileno()) 

# Normal file (contains "Lorem ipsum." as UTF-8 bytes) 
normal_file = open("loremipsum.txt", "rb") 
read_as_utf8(normal_file.fileno()) # prints "Lorem ipsum." 

# Pipe (for test harness - write whatever you want into the pipe) 
pipe_r, pipe_w = os.pipe() 
os.write(pipe_w, "Lorem ipsum.".encode("utf-8")) 
os.close(pipe_w) 
read_as_utf8(pipe_r) # prints "Lorem ipsum." 
os.close(pipe_r)
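
The pipe setup can be factored into a test helper. The sketch below is one possible shape for it; `pipe_with_bytes` is a hypothetical name, and it assumes the payload is small enough to fit in the operating system's pipe buffer, since a larger write would block with no reader attached.

```python
import io
import os


def pipe_with_bytes(payload):
    """Hypothetical test helper: return the read end of a pipe preloaded with payload."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)
    finally:
        # Closing the write end lets the reader see EOF.
        os.close(write_fd)
    return read_fd


read_fd = pipe_with_bytes("Lorem ipsum.".encode("utf-8"))

# closefd defaults to True, so leaving the with-block also closes read_fd.
with io.open(read_fd, mode="r", encoding="utf-8") as text_stream:
    content = text_stream.read()

print(content)
```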

Source

2015-12-30 13:59:48 jbg

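Already addressed in the question: test doubles are not real files. `io.open` won't work because the test doubles can't be re-opened by path or by file handle. – bignose

As stated in the answer, I'm working around that by using a pipe instead of BytesIO for the test double... or is there some reason you're constrained to use BytesIO? It seems to me that BytesIO (on Python 2) not being enough "like" the object you use in the real code is a good reason not to use it as a test double... – jbg

The entire unit test suite uses `io.StringIO` and `io.BytesIO` as test doubles for a great many file operations. I'm ruling out "make a special set of test doubles just for this case" as a solution. I'm looking for a solution that works with the usual fake files (those that inherit from io.IOBase) and with normal files on both Python versions. – bignose

I needed this as well, but based on the thread here, I concluded that it is not possible using just Python 2's io module. While this breaks your "special treatment for file" rule, the technique I went with was to create an extremely thin wrapper for file (code below) that can then be wrapped in an io.BufferedReader, which in turn can be passed to the io.TextIOWrapper constructor. It will be a pain to unit test, as obviously the new code path can't be tested on Python 3.

Incidentally, the reason that the result of open() can be passed directly to io.TextIOWrapper in Python 3 is that a binary-mode open() actually returns an io.BufferedReader instance to begin with (at least on Python 3.4, which is where I was testing at the time).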

import io
import six  # for six.PY2

if six.PY2:
    class _ReadableWrapper(object):
        """Thin adapter adding the methods io.BufferedReader expects to a file."""

        def __init__(self, raw):
            self._raw = raw

        def readable(self):
            return True

        def writable(self):
            return False

        def seekable(self):
            return True

        def __getattr__(self, name):
            # Delegate everything else to the wrapped file object.
            return getattr(self._raw, name)

def wrap_text(stream, *args, **kwargs):
    # Note: order important here, as 'file' doesn't exist in Python 3
    if six.PY2 and isinstance(stream, file):
        stream = io.BufferedReader(_ReadableWrapper(stream))

    # Forward extra arguments (e.g. encoding) to io.TextIOWrapper.
    return io.TextIOWrapper(stream, *args, **kwargs)

At least this is small, so hopefully it minimizes the exposure of the parts that cannot easily be unit tested.
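
The remark about binary-mode open() can be checked directly. The following sketch assumes Python 3 (on Python 2 the built-in open() returns a `file` object and the wrapping step fails as described in the question); the temporary file and variable names are illustrative only.

```python
import io
import os
import tempfile

# Create a small real file to open in binary mode.
fd, path = tempfile.mkstemp()
os.write(fd, "Lorem ipsum".encode("utf-8"))
os.close(fd)

try:
    with open(path, "rb") as binary_file:
        # On Python 3 this is already an io.BufferedReader.
        is_buffered = isinstance(binary_file, io.BufferedReader)
        text_stream = io.TextIOWrapper(binary_file, encoding="utf-8")
        text = text_stream.read()
finally:
    os.remove(path)

print(is_buffered)
print(text)
```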

Source

2017-02-19 18:13:55 Vek

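Here is some code that I have tested on both Python 2.7 and Python 3.6.

The key here is that you first need to call detach() on your previous stream. This does not close the underlying file; it just rips out the raw stream object so that it can be reused. detach() returns an object that can be wrapped with TextIOWrapper.

As an example, I open a file in binary read mode, do a read on it in that mode, and then switch to a UTF-8 decoded text stream via io.TextIOWrapper.

I saved this example as this-file.py: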

import io 

fileName = 'this-file.py' 
fp = io.open(fileName,'rb') 
fp.seek(20) 
someBytes = fp.read(10) 
print(type(someBytes), len(someBytes))

# now let's do some wrapping to get a new text (non-binary) stream 
pos = fp.tell() # we're about to lose our position, so let's save it 
newStream = io.TextIOWrapper(fp.detach(),'utf-8') # FYI -- fp is now unusable 
newStream.seek(pos) 
theRest = newStream.read() 
print(type(theRest), len(theRest))

Here is what I get when I run it with both python2 and python3:

$ python2.7 this-file.py 
(<type 'str'>, 10) 
(<type 'unicode'>, 406) 
$ python3.6 this-file.py 
<class 'bytes'> 10 
<class 'str'> 406

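Obviously the print syntax is different and, as expected, the variable types differ between Python versions, but it works as it should in both cases.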
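
The same detach() sequence can be tried without touching the filesystem. This is a sketch under the assumption that the stream is an io buffered object (here an io.BufferedReader over an io.BytesIO); the payload and variable names are made up for illustration.

```python
import io

buffered = io.BufferedReader(io.BytesIO("header:Lorem ipsum".encode("utf-8")))

prefix = buffered.read(7)    # consume some bytes in binary mode
position = buffered.tell()   # remember the logical position before detaching

raw = buffered.detach()      # buffered itself is unusable from here on
text_stream = io.TextIOWrapper(raw, encoding="utf-8")
text_stream.seek(position)   # restore the position on the new wrapper
rest = text_stream.read()

print(prefix)
print(rest)
```

Saving the position and seeking afterwards mirrors the file-based example, so the result does not depend on where detach() leaves the raw stream.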

Source

2017-03-01 23:15:48
