'utf-8' decode error in the TensorFlow tutorial

I get a 'utf-8' decode error when I run the following lines from the TensorFlow tutorial:

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/home/fqiao/development/MNIST_data/', one_hot=True)

This produces the following strange error:

  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 199, in read_data_sets
    train_images = extract_images(local_file)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 58, in extract_images
    magic = _read32(bytestream)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/examples/tutorials/mnist/input_data.py", line 51, in _read32
    return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
  File "/usr/lib/python3.5/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.5/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.5/gzip.py", line 404, in _read_gzip_header
    magic = self._fp.read(2)
  File "/usr/lib/python3.5/gzip.py", line 91, in read
    self.file.read(size-self._length+read)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/default/_gfile.py", line 45, in sync
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/default/_gfile.py", line 199, in read
    return self._fp.read(n)
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

However, if I run the code from input_data.py directly, everything seems to work:

>>> dt = numpy.dtype(numpy.uint32).newbyteorder('>') 
>>> f = tf.gfile.Open('/home/fqiao/development/MNIST_data/train-images-idx3-ubyte.gz', 'rb') 
>>> bytestream = gzip.GzipFile(fileobj=f) 
>>> testbytes = numpy.frombuffer(bytestream.read(4), dtype=dt)[0] 
>>> testbytes 
2051

Does anyone have an idea what is going on?

My system: Ubuntu 15.10 x64, Python 3.5.0.

Source

2016-02-19 Cescante

This looks like a text encoding problem; please check the text encoding of the file. – Cesar

This bug has been fixed by the recent change 555e73d. The MNIST files need to be opened in binary mode ('rb') rather than text mode ('r').
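To illustrate why the mode matters, here is a minimal sketch that uses only the Python standard library rather than TensorFlow's gfile, so it is not the code from the fix itself. The file name and the helper names `read_magic` and `text_mode_fails` are made up for the example. A gzip file starts with the bytes 0x1f 0x8b, and 0x8b is not a valid UTF-8 start byte, which matches the "byte 0x8b in position 1" in the traceback.

```python
import gzip
import os
import struct
import tempfile

# Build a tiny gzip file whose payload is the MNIST image magic number 2051.
# The file name is hypothetical; it only mimics the MNIST naming.
path = os.path.join(tempfile.mkdtemp(), "sample-idx3-ubyte.gz")
with gzip.open(path, "wb") as f:
    f.write(struct.pack(">I", 2051))


def read_magic(path):
    """Open the file in binary mode and read the first big-endian uint32."""
    with open(path, "rb") as raw:
        with gzip.GzipFile(fileobj=raw) as bytestream:
            return struct.unpack(">I", bytestream.read(4))[0]


def text_mode_fails(path):
    """Return True if reading the file in UTF-8 text mode raises UnicodeDecodeError."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read(2)  # the gzip header bytes are decoded as text here
    except UnicodeDecodeError:
        return True
    return False


magic = read_magic(path)
failed = text_mode_fails(path)
print(magic, failed)
```

In binary mode the raw bytes reach the gzip reader untouched, whereas in text mode Python tries to decode them as UTF-8 first and fails on the second header byte.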

Source

2016-02-20 01:56:27 Cescante

In my case, the problem was the encoding of the data file.

Open the file with vim and run:

:set fileencoding=utf-8

That solved the problem in my case.

Source

2016-12-28 13:27:40 wael34218
