I am running this model on an AWS p2.xlarge instance. It fails with a MemoryError in TensorFlow:
Exception in thread Thread-16:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/ubuntu/tensorflow/models/summarization/textsum/batch_reader.py", line 136, in _FillInputQueue
    (article, abstract) = input_gen.next()
  File "/home/ubuntu/tensorflow/models/summarization/textsum/batch_reader.py", line 245, in _TextGenerator
    e = example_gen.next()
  File "/home/ubuntu/tensorflow/models/summarization/textsum/data.py", line 109, in ExampleGen
    example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0]
MemoryError
The system storage information is:
Filesystem      Size  Used Avail Use% Mounted on
udev             30G     0   30G   0% /dev
tmpfs           6.0G  8.9M  6.0G   1% /run
/dev/xvda1       30G   12G   18G  39% /
tmpfs            30G     0   30G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            30G     0   30G   0% /sys/fs/cgroup
tmpfs           6.0G     0  6.0G   0% /run/user/1000
NVIDIA status:
$ lspci | grep -i nvidia
00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
What is the solution for this?
If I replace
str_len = struct.unpack('q', len_bytes)[0]
with
str_len = struct.unpack('Bi', len_bytes)[0]
then this error goes away, but a new error comes up:
Exception in thread Thread-15:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 136, in _FillInputQueue
    (article, abstract) = input_gen.next()
  File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 248, in _TextGenerator
    article_text = self._GetExFeatureText(e, self._article_key)
  File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 265, in _GetExFeatureText
    return ex.features.feature[key].bytes_list.value[0]
IndexError: list index (0) out of range
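If I print example_str to the screen, its value is displayed. But when I try to print ex.features.feature[key].bytes_list.value, it comes back empty.
What should I do to resolve all of this?
Here are the steps I followed: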
>>> import tensorflow as tf
>>> import struct
>>> from tensorflow.core.example import example_pb2
>>> reader = open('data/training-1', 'rb')
>>> len_bytes = reader.read(8)
>>> str_len = struct.unpack('q', len_bytes)[0]
>>> str_len
2335523720558635124
>>> example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> str_len = struct.unpack('Bi', len_bytes)[0]
>>> str_len
116
>>> example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0]
>>> e = example_pb2.Example.FromString(example_str)
>>> e.features.feature['article'].bytes_list.value
<google.protobuf.pyext._message.RepeatedScalarContainer object at 0x7fc25c9325a8>
>>> e.features.feature['article'].bytes_list.value[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index (0) out of range
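A note on the two str_len values in the session above. The sketch below re-packs the reported value 2335523720558635124 into the 8 bytes it came from and shows what the 'Bi' format actually reads. It assumes a little-endian machine (typical for x86 AWS instances) and a platform where struct.calcsize('Bi') is 8. If the bytes turn out to be readable ASCII, that may indicate the file starts with plain text rather than a binary length prefix, but this is an inference from the numbers shown, not something confirmed against the actual file.

```python
import struct

# Value printed for str_len in the interactive session above.
str_len = 2335523720558635124

# Re-pack it into the 8 bytes that were read from the file.
# Assumption: little-endian byte order, as on x86.
first_bytes = struct.pack('<q', str_len)
print(repr(first_bytes))

# 'Bi' reads one unsigned byte, skips 3 alignment padding bytes, then reads a
# 4-byte int. Index [0] is therefore only the first byte of the file, not a
# record length.
first_byte = struct.unpack('Bi', first_bytes)[0]
print(first_byte, chr(first_byte))
```

If this reasoning holds, 116 is just the character code of the file's first byte, so reading 116 bytes and parsing them as an Example would produce a message without a usable 'article' feature, which is consistent with the IndexError.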
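Without more code for context it is hard to say anything. Could you condense this into a minimal but runnable example?
@AllenLavoie I have updated the question with the sample code I am trying to run with TensorFlow.
So the article feature is empty? Is there any reason to think it shouldn't be? Printing the whole example (`print(e)`) to see what was parsed might be useful. I'm also not sure what is going on with the `struct` usage: maybe the [TFRecord](https://www.tensorflow.org/api_guides/python/python_io) format would be a more stable storage format?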
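On the `struct` usage: the reading code in the traceback and the session implies a layout of an 8-byte length ('q') followed by that many payload bytes. Here is a minimal sketch of writing and reading that layout with a sanity check on the length. The names write_record, read_records and MAX_RECORD_BYTES are hypothetical, and plain byte strings stand in for serialized Example messages so the sketch runs without TensorFlow.

```python
import io
import struct

# Hypothetical sanity limit (1 GiB); not taken from the textsum code.
MAX_RECORD_BYTES = 1 << 30


def write_record(stream, payload):
    """Write one record: 8-byte native 'q' length, then the payload bytes."""
    stream.write(struct.pack('q', len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield payloads until the stream ends, rejecting implausible lengths."""
    while True:
        len_bytes = stream.read(8)
        if not len_bytes:
            return  # clean end of file
        if len(len_bytes) != 8:
            raise ValueError('truncated length prefix')
        str_len = struct.unpack('q', len_bytes)[0]
        if not 0 <= str_len <= MAX_RECORD_BYTES:
            raise ValueError(
                'implausible record length %d; first bytes %r suggest the '
                'file is not in length-prefixed binary form'
                % (str_len, len_bytes))
        payload = stream.read(str_len)
        if len(payload) != str_len:
            raise ValueError('truncated record')
        yield payload


# Round trip through an in-memory buffer.
buf = io.BytesIO()
write_record(buf, b'first example')
write_record(buf, b'second example')
buf.seek(0)
records = list(read_records(buf))
print(records)
```

With a check like this, feeding a plain-text file to the reader would raise a ValueError that shows the offending bytes, instead of attempting a huge read and ending in MemoryError.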