Bioformats-Python error: 'ascii' codec can't encode character u'\xb5' when using OMEXML()

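I would like to use bioformats in Python to read microscope images (.lsm, .czi, .lif, you name it), print out the metadata, and display the image. ome = bf.OMEXML(md) gives me an error (below). I think it is complaining about the information stored in md: it doesn't like that not all of it is ASCII. But how do I overcome this problem? This is what I wrote: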

import Tkinter as Tk, tkFileDialog 
import os 
import javabridge as jv 
import bioformats as bf 
import matplotlib.pyplot as plt 
import numpy as np 

jv.start_vm(class_path=bf.JARS, max_heap_size='12G')

User selects the file to work with

# hiding root allows the file dialog GUI to be shown without any other GUI elements
root = Tk.Tk()
root.withdraw()
file_full_path = tkFileDialog.askopenfilename()
filepath, filename = os.path.split(file_full_path)
os.chdir(os.path.dirname(file_full_path))

print('opening: %s' % filename)
reader = bf.ImageReader(file_full_path)
md = bf.get_omexml_metadata(file_full_path)
ome = bf.OMEXML(md)

Show the wanted metadata

iome = ome.image(0)  # e.g. first image
print(iome.get_Name())
print(iome.Pixels.get_SizeX())
print(iome.Pixels.get_SizeY())

Put the image in a numpy array

raw_data = []
for z in range(iome.Pixels.get_SizeZ()):
    raw_image = reader.read(z=z, series=0, rescale=False)
    raw_data.append(raw_image)
raw_data = np.array(raw_data)

Here's the error I get:

--------------------------------------------------------------------------- 
UnicodeEncodeError      Traceback (most recent call last) 
<ipython-input-22-a22c1dbbdd1e> in <module>() 
    11 reader = bf.ImageReader(file_full_path) 
    12 md = bf.get_omexml_metadata(file_full_path) 
---> 13 ome = bf.OMEXML(md) 

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml) 
    318   if isinstance(xml, str): 
    319    xml = xml.encode("utf-8") 
--> 320   self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml)) 
    321 
    322   # determine OME namespaces 

<string> in XML(text) 

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128)

Here is a representative test image in a proprietary microscope format.

2017-04-24 puifais

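Could you upload a sample image?

@MaximilianPeters, I just added an .lsm file for testing. Any suggestions would be much appreciated. Thank you! – puifais

Thanks for adding a sample image. That is very helpful!

Let's first remove all the unnecessary Tkinter code until we get a Minimal, Complete and Verifiable Example that allows us to reproduce your error message.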

import javabridge as jv 
import bioformats as bf 

jv.start_vm(class_path=bf.JARS, max_heap_size='12G') 

file_full_path = '/path/to/Cell1.lsm' 

md = bf.get_omexml_metadata(file_full_path) 

ome = bf.OMEXML(md) 

jv.kill_vm()

We first get some warning messages about 3i SlideBook SlideBook6Reader library not found, but we can apparently ignore those.

Your error message reads UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128), so let's see what we can find around position 1623.

If you add print md after md = bf.get_omexml_metadata(file_full_path), the whole XML with the metadata is printed out. Let's zoom in:

>>> print md[1604:1627] 
PhysicalSizeXUnit="µm"
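When the position is not known in advance, the same inspection can be done by scanning the string for characters outside the ASCII range. The following is a minimal sketch: the helper name non_ascii_positions and the sample string are made up for illustration and stand in for the real metadata.

```python
def non_ascii_positions(text, context=10):
    """Return (index, character, surrounding snippet) for each non-ASCII character."""
    hits = []
    for index, char in enumerate(text):
        if ord(char) > 127:
            start = max(0, index - context)
            hits.append((index, char, text[start:index + context]))
    return hits


# Hypothetical stand-in for the XML returned by bf.get_omexml_metadata().
sample = u'<Pixels PhysicalSizeX="0.1" PhysicalSizeXUnit="\xb5m"/>'

hits = non_ascii_positions(sample)
for index, char, snippet in hits:
    print(index, repr(char), repr(snippet))
```

Each reported index can then be used in a slice such as md[index - 20:index + 5] to see the surrounding XML.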

So the µ character is the culprit: it can't be encoded with the 'ascii' codec.
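The failure can be reproduced without bioformats at all. This small sketch only uses the unit value from the metadata above; the variable names are illustrative. It shows that the ASCII codec rejects the character while UTF-8 encodes it as two bytes.

```python
unit = u"\xb5m"  # the PhysicalSizeXUnit value, i.e. the micro sign followed by "m"

try:
    unit.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    print("ascii failed at index %d: %s" % (exc.start, exc.reason))

# UTF-8 can represent the character; the micro sign becomes the bytes C2 B5.
utf8_bytes = unit.encode("utf-8")
print(repr(utf8_bytes))
```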

Looking back at the traceback:

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml) 
    318   if isinstance(xml, str): 
    319    xml = xml.encode("utf-8") 
--> 320   self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml)) 
    321 
    322   # determine OME namespaces

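We see that in the lines just before the error occurs, xml is encoded to utf-8, which should solve our problem. So why is that not happening?

If we add print type(md), we get back <type 'unicode'> and not <type 'str'> as the code expects. The isinstance(xml, str) check is therefore False, the encoding step is skipped, and the unicode object reaches ElementTree.fromstring unencoded. So this is a bug in omexml.py!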

To solve this, do the following (you might need to be root):

Go to /anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/
Remove omexml.pyc
In omexml.py, change line 318 from if isinstance(xml, str): to if isinstance(xml, basestring):

basestring is the superclass of str and unicode. It is used to test whether an object is an instance of either str or unicode.
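The effect of the patched check can be sketched in isolation with the standard library's ElementTree. The helper parse_xml below is hypothetical and only mirrors the idea of the changed line; it is not the library's actual code. The try/except around basestring is an addition so the sketch also runs on Python 3, where basestring no longer exists.

```python
from xml.etree import ElementTree

try:
    text_types = (basestring,)  # Python 2: matches both str and unicode
except NameError:
    text_types = (str,)         # Python 3: str is already Unicode text


def parse_xml(xml):
    """Hypothetical helper: encode any text input to UTF-8 before parsing."""
    if isinstance(xml, text_types):
        xml = xml.encode("utf-8")
    return ElementTree.fromstring(xml)


# Illustrative one-element document containing the problematic character.
root = parse_xml(u'<Pixels PhysicalSizeXUnit="\xb5m"/>')
print(repr(root.get("PhysicalSizeXUnit")))
```

With the broader type check, a unicode input is encoded before it reaches the parser, so the parser never has to fall back to the ASCII codec.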

I wanted to file a bug for this, but it seems there is already an open issue.
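If editing the installed package is not an option, one possible workaround is to make the metadata ASCII-only before parsing by turning every non-ASCII character into an XML character reference. This is not part of the fix described above, and the sketch below only demonstrates the idea with the standard library's ElementTree on a made-up one-element document. Whether bf.OMEXML accepts the resulting string in place of md is an assumption that is not verified here.

```python
from xml.etree import ElementTree

# Hypothetical stand-in for the XML returned by bf.get_omexml_metadata().
md = u'<Pixels PhysicalSizeXUnit="\xb5m"/>'

# Replace each non-ASCII character with a numeric reference such as &#181;
ascii_md = md.encode("ascii", "xmlcharrefreplace")

# An XML parser resolves the reference back to the original character.
root = ElementTree.fromstring(ascii_md)
print(repr(ascii_md))
print(repr(root.get("PhysicalSizeXUnit")))
```

Character references are valid in attribute values and element text, but not in element names, comments, or CDATA sections, so this approach assumes the non-ASCII characters only occur in values.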

2017-04-26 09:59:33 BioGeek

Thank you so much! (Oh, and your bio gave me a laugh :)) – puifais
