I think you are working with Python 3, which uses Unicode as the default string type. Bytestrings then get the special `b` marker.
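As a minimal sketch of where the marker comes from: `csv.writer` converts non-string fields with `str()`, and in Python 3 `str(b'xxx')` is `"b'xxx'"`. The in-memory buffer and the two sample rows below are illustrative assumptions, not part of the original data.

```python
import csv
import io

# Write to an in-memory text buffer instead of a real file (illustrative only).
buf = io.StringIO()
writer = csv.writer(buf, delimiter=',')

writer.writerow([b'xxx', 0, 0.0])  # bytes field: written via str(), so the b'' marker appears
writer.writerow(['xxx', 0, 0.0])   # str field: written as-is

output = buf.getvalue()
print(output)
```

The first row keeps the `b'...'` wrapper, while the second is written as plain text.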
If I generate the data with Unicode instead of bytes, this works:
In [654]: data1 = np.zeros((3,),dtype=("U24,int,float"))
In [655]: data1['f0']='xxx' # more interesting string field
In [656]: with open('test.csv','w') as f:
   .....:     writer=csv.writer(f,delimiter=',')
   .....:     for row in data1:
   .....:         writer.writerow(row)
In [658]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0
`np.savetxt` does the same thing:
In [668]: np.savetxt('test.csv',data1,fmt='%s',delimiter=',')
In [669]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0
The question is, can I work around this while keeping the `S24` field? For example, by opening the file as `wb`?
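I explored this issue in an earlier question, [Trying to strip b' ' from my Numpy array](https://stackoverflow.com/a/27513196/901925). It looks like my solutions there were either to `decode` the byte fields or to write directly to a byte file. Since your array has a mix of string and numeric fields, the `decode` solution is a bit more tedious.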
data1 = data.astype('U24,i,f') # convert bytestring field to unicode
A helper function can be used to `decode` the bytestrings on the fly:
In [147]: fn = lambda row: [j.decode() if isinstance(j,bytes) else j for j in row]
In [148]: with open('test.csv','w') as f:
   .....:     writer=csv.writer(f,delimiter=',')
   .....:     for row in data:
   .....:         writer.writerow(fn(row))
   .....:
In [149]: cat test.csv
xxx,0,0.0
yyy,0,0.0
zzz,0,0.0
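The other option mentioned above, writing directly to a byte file, might look roughly like the sketch below. The helper name `write_rows_as_bytes` is hypothetical, and plain tuples stand in for the rows of the structured array (an assumption made so the example runs without NumPy); bytes fields are written unchanged and everything else goes through `str()` and is encoded.

```python
import os
import tempfile


def write_rows_as_bytes(path, rows, delimiter=b','):
    """Hypothetical helper: write each row to a binary file, one line per row."""
    with open(path, 'wb') as f:
        for row in rows:
            # Bytes fields pass through untouched; other fields are
            # converted with str() and encoded as ASCII.
            fields = [j if isinstance(j, bytes) else str(j).encode('ascii')
                      for j in row]
            f.write(delimiter.join(fields) + b'\n')


# Stand-in rows (assumption): one bytes field, one int, one float.
rows = [(b'xxx', 0, 0.0), (b'yyy', 0, 0.0), (b'zzz', 0, 0.0)]

path = os.path.join(tempfile.mkdtemp(), 'test.csv')
write_rows_as_bytes(path, rows)

with open(path, 'rb') as f:
    contents = f.read()
print(contents.decode())
```

This avoids decoding the fields altogether, at the cost of handling delimiters and line endings by hand instead of through the `csv` module.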
This looks like [open issue #4543](https://github.com/numpy/numpy/issues/4543) – askewchan