I'm using Python 3.5.2 with pytesseract, and I get the error TypeError: a bytes-like object is required, not 'str' when I run my code (details below).
Code (File "D:/test.py"):
# -*- coding: utf-8 -*-
try:
    import Image  # old standalone PIL layout
except ImportError:
    from PIL import Image  # Pillow / packaged PIL layout
import pytesseract
print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
print(pytesseract.image_to_string(Image.open('d:/testimages/mobile.gif')))
Error:
Traceback (most recent call last):
  File "D:/test.py", line 11, in <module>
    print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 164, in image_to_string
    errors = get_errors(error_string)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 112, in get_errors
    error_lines = tuple(line for line in lines if line.find('Error') >= 0)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 112, in <genexpr>
    error_lines = tuple(line for line in lines if line.find('Error') >= 0)
TypeError: a bytes-like object is required, not 'str'
What should I do?
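For reference, the TypeError itself can be reproduced without Tesseract. This is only a minimal sketch: it assumes that the lines inside get_errors() are bytes, which the traceback suggests but does not actually show. The sample message and the helper name find_with_str are made up for illustration.

```python
# Hypothetical sample of error output as bytes (not real Tesseract output).
raw_line = b"Error opening data file"

def find_with_str(line):
    """Mimic the failing call: search `line` using a str pattern."""
    try:
        return line.find('Error')
    except TypeError as exc:
        # The exact wording of the message depends on the Python version.
        return "TypeError: {}".format(exc)

# Searching bytes with a str pattern raises TypeError on Python 3.
result_bytes = find_with_str(raw_line)
# After decoding to str, the same search succeeds.
result_text = find_with_str(raw_line.decode("utf-8"))

print(result_bytes)
print(result_text)
```

If that assumption holds, the failure is a bytes/str mismatch inside the library rather than anything wrong with the images.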
Edit:
I downloaded the trained data into C:\Program Files (x86)\Tesseract-OCR\tessdata. I then inserted the line error_string = error_string.decode("utf-8") into get_errors(), and the error now looks like this:
Traceback (most recent call last):
  File "D:/test.py", line 11, in <module>
    print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 165, in image_to_string
    raise TesseractError(status, errors)
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\tessdata/chi_sim.traineddata')
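A decode step like the one described in the edit might look roughly as follows. This is a sketch under stated assumptions, not pytesseract's actual source: only the error_lines expression is taken from the traceback, while the function name get_errors_sketch, the isinstance check, the UTF-8 assumption and the fallback return value are hypothetical.

```python
def get_errors_sketch(error_string):
    """Return the lines that mention 'Error', accepting bytes or str.

    Hypothetical stand-in for get_errors(); not the library's real code.
    """
    if isinstance(error_string, bytes):
        # Assumption: the output is UTF-8; undecodable bytes are replaced.
        error_string = error_string.decode("utf-8", errors="replace")
    lines = error_string.splitlines()
    # Same filter as in the traceback, now operating on str lines.
    error_lines = tuple(line for line in lines if line.find('Error') >= 0)
    if error_lines:
        return "\n".join(error_lines)
    # Fall back to the whole (stripped) message when nothing matches.
    return error_string.strip()


# Made-up sample input, for illustration only.
sample = b"Tesseract Open Source OCR Engine\nError opening data file x\n"
print(get_errors_sketch(sample))
```

Checking the type before decoding keeps the function usable whether the caller passes bytes or already-decoded text.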
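There are still some other problems, please see my edit. – zwl1619
@zwl1619: I don't know how pytesseract works. Fixing the encoding error shows that the trained data is not installed the way Tesseract expects. That error was being raised all along, but because of the encoding problem you never got to see it. Maybe it's some kind of permissions issue? –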
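One way to narrow down whether the file is missing or merely unreadable is a quick check from Python. This is a rough sketch using only the standard library; the helper name check_traineddata is hypothetical, and the directory is the one quoted in the question, which may not be where Tesseract actually looks.

```python
import os

def check_traineddata(tessdata_dir, lang):
    """Report whether <tessdata_dir>/<lang>.traineddata exists and is readable."""
    path = os.path.join(tessdata_dir, lang + ".traineddata")
    return {
        "path": path,
        "exists": os.path.isfile(path),
        # os.access is only a hint; actual permission rules can be stricter.
        "readable": os.access(path, os.R_OK),
    }

if __name__ == "__main__":
    # Directory taken from the question; adjust it to the real install.
    report = check_traineddata(
        r"C:\Program Files (x86)\Tesseract-OCR\tessdata", "chi_sim")
    print(report)
```

If the file exists and is readable here, the path in the error message (which has no drive letter) would be the next thing to look at.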