Why is this error returned? tesseract (v3.03) output as PDF

[email protected] ~/ocr_test # tesseract -l dan pdf.png out pdf 
Tesseract Open Source OCR Engine v3.03 with Leptonica 
Error opening data file /usr/local/share/tessdata/osd.traineddata 
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. 
Failed loading language 'osd' 
Tesseract couldn't load any languages! 
Warning: Auto orientation and script detection requested, but osd language failed to load

List of languages

[email protected] ~/ocr_test # tesseract --list-langs 
List of available languages (3): 
eng 
dan 
dan-frak

Output as txt

This works fine and writes the text to out.txt

tesseract -l dan pdf.png out

Output as PDF

This creates out.pdf but also returns the error shown above, and searching for text in the resulting PDF does not work properly

tesseract -l dan pdf.png out pdf

Source

2014-03-02 clarkk

The error message is clear: it needs the osd.traineddata file. You can install or download the orientation & script detection data for Tesseract from https://github.com/tesseract-ocr/tessdata.
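To confirm that the missing file really is the cause before downloading anything, a small check along these lines may help. It is only a sketch: `has_traineddata` is a hypothetical helper name, and the default directory is taken from the error message in the question, so adjust it if your tessdata directory lives elsewhere.

```shell
# Report whether a traineddata file for the given language exists.
# Usage: has_traineddata LANG [TESSDATA_DIR]
# Assumption: the default directory is the one named in the error message.
has_traineddata() {
    lang="$1"
    dir="${2:-/usr/local/share/tessdata}"
    if [ -f "$dir/$lang.traineddata" ]; then
        echo "found: $dir/$lang.traineddata"
        return 0
    fi
    echo "missing: $dir/$lang.traineddata"
    return 1
}

# Check for the file that the PDF output step asked for.
has_traineddata osd || echo "copy osd.traineddata into the tessdata directory, then rerun tesseract"
```

If the file is reported missing, placing osd.traineddata from the repository next to dan.traineddata and rerunning `tesseract -l dan pdf.png out pdf` should make the "Failed loading language 'osd'" message go away. Which revision of the data file matches v3.03 is not stated in this thread, so check that against the repository.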

Source

2014-03-02 22:20:57 nguyenq

The repository has moved to https://github.com/tesseract-ocr/tessdata – Joe

How do I install it? – happybuddha
