2017-06-24 203 views

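I am using tesseract to perform OCR on screen grabs. My application has a tkinter window and uses self.after in the initialization of my class to scrape images continuously and update labels and other values in the window. I have searched for several days and cannot find a specific example of how to use CREATE_NO_WINDOW with Python 3.6 on Windows when calling tesseract through pytesseract. How do I hide the console window when I run tesseract with pytesseract using CREATE_NO_WINDOW?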

This is related to the following question:

How can I hide the console window when I run tesseract with pytesser

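I have only been programming in Python for 2 weeks and do not understand how to carry out the steps in the question above. I opened the pytesseract.py file, reviewed it, and found the line proc = subprocess.Popen(command, stderr=subprocess.PIPE), but when I tried editing it I got a number of errors that I could not figure out.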

#!/usr/bin/env python 

''' 
Python-tesseract. For more information: https://github.com/madmaze/pytesseract 

''' 

try: 
    import Image 
except ImportError: 
    from PIL import Image 

import os 
import sys 
import subprocess 
import tempfile 
import shlex 


# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY 
tesseract_cmd = 'tesseract' 

__all__ = ['image_to_string'] 


def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False,
                  config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`

    returns the exit status of tesseract, as well as tesseract's stderr output

    '''
    command = [tesseract_cmd, input_filename, output_filename_base]

    if lang is not None:
        command += ['-l', lang]

    if boxes:
        command += ['batch.nochop', 'makebox']

    if config:
        command += shlex.split(config)

    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
    status = proc.wait()
    error_string = proc.stderr.read()
    proc.stderr.close()
    return status, error_string


def cleanup(filename):
    ''' tries to remove the given filename. Ignores non-existent files '''
    try:
        os.remove(filename)
    except OSError:
        pass


def get_errors(error_string):
    '''
    returns all lines in the error_string that start with the string "error"

    '''

    error_string = error_string.decode('utf-8')
    lines = error_string.splitlines()
    error_lines = tuple(line for line in lines if line.find(u'Error') >= 0)
    if len(error_lines) > 0:
        return u'\n'.join(error_lines)
    else:
        return error_string.strip()


def tempnam(): 
    ''' returns a temporary file-name ''' 
    tmpfile = tempfile.NamedTemporaryFile(prefix="tess_") 
    return tmpfile.name 


class TesseractError(Exception):
    def __init__(self, status, message):
        self.status = status
        self.message = message
        self.args = (status, message)


def image_to_string(image, lang=None, boxes=False, config=None):
    '''
    Runs tesseract on the specified image. First, the image is written to disk,
    and then the tesseract command is run on the image. Tesseract's result is
    read, and the temporary files are erased.

    Also supports boxes and config:

    if boxes=True
        "batch.nochop makebox" gets added to the tesseract call

    if config is set, the config gets appended to the command.
        ex: config="-psm 6"
    '''

    if len(image.split()) == 4:
        # In case we have 4 channels, lets discard the Alpha.
        # Kind of a hack, should fix in the future some time.
        r, g, b, a = image.split()
        image = Image.merge("RGB", (r, g, b))

    input_file_name = '%s.bmp' % tempnam()
    output_file_name_base = tempnam()
    if not boxes:
        output_file_name = '%s.txt' % output_file_name_base
    else:
        output_file_name = '%s.box' % output_file_name_base
    try:
        image.save(input_file_name)
        status, error_string = run_tesseract(input_file_name,
                                             output_file_name_base,
                                             lang=lang,
                                             boxes=boxes,
                                             config=config)
        if status:
            errors = get_errors(error_string)
            raise TesseractError(status, errors)
        f = open(output_file_name, 'rb')
        try:
            return f.read().decode('utf-8').strip()
        finally:
            f.close()
    finally:
        cleanup(input_file_name)
        cleanup(output_file_name)


def main():
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        try:
            image = Image.open(filename)
            if len(image.split()) == 4:
                # In case we have 4 channels, lets discard the Alpha.
                # Kind of a hack, should fix in the future some time.
                r, g, b, a = image.split()
                image = Image.merge("RGB", (r, g, b))
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image))
    elif len(sys.argv) == 4 and sys.argv[1] == '-l':
        lang = sys.argv[2]
        filename = sys.argv[3]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image, lang=lang))
    else:
        sys.stderr.write('Usage: python pytesseract.py [-l lang] input_file\n')
        exit(2)


if __name__ == '__main__': 
    main() 

The code I am using comes from an example in a similar question:

import cv2
import numpy as np
import pytesseract
from PIL import Image

# src_path is the image folder prefix, defined elsewhere in the application.


def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write the image after noise removal
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Thresholding to pure black and white would go here
    # (no threshold is actually applied in this version)
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    return result

When execution reaches the following line, a black console window flashes for less than a second and closes once the command has run.

result = pytesseract.image_to_string(Image.open(src_path + "thres.png")) 

Here is a picture of the console window:

[Screenshot: Program Files (x86)_Tesseract]

Here is the suggestion from the other question:

You're currently working in IDLE, in which case I don't think it really matters if a console window pops up. If you're planning to develop a GUI app with this library, then you'll need to modify the subprocess.Popen call in pytesser.py to hide the console. I'd first try the CREATE_NO_WINDOW process creation flag. – eryksun

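I would greatly appreciate any help with modifying the subprocess.Popen call in the pytesseract.py library file to use CREATE_NO_WINDOW. I am also not sure what the difference is between the pytesseract.py and pytesser.py library files. I would leave a comment on the other question asking for clarification, but I cannot until I have more reputation on this site.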

Answer


I did more research and decided to learn more about subprocess.Popen:

Documentation for subprocess

I also referred to the following post:

using python subprocess.popen..can't prevent exe stopped working prompt

I changed the original line of code in pytesseract.py:

proc = subprocess.Popen(command, stderr=subprocess.PIPE) 

to the following:

proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW) 

I ran the code and got the following error:

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 403, in gather_data
    update_cash_button()
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 208, in update_cash_button
    currentCash = get_string(src_path + "cash.png")
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 150, in get_string
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 125, in image_to_string
    config=config)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 49, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)
NameError: name 'CREATE_NO_WINDOW' is not defined

I then defined the CREATE_NO_WINDOW variable:

#Assignment of the value of CREATE_NO_WINDOW 
CREATE_NO_WINDOW = 0x08000000 

I got the value 0x08000000 from the post linked above. After adding the definition, I ran the application and the console window no longer popped up.
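The hard-coded constant is needed on Python 3.6 because the subprocess module only started exposing CREATE_NO_WINDOW in Python 3.7. A minimal sketch of a version-tolerant variant follows. It assumes the same Popen usage as run_tesseract; the names NO_WINDOW_FLAGS and run_without_console are hypothetical and not part of pytesseract.

```python
import subprocess
import sys

if sys.platform == "win32":
    # Python 3.7+ provides subprocess.CREATE_NO_WINDOW; on 3.6 fall back to
    # the Win32 value used in the answer above.
    NO_WINDOW_FLAGS = getattr(subprocess, "CREATE_NO_WINDOW", 0x08000000)
else:
    # creationflags must stay 0 on non-Windows platforms.
    NO_WINDOW_FLAGS = 0


def run_without_console(command):
    """Run command and return (exit status, stderr bytes), like run_tesseract."""
    proc = subprocess.Popen(
        command,
        stderr=subprocess.PIPE,
        creationflags=NO_WINDOW_FLAGS,
    )
    # communicate() waits for exit and reads stderr without risking a
    # blocked child when the pipe buffer fills up.
    _, error_bytes = proc.communicate()
    return proc.returncode, error_bytes
```

Calling run_without_console with the same command list that run_tesseract builds should behave like the edited library line, while still running unchanged on other platforms where the flag is not applicable.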
