Real-time text recognition (OCR)

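Is it possible to run OCR on the iPhone's live camera feed without capturing a photo? The alphanumeric text follows a predictable, sometimes fixed, pattern (such as a serial number).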

I have tried OpenCV and Tesseract, but I can't figure out how to do the image processing on the live camera feed.

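I just don't know at which point I'm supposed to recognize the text I'm expecting. Are there any other libraries I could use for this part?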

2015-06-30 Danialzo

You can do this with TesseractOCR combined with an AVCaptureSession.

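#import <AVFoundation/AVFoundation.h>
#import <UIKit/UIKit.h>

// Assumption: this class is set as the sample buffer delegate of an AVCaptureVideoDataOutput.
// self.scanner (used below) is assumed to be a Tesseract wrapper instance declared and configured elsewhere.
@interface YourClass () <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    BOOL canScanFrame; // set by the timer: the next frame may be scanned
    BOOL isScanning;   // YES while a recognition pass is running
}
@property (strong, nonatomic) NSTimer *timer;

@end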

@implementation YourClass 
//... 
- (void)prepareToScan 
{ 
    //Prepare capture session, preview layer and so on 
    //... 

    self.timer = [NSTimer scheduledTimerWithTimeInterval:0.5 target:self selector:@selector(timerTicked) userInfo:nil repeats:YES]; 
} 

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Drop frames until the timer allows another scan
    if (canScanFrame) {
        canScanFrame = NO;

        CGImageRef imageRef = [self imageFromSampleBuffer:sampleBuffer];
        UIImage *image = [UIImage imageWithCGImage:imageRef scale:1 orientation:UIImageOrientationRight];
        CGImageRelease(imageRef);

        [self.scanner setImage:image];

        isScanning = YES;
        // Recognize on a background queue so the capture callback returns quickly
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSLog(@"scan start");
            [self.scanner recognize];
            NSLog(@"scan stop");
            dispatch_async(dispatch_get_main_queue(), ^{
                isScanning = NO;
                NSString *text = [self.scanner recognizedText];
                //do something with text
            });
        });
    }
}

// Create a CGImageRef from sample buffer data; the caller must release the returned image.
// Assumes the video output delivers non-planar kCVPixelFormatType_32BGRA frames.
- (CGImageRef)imageFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer, 0); // Lock the image buffer

    // Get information about the image. For a non-planar buffer use
    // CVPixelBufferGetBaseAddress; the per-plane variant is for planar formats.
    uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    CGContextRef newContext = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    CGImageRef newImage = CGBitmapContextCreateImage(newContext);
    CGContextRelease(newContext);

    CGColorSpaceRelease(colorSpace);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    return newImage;
}
- (void)timerTicked
{
    // Allow the next frame to be scanned only when no scan is in progress
    if (!isScanning) {
        canScanFrame = YES;
    }
}

@end
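
Since the expected text follows a fixed pattern (such as a serial number), one option for the "do something with text" step is to discard any result that does not match that pattern. Below is only a sketch in plain C, which compiles inside an Objective-C file. The function `matches_pattern` and its pattern syntax ('A' for an uppercase letter, '9' for a digit, anything else literal) are hypothetical and not part of Tesseract; the pattern "AA-999999" in the note below is likewise just an illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `text` has exactly the shape described
 * by `pattern`, 0 otherwise.
 *   'A' in the pattern matches one uppercase ASCII letter,
 *   '9' matches one decimal digit,
 *   any other character must appear literally. */
static int matches_pattern(const char *text, const char *pattern)
{
    size_t n = strlen(pattern);
    if (strlen(text) != n) {
        return 0; /* length mismatch: cannot be a valid code */
    }
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)text[i];
        switch (pattern[i]) {
        case 'A':
            if (!isupper(c)) return 0;
            break;
        case '9':
            if (!isdigit(c)) return 0;
            break;
        default:
            if (text[i] != pattern[i]) return 0;
            break;
        }
    }
    return 1;
}
```

For example, `matches_pattern("AB-123456", "AA-999999")` accepts the string, while a misread such as "A8-123456" is rejected. OCR output may carry surrounding whitespace or newlines, so it is probably worth trimming the recognized string before checking it.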


2015-06-30 11:04:47 arturdev

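Thanks for your answer! It works. Do you have any suggestions for minimizing CPU usage and getting better accuracy? Would you recommend cropping the image to a specific rectangle before sending it to Tesseract, or using 'tesseract.rect'? – Danialzo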

Yes, that will help. Also try converting the image to grayscale before sending it to Tesseract; that improves recognition accuracy. – arturdev
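
To illustrate the two suggestions above (crop to the region of interest, then convert to grayscale), here is a minimal sketch in plain C that works directly on the raw pixel data. It assumes 32-bit BGRA frames, as in the answer's CGBitmapContextCreate call. `crop_to_gray` is a hypothetical helper, not a Tesseract or AVFoundation API, and the weights are the usual integer approximation of the Rec. 601 luma formula.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies the rectangle (x, y, w, h) out of a 32-bit
 * BGRA frame into `dst`, an 8-bit grayscale buffer of w * h bytes.
 * `bytes_per_row` is the stride of the source frame, which may be larger
 * than width * 4. Returns 0 on success, -1 on bad arguments. */
static int crop_to_gray(const uint8_t *bgra, size_t width, size_t height,
                        size_t bytes_per_row,
                        size_t x, size_t y, size_t w, size_t h,
                        uint8_t *dst)
{
    if (bgra == NULL || dst == NULL || w == 0 || h == 0) {
        return -1;
    }
    /* Reject rectangles that extend past the frame (overflow-safe form). */
    if (x > width || w > width - x || y > height || h > height - y) {
        return -1;
    }
    for (size_t row = 0; row < h; row++) {
        const uint8_t *src = bgra + (y + row) * bytes_per_row + x * 4;
        for (size_t col = 0; col < w; col++) {
            unsigned b = src[0];
            unsigned g = src[1];
            unsigned r = src[2]; /* src[3] is alpha and is ignored */
            /* Luma = (77 R + 150 G + 29 B) / 256; the weights sum to 256. */
            dst[row * w + col] = (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
            src += 4;
        }
    }
    return 0;
}
```

Cropping first means the grayscale pass (and the OCR pass after it) touches fewer pixels, which should help with CPU usage. The resulting buffer would still have to be wrapped in an image before being handed to the scanner.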
