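Real-time text recognition (OCR)
Is it possible to run OCR on the iPhone's live camera feed without capturing a photo? The alphanumeric text follows a predictable, sometimes fixed, pattern (such as a serial number).
I have tried OpenCV and Tesseract, but I couldn't figure out how to do any image processing on the live camera feed.
I just don't know how to recognize the text I'm expecting! Are there any other libraries I could use for this part?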
You can do this with TesseractOCR and an AVCaptureSession.
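    #import <AVFoundation/AVFoundation.h>
    #import <UIKit/UIKit.h>

    @interface YourClass () <AVCaptureVideoDataOutputSampleBufferDelegate>
    {
        BOOL canScanFrame; // set by the timer when the next frame may be scanned
        BOOL isScanning;   // YES while Tesseract is still processing a frame
    }
    @property (strong, nonatomic) NSTimer *timer;
    @end

    @implementation YourClass
    //...
    - (void)prepareToScan
    {
        //Prepare capture session, preview layer and so on
        //...
        // Allow at most one scan attempt every 0.5 seconds
        self.timer = [NSTimer scheduledTimerWithTimeInterval:0.5 target:self selector:@selector(timerTicked) userInfo:nil repeats:YES];
    }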

    // Called for every video frame; most frames are skipped
    - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
    {
        if (canScanFrame) {
            canScanFrame = NO;
            CGImageRef imageRef = [self imageFromSampleBuffer:sampleBuffer];
            UIImage *image = [UIImage imageWithCGImage:imageRef scale:1 orientation:UIImageOrientationRight];
            CGImageRelease(imageRef); // the UIImage keeps its own reference
            [self.scanner setImage:image];
            isScanning = YES;
            // Run recognition off the capture queue so frames keep flowing
            dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
                NSLog(@"scan start");
                [self.scanner recognize];
                NSLog(@"scan stop");
                dispatch_async(dispatch_get_main_queue(), ^{
                    isScanning = NO;
                    NSString *text = [self.scanner recognizedText];
                    //do something with text
                });
            });
        }
    }

    // Create a CGImageRef from sample buffer data. The caller must release the returned image.
    // Assumes the video output is configured for kCVPixelFormatType_32BGRA.
    - (CGImageRef)imageFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
    {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        CVPixelBufferLockBaseAddress(imageBuffer, 0); // Lock the image buffer
        // BGRA buffers are non-planar, so read the buffer's base address rather than a plane's
        uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
        size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
        size_t width = CVPixelBufferGetWidth(imageBuffer);
        size_t height = CVPixelBufferGetHeight(imageBuffer);
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
        CGContextRef newContext = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
        CGImageRef newImage = CGBitmapContextCreateImage(newContext); // copies the pixel data
        CGContextRelease(newContext);
        CGColorSpaceRelease(colorSpace);
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
        return newImage;
    }

    - (void)timerTicked
    {
        if (!isScanning) {
            canScanFrame = YES;
        }
    }
    @end
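
Since the question says the text follows a predictable pattern, the "do something with text" step could discard OCR results that do not fit it. Below is a minimal sketch in plain C (which compiles inside an Objective-C file). The function name `matches_serial_pattern` and its tiny pattern syntax are assumptions made up for illustration, not part of Tesseract: `A` stands for one uppercase letter, `9` for one digit, and any other character must match literally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true only when `text` matches `pattern`
   from the first character to the last.
   Pattern syntax (an assumption for this sketch):
     'A' = one uppercase ASCII letter, '9' = one ASCII digit,
     any other character = that exact character. */
static bool matches_serial_pattern(const char *text, const char *pattern)
{
    assert(pattern != NULL);
    if (text == NULL) {
        return false;
    }
    while (*pattern != '\0') {
        char c = *text;
        if (c == '\0') {
            return false; /* text is shorter than the pattern */
        }
        switch (*pattern) {
        case 'A':
            if (c < 'A' || c > 'Z') return false;
            break;
        case '9':
            if (c < '0' || c > '9') return false;
            break;
        default:
            if (c != *pattern) return false;
            break;
        }
        ++text;
        ++pattern;
    }
    return *text == '\0'; /* reject trailing characters */
}
```

In the main-queue block you could trim whitespace from `text` and call `matches_serial_pattern(trimmed.UTF8String, "AA-999999")`, keeping the result only when it returns true. The pattern `"AA-999999"` is just an example; substitute your own serial number format.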
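Thanks for your answer! It works. Do you have any suggestions for minimizing CPU usage and improving accuracy? Would you recommend cropping the image to a specific rectangle before sending it to Tesseract, or using 'tesseract.rect'? – Danialzo
Yes, that will help. Also try converting the image to grayscale before sending it to Tesseract; this will improve recognition accuracy. – arturdev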
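
The cropping and grayscale suggestions can be combined in one pass over the pixel data. The following is a rough sketch in plain C, assuming the frame is 32-bit BGRA as in the answer's code; `crop_to_gray` is a hypothetical helper name, and the integer weights approximate the common Rec. 601 luma formula.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies the w x h rectangle at (x, y) of a BGRA image
   into `dst` as tightly packed 8-bit grayscale (dst must hold w * h bytes).
   Returns 0 on success, -1 if the rectangle does not fit inside the image. */
static int crop_to_gray(const uint8_t *src, size_t src_width, size_t src_height,
                        size_t src_bytes_per_row,
                        size_t x, size_t y, size_t w, size_t h, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);
    if (x > src_width || w > src_width - x ||
        y > src_height || h > src_height - y) {
        return -1;
    }
    for (size_t row = 0; row < h; ++row) {
        /* Rows may be padded, so step by bytes-per-row, not width * 4 */
        const uint8_t *p = src + (y + row) * src_bytes_per_row + x * 4;
        uint8_t *out = dst + row * w;
        for (size_t col = 0; col < w; ++col, p += 4) {
            /* BGRA byte order: p[0] = blue, p[1] = green, p[2] = red.
               Weights 77/150/29 sum to 256, so white stays at 255. */
            out[col] = (uint8_t)((77u * p[2] + 150u * p[1] + 29u * p[0]) >> 8);
        }
    }
    return 0;
}
```

One way to use it would be to call it between the lock and unlock calls in `imageFromSampleBuffer:` with `baseAddress`, `width`, `height` and `bytesPerRow`, then build the image from the smaller buffer with `CGColorSpaceCreateDeviceGray()` and `kCGImageAlphaNone`. Scanning fewer pixels should also reduce CPU load. If your Tesseract wrapper exposes a `rect` property, as the comment above suggests, setting it is an alternative to cropping by hand.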