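Use multithreading, but make sure you create a separate TessBaseAPI instance for each thread. Do not share instances between threads. Create N threads (N >= number of cores); the JVM can then schedule them across all cores, so throughput can scale up to roughly the number of cores.
What I do is create N threads, each of which creates a TessBaseAPI object in its own context (the run method) and then waits in a loop for OCR requests until it is interrupted.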
...
...
    @Override
    public void run() {
        // Each thread owns its TessBaseAPI instance; it never leaves this method.
        TessBaseAPI tessBaseApi = new TessBaseAPI();
        tessBaseApi.init(Ocrrrer.DATA_PATH, "eng");
        // Skip loading the dictionaries (DAWGs).
        setTessVariable(tessBaseApi, "load_system_dawg", "0");
        setTessVariable(tessBaseApi, "load_freq_dawg", "0");
        setTessVariable(tessBaseApi, "load_unambig_dawg", "0");
        setTessVariable(tessBaseApi, "load_punc_dawg", "0");
        setTessVariable(tessBaseApi, "load_number_dawg", "0");
        setTessVariable(tessBaseApi, "load_fixed_length_dawgs", "0");
        setTessVariable(tessBaseApi, "load_bigram_dawg", "0");
        setTessVariable(tessBaseApi, "wordrec_enable_assoc", "0");
        setTessVariable(tessBaseApi, "tessedit_enable_bigram_correction", "0");
        setTessVariable(tessBaseApi, "assume_fixed_pitch_char_segment", "1");
        // Restrict recognition to digits, uppercase letters and '<'.
        setTessVariable(tessBaseApi, TessBaseAPI.VAR_CHAR_WHITELIST, "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ<");
        Log.d(TAG, "Training file loaded");
        // Serve OCR requests until this thread is interrupted.
        while (!interrupted()) {
            reentrantLock.lock();
            try {
                Log.d(TAG, this.getName() + " wait for OCR");
                jobToDo.await();
                Log.d(TAG, this.getName() + " input arrived. Do OCR");
                this.ocrResult = doOcr(tessBaseApi);
                ocrDone.signalAll();
            } catch (InterruptedException e) {
                // Interrupted while waiting for a job: stop the worker.
                return;
            } finally {
                // await() reacquires the lock before throwing, so it is always held here.
                reentrantLock.unlock();
            }
        }
    }
...
...
As you can see, the tessBaseApi object is local to the run method, so it is never shared.
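For comparison, here is a minimal sketch of the same "one engine per thread" rule built on a fixed thread pool and a ThreadLocal instead of hand-written lock and condition handling. It is not the code I actually use. FakeOcrEngine, PerThreadOcrPool and recognizeAll are hypothetical names, and FakeOcrEngine is only a stand-in for TessBaseAPI so that the sketch runs without the native library. In a real app the ThreadLocal initializer would create and init a TessBaseAPI, and you would still have to release it when the worker is shut down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a non-thread-safe engine such as TessBaseAPI. */
final class FakeOcrEngine {
    // Every thread that has ever called this instance (for inspection only).
    final Set<Thread> callers = ConcurrentHashMap.newKeySet();

    String recognize(String image) {
        callers.add(Thread.currentThread());
        return "text:" + image; // placeholder for the real OCR work
    }
}

/** Hypothetical helper: runs OCR jobs on a pool where each thread has its own engine. */
final class PerThreadOcrPool {
    // All engines created so far, kept only so the sketch can be checked.
    static final Set<FakeOcrEngine> ENGINES = ConcurrentHashMap.newKeySet();

    // Each worker thread lazily creates its own engine on first use.
    private static final ThreadLocal<FakeOcrEngine> ENGINE = ThreadLocal.withInitial(() -> {
        FakeOcrEngine engine = new FakeOcrEngine();
        ENGINES.add(engine);
        return engine;
    });

    /** Submits one job per image and returns the results in input order. */
    static List<String> recognizeAll(List<String> images, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String image : images) {
                // ENGINE.get() runs on the worker thread, so it returns that thread's engine.
                futures.add(pool.submit(() -> ENGINE.get().recognize(image)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for OCR", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("OCR job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A reasonable starting point for the thread count is Runtime.getRuntime().availableProcessors(), for example PerThreadOcrPool.recognizeAll(images, Runtime.getRuntime().availableProcessors()). Whether that actually gives a speed-up close to the number of cores depends on the device and on the images, so measure it.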
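Are you talking about speed or recognition accuracy? – rmtheis
I am concerned about speed; it is very slow. –
Hey @QuiLlHoN, did you ever find a solution to the slow performance? I am running into the same problem :/ – Vucko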