I'm attempting to use Tesseract-OCR-iOS in a new Swift 3.0 project. I'm using Xcode Version 8.1 (8B62). CocoaPods is version 1.1.1.
When I attempt to use tesseract.recognize(), my app crashes and I get the following output in the console:
actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53
I found this post, which makes it sound like I'm using the wrong version of traineddata. I downloaded tessdata from the tesseract-ocr/tessdata repo, so I'm baffled as to why I'd have a version mismatch.
Any suggestions on how to get Tesseract working would be greatly appreciated. Below is additional information about my setup.
Here's what my Podfile looks like:
# Uncomment the next line to define a global platform for your project
platform :ios, '9.0'
target 'TesseractDemo' do
  # Comment the next line if you're not using Swift and don't want to use dynamic frameworks
  use_frameworks!
  # Pods for TesseractDemo
  pod 'TesseractOCRiOS', '4.0.0'
end
I've dragged a tessdata folder containing eng.traineddata into the root directory of my project outside of Xcode and dragged a reference from Finder to Xcode's Project Navigator.
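Before suspecting the file's contents, it can help to confirm that the file actually ends up in the app bundle under a tessdata directory. The following is a minimal sketch using only Foundation; the helper name trainedDataPath is hypothetical, and it assumes the folder was added as a folder reference so that it is copied into the bundle as a directory named tessdata.

```swift
import Foundation

/// Hypothetical helper: returns the path to `<language>.traineddata`
/// inside a `tessdata` directory of the given bundle, or nil if it is missing.
func trainedDataPath(for language: String, in bundle: Bundle = .main) -> String? {
    // Assumption: the folder is named exactly "tessdata" and sits at the bundle root.
    return bundle.path(forResource: language,
                       ofType: "traineddata",
                       inDirectory: "tessdata")
}

// Usage: log a hint at launch instead of discovering the problem at recognition time.
if trainedDataPath(for: "eng") == nil {
    print("eng.traineddata was not found in the bundle's tessdata folder")
}
```

A nil result points to a project setup problem (for example, the folder was added as a group rather than a folder reference) rather than a version mismatch in the file itself.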
Everything works fine up to this point. No compiler errors, linker whining, etc. In a UIViewController I'm importing TesseractOCR and calling it like so:
// MARK: - OCR Methods
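func scanImage(image: UIImage) {
    // The initializer is failable, so unwrap it before use.
    if let tesseract = G8Tesseract(language: "eng") {
        tesseract.delegate = self
        // Use the image passed in as the parameter, converted to black and white.
        tesseract.image = image.g8_blackAndWhite()
        // The crash described above happens on this call.
        tesseract.recognize()
        textView.text = tesseract.recognizedText
    }
}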
Update: I found a link to a repo of traineddata files for version 4.0. I nuked my old eng.traineddata file and replaced it with the one from the 4.0 repo. I get the same error referencing the same line.
The current version of eng.traineddata linked above on GitHub will not work with the current version of Tesseract-OCR-iOS.
The installation instructions posted on GitHub work perfectly if you've got the right <language>.traineddata file.
I discovered this after dragging the eng.traineddata from Lyndsey Scott's brilliant Tesseract tutorial on Ray Wenderlich.
This repo contains the eng.traineddata file I needed to get Tesseract working. I'm not sure if that applies to all languages.
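The assertion in the console output compares an entry count read from the traineddata file against a limit compiled into the library, so one rough way to compare a working file with a failing one is to look at the count each declares. The sketch below assumes the file begins with a 32-bit little-endian entry count, which is what the assertion suggests but is not confirmed here for every Tesseract version. The function name declaredEntryCount is hypothetical and not part of any Tesseract API.

```swift
import Foundation

/// Hypothetical helper: reads the first four bytes of a traineddata file
/// as a little-endian 32-bit integer. Returns nil if the file is missing
/// or shorter than four bytes.
func declaredEntryCount(atPath path: String) -> Int32? {
    guard let data = FileManager.default.contents(atPath: path),
          data.count >= 4 else {
        return nil
    }
    // Assemble the value byte by byte to avoid alignment and endianness surprises.
    var value: UInt32 = 0
    for (index, byte) in data.prefix(4).enumerated() {
        value |= UInt32(byte) << UInt32(8 * index)
    }
    return Int32(bitPattern: value)
}
```

If the file that crashes declares a larger count than the file that works, that is consistent with the file having been produced for a newer Tesseract than the one bundled in the pod.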