From my research, the only solid free OCR options seem to be Tesseract and CuneiForm.
Now, the Tesseract docs are plain horrible. All they give you is a bunch of Visual Studio code (I'm on Windows), and from there you are on your own in an ocean of their API. All you can do is build the exe and run it on a TIFF image.
I was expecting at least some short documentation that shows how to call their API to run OCR on a small example, but no, there's nothing like that in their docs.
CuneiForm: I downloaded it and "great" everything is in Russian. :(
Is it really that hard for those guys to put together a small example? Instead they supply us with a bunch of advanced info that probably 90% of people will never reach. How can you get there without starting on small things, none of which they explain?
So I have a bunch of API functions, but how the hell am I supposed to use them if they're explained nowhere? Maybe someone can offer me advice and a solution? I'm not asking for a miracle, just something small to show me how things work.
Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License.
Tesseract is a free and open source command-line OCR engine that was developed at Hewlett-Packard in the mid-1980s and has been maintained by Google since 2006. It is well documented. Tesseract is written in C/C++.
Tesseract performs well on high-resolution images. Preprocessing such as Otsu binarization and morphological operations like dilation and erosion can help improve pytesseract's results. EasyOCR is a lightweight model that gives good performance for receipt or PDF conversion.
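As a rough illustration of the binarization step mentioned above, here is a minimal sketch of Otsu's method in plain C++ with no image library. The function names OtsuThreshold and Binarize are made up for this example, and it assumes the image is already an 8-bit grayscale buffer; in practice an image library would normally do this for you.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the Otsu threshold for 8-bit grayscale pixels.
// Pixels with value <= threshold are treated as one class, the rest as the other.
int OtsuThreshold(const std::vector<std::uint8_t>& pixels) {
    std::array<std::size_t, 256> histogram{};
    for (std::uint8_t p : pixels) ++histogram[p];

    const double total = static_cast<double>(pixels.size());
    double sum_all = 0.0;
    for (int v = 0; v < 256; ++v) sum_all += v * static_cast<double>(histogram[v]);

    double weight_low = 0.0;   // number of pixels at or below the candidate threshold
    double sum_low = 0.0;      // sum of their intensities
    double best_variance = -1.0;
    int best_threshold = 0;

    for (int t = 0; t < 256; ++t) {
        weight_low += static_cast<double>(histogram[t]);
        sum_low += t * static_cast<double>(histogram[t]);
        if (weight_low == 0.0) continue;         // no pixels in the low class yet
        const double weight_high = total - weight_low;
        if (weight_high == 0.0) break;           // every pixel is in the low class

        const double mean_low = sum_low / weight_low;
        const double mean_high = (sum_all - sum_low) / weight_high;
        const double diff = mean_low - mean_high;
        // Between-class variance (up to a constant factor); Otsu picks its maximum.
        const double between = weight_low * weight_high * diff * diff;
        if (between > best_variance) {
            best_variance = between;
            best_threshold = t;
        }
    }
    return best_threshold;
}

// Hypothetical helper: maps every pixel to 0 or 255 using the Otsu threshold.
std::vector<std::uint8_t> Binarize(const std::vector<std::uint8_t>& pixels) {
    const int threshold = OtsuThreshold(pixels);
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) out.push_back(p > threshold ? 255 : 0);
    return out;
}
```

The binarized buffer could then be handed to the OCR engine in place of the original grayscale image.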
You might have given up, but there may be others who are still trying. So here is what you need to get started with Tesseract:
First of all, you should read all the documentation about Tesseract. You may find something useful in the wiki.
To start using the API (v3.0.1, currently in trunk; also read the README and ChangeLog from trunk), you should check out baseapi.h. The documentation on how to use the API is right there, in a comment above each function.
For starters:
- Include baseapi.h and construct a TessBaseAPI object.
- Init()
- SetVariable(). You can see all the params and their values if you print them to a file using PrintVariables().
- SetPageSegMode(). Tell Tesseract what the image you are about to OCR represents: a block or line of text, a word, or a character.
- SetImage()
- GetUTF8Text()
(Again, that is just for starters.)
You can check the Tesseract community for already answered questions or ask your own here.