I have installed tesseract in Google Colab using the command
!pip install tesseract
But when I run the command
text = pytesseract.image_to_string(Image.open('cropped_img.png'))
I get the below error:
TesseractNotFoundError: tesseract is not installed or it's not in your path
This is a proven build sequence:
cd tesseract
./autogen.sh
mkdir -p bin/release
cd bin/release
../../configure --disable-openmp --disable-shared 'CXXFLAGS=-g -O2 -fno-math-errno -Wall -Wextra -Wpedantic'
# Build tesseract and training tools. Run `make` if you don't need the training tools.
make training
cd ../..
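Point pytesseract at the tesseract executable itself (not at the pytesseract script), for example:
pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'
Use whatever path !which tesseract prints. /usr/bin/tesseract is typical after an apt install; a source build installed with the default prefix usually ends up in /usr/local/bin/tesseract.
This should solve the TesseractNotFoundError, provided the tesseract binary is actually installed.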
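If you would rather not hard-code the path, here is a minimal sketch that looks the binary up instead. The helper names find_tesseract and configure_pytesseract and the fallback candidate paths are my own assumptions, not part of pytesseract; only the tesseract_cmd attribute comes from the library.

```python
import shutil


def find_tesseract(candidates=("/usr/bin/tesseract", "/usr/local/bin/tesseract")):
    """Return the path of the tesseract executable, or None if it is not found.

    The candidate paths are assumptions about common install locations.
    """
    # First ask the PATH, the same place pytesseract looks by default.
    found = shutil.which("tesseract")
    if found:
        return found
    # Fall back to a few explicit locations in case PATH is incomplete.
    for path in candidates:
        if shutil.which(path):
            return path
    return None


def configure_pytesseract():
    """Set pytesseract.pytesseract.tesseract_cmd to the binary that was found."""
    import pytesseract  # imported lazily so find_tesseract works without it

    path = find_tesseract()
    if path is None:
        raise RuntimeError("tesseract binary not found; install tesseract-ocr first")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Call configure_pytesseract() once before the first image_to_string call. If it raises, the engine is missing and no path setting will help.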
There could be a number of reasons for this, but normally it is because the Tesseract OCR engine itself is not installed. pytesseract is only a Python wrapper that calls the tesseract executable, so installing it is just half of the solution.
You need to install both the tesseract package for Linux and the Python binding.
The system packages are installed with:
! apt install tesseract-ocr
! apt install libtesseract-dev
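The above installs the Tesseract engine and its development library, which pytesseract depends on. The leading ! is important: it tells the notebook to run the line as a shell command on the underlying virtual machine rather than as Python, and without it nothing gets installed on the operating system.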
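To confirm the apt step worked before going any further, something like the following can be run in a cell. It is only a sketch: tesseract_version is a hypothetical helper name, and it assumes the binary is reachable through PATH.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line printed by `tesseract --version`, or None."""
    exe = shutil.which("tesseract")
    if exe is None:
        # Nothing on PATH: the engine is not installed (or PATH is wrong).
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    # Some releases print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


print(tesseract_version())
```

If this prints None, pytesseract will raise TesseractNotFoundError no matter which Python packages are installed.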
The remainder of the process is relatively simple:
! pip install Pillow
! pip install pytesseract
This installs the Python binding (Pillow is needed to load the images).
After that, all you need to do is import:
import pytesseract
from PIL import ImageEnhance, ImageFilter, Image
Then you can let the magic happen.
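For completeness, a rough sketch of the OCR call itself once everything is installed. The function name ocr_image is mine, and the grayscale conversion is an optional preprocessing step I added, not something pytesseract requires; 'cropped_img.png' is the file name from the question.

```python
def ocr_image(path, lang="eng"):
    """Run Tesseract on the image at `path` and return the recognised text."""
    # Imported here so the function can be defined before the packages exist.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        # Optional: grayscale input sometimes gives cleaner results.
        gray = img.convert("L")
        return pytesseract.image_to_string(gray, lang=lang)


# Example use in a notebook cell (assumes the file exists):
# text = ocr_image("cropped_img.png")
# print(text)
```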
Hopefully this helps someone.