Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

detect an initial/a sketch drawing on a text page

I would like to get the coordinates of the box around the initial ("H") on the following page (and similar ones with other initials, so opencv template matching is not an option):

enter image description here

Following this tutorial, I tried to solve the problem with opencv contours:

import cv2
import matplotlib.pyplot as plt

page = "image.jpg"

# read the image
image = cv2.imread(page)

# convert to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# create a binary thresholded image
_, binary = cv2.threshold(gray, 0,150,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# draw all contours
image = cv2.drawContours(image, contours, 3, (0, 255, 0), 2)
plt.savefig("result.png")

The result is of course not exactly what I wanted:

enter image description here

Does anyone know of an viable algorithm (and possibly an implementation thereof) that could provide an easy solution to my task?

like image 594
Alex W. Avatar asked Sep 06 '25 03:09

Alex W.


1 Answers

You can find the target area by filtering your contours. Now, there's at least two filtering criteria that you can use. One is filter by area - that is, discard too small and too large contours until you get the contour you are looking for. The other one is by computing the extent of every contour. The extent is the ratio of the contour's area to its bounding rectangle area. You are looking for a square-like contour, so its extent should be close to 1.0.

Let's see the code:

# imports:
import cv2
import numpy as np

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

The first portion of the code gets you a binary image that you can use as a mask to compute contours:

Now, let's filter contours. Let's use the area approach first. You need to define a range of minimum area and maximum area to filter everything that does not fall in this range. I've heuristically determined a range of areas from 30000 px to 150000 px:

# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)
    print("Contour Area: "+str(currentArea))

    # Set an area range:
    minArea = 30000
    maxArea = 150000

    if minArea < currentArea < maxArea:

        # Get the contour's bounding rectangle:
        boundRect = cv2.boundingRect(c)

        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

        cv2.imshow("Rectangles", inputImageCopy)
        cv2.waitKey(0)

Once you successfully filter the area, you can then compute the bounding rectangle of the contour with cv2.boundingRect. You can retrieve the bounding rectangle's x, y (top left) coordinates as well as its width and height. After that just draw the rectangle on a deep copy of the original input.

Now, let's see the second option, using the contour's extent. The for loop gets modified as follows:

# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)

    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(c)

    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Calculate extent:
    extent = float(currentArea)/(rectWidth *rectHeight)
    print("Extent: " + str(extent))

    # Set the extent filter, look for an extent close to 1.0:
    delta = abs(1.0 - extent)
    epsilon = 0.1

    if delta < epsilon:

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

        cv2.imshow("Rectangles", inputImageCopy)
        cv2.waitKey(0)

Both approaches yield this result:

like image 197
stateMachine Avatar answered Sep 07 '25 15:09

stateMachine