detect an initial/a sketch drawing on a text page

Question

I would like to get the coordinates of the box around the initial ("H") on the following page (and similar ones with other initials, so opencv template matching is not an option):

enter image description here

Following this tutorial, I tried to solve the problem with opencv contours:

import cv2
import matplotlib.pyplot as plt

page = "image.jpg"

# read the image
image = cv2.imread(page)

# convert to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# create a binary thresholded image
_, binary = cv2.threshold(gray, 0, 150, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
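# draw all contours (index -1 selects every contour)
image = cv2.drawContours(image, contours, -1, (0, 255, 0), 2)
# put the image on the current figure before saving it
plt.imshow(image)
plt.savefig("result.png")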

The result is of course not exactly what I wanted:

enter image description here

Does anyone know of a viable algorithm (and possibly an implementation thereof) that could provide an easy solution to my task?

stateMachine · Accepted Answer

You can find the target area by filtering your contours. There are at least two filtering criteria that you can use. One is to filter by area: discard contours that are too small or too large until only the contour you are looking for remains. The other is to compute the extent of every contour. The extent is the ratio of the contour's area to the area of its bounding rectangle. You are looking for a box-like (rectangular) contour, so its extent should be close to 1.0.

Let's see the code:

# imports:
import cv2
import numpy as np

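# Placeholder location of the input image (adjust to your setup):
path = "./"
fileName = "image.jpg"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)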
# Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to grayscale (cv2.imread returns BGR):
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

The first portion of the code gets you a binary image that you can use as a mask to compute contours. Because the threshold is inverted (THRESH_BINARY_INV), the dark ink becomes the white foreground.

Now, let's filter contours. Let's use the area approach first. You need to define a minimum and a maximum area and discard every contour that falls outside this range. I've heuristically determined a range of areas from 30000 px to 150000 px; these values depend on the resolution of your image:

# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Look for the outer bounding boxes (no children):
for c in contours:

    # Get blob area:
    currentArea = cv2.contourArea(c)
    print("Contour Area: "+str(currentArea))

    # Set an area range:
    minArea = 30000
    maxArea = 150000

    if minArea < currentArea < maxArea:

        # Get the contour's bounding rectangle:
        boundRect = cv2.boundingRect(c)

        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

        cv2.imshow("Rectangles", inputImageCopy)
        cv2.waitKey(0)

Once a contour passes the area filter, you can compute its bounding rectangle with cv2.boundingRect. This gives you the rectangle's x, y (top left) coordinates as well as its width and height. After that, just draw the rectangle on a deep copy of the original input.

Now, let's see the second option, using the contour's extent. The for loop gets modified as follows:

# Look for the outer bounding boxes (no children):
for c in contours:

    # Get blob area:
    currentArea = cv2.contourArea(c)

    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(c)

    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Calculate extent:
    extent = float(currentArea) / (rectWidth * rectHeight)
    print("Extent: " + str(extent))

    # Set the extent filter, look for an extent close to 1.0:
    delta = abs(1.0 - extent)
    epsilon = 0.1

    if delta < epsilon:

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

        cv2.imshow("Rectangles", inputImageCopy)
        cv2.waitKey(0)

Both approaches yield the same result: the bounding rectangle drawn in red around the box that contains the initial.

Tags:

python

opencv

computer-vision

Alex W.

1 Answers

stateMachine
