
Crop and convert polygons to grayscale

I have a text detector which outputs polygon coordinates of detected text:

sample detected text

I am using the loop below to show what the detected text looks like with bounding boxes:

for i in range(0, num_box):
    pts = np.array(boxes[0][i],np.int32)
    pts = pts.reshape((-1,1,2))
    print(pts)
    print('\n')
    img2 = cv2.polylines(img,[pts],True,(0,255,0),2)
return img2

Each pts stores all coordinates of a polygon, for one text box detection:

pts = 

[[[509 457]]

 [[555 457]]

 [[555 475]]

 [[509 475]]]

I would like to convert the area inside the bounding box described by pts to grayscale using:

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

However, I am not sure how to provide the image argument in the call above, since I want to convert only the area described by pts to grayscale and not the entire image (img2). The rest of the image should be white.

asked Dec 11 '25 02:12 by Ajinkya


1 Answer

As I understand it, you want to convert the content of the bounding box to grayscale and set the rest of the image to white (background).

Here is my solution:

import cv2
import numpy as np

# Some input image
image = cv2.imread('path/to/your/image.png')

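# Some pts (int32, as expected by cv2.polylines)
pts = np.array([[60, 40], [340, 40], [340, 120], [60, 120]], np.int32)

# Get extreme x, y coordinates from box, independent of the point order
# (reshape also accepts the (-1, 1, 2) layout used in the question)
x1, y1 = pts.reshape(-1, 2).min(axis=0)
x2, y2 = pts.reshape(-1, 2).max(axis=0)

# Build output; initialize white background
image2 = np.full_like(image, 255)

# Convert only the ROI: BGR -> grayscale -> BGR, so it fits the 3-channel output
roi_gray = cv2.cvtColor(image[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY)
image2[y1:y2, x1:x2] = cv2.cvtColor(roi_gray, cv2.COLOR_GRAY2BGR)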

# Show bounding box in original image
cv2.polylines(image, [pts], True, (0, 255, 0), 2)

cv2.imshow('image', image)
cv2.imshow('image2', image2)
cv2.waitKey(0)
cv2.destroyAllWindows()

The main "trick" is to call OpenCV's cvtColor function twice on just the region of interest (ROI) of the image: first to convert BGR to grayscale, then to convert grayscale back to BGR so that the result has three channels again and fits into the output image. In the Python API, OpenCV images are NumPy arrays, so a rectangular ROI is accessed by plain NumPy slicing (rows first, i.e. image[y1:y2, x1:x2]). Most OpenCV functions in the Python API accept such slices directly.

EDIT: If your final image is a plain grayscale image, the backwards conversion of course can be omitted!

These are some outputs I generated with my "standard image":

Output 1

Output 2

Hope that helps!

answered Dec 13 '25 15:12 by HansHirse


