
Finding the coordinates of the corners of a mask (rectangular shape) from the mask matrix (boolean matrix) in Mask-RCNN?

In my project, I am trying to detect shop signs in my dataset using Mask-RCNN. The images are 512x512.

results = model.detect([image], verbose=1)
r = results[0]
masked = r['masks']
rois = r['rois']

After I run the code above, 'rois' gives me the bounding-box coordinates of each shop sign (e.g. [40, 52, 79, 249]). r['masks'] gives me a boolean matrix that represents each mask in the image. A pixel value in the mask matrix is True if the pixel is inside the mask region and False if it is outside. If the model detects 7 shop signs (i.e. 7 masks) in the image, the shape of r['masks'] is 512x512x7. Each channel represents a different mask.

I have to deal with each mask individually, so I separated the channels. Taking the first one as an example, I found the coordinates of the True pixels in the mask array.

array = masked[:,:,0]

true_points = []
for i in range(512):
    for j in range(512):
        if array[i][j] == True:
            true_points.append([j, i])

So, my question is: how can I get the coordinates of the corners of the mask (i.e. the shop sign) from this boolean matrix? Most of the shop signs are rectangular, but they can be rotated. I have the bounding-box coordinates, but they are not accurate when the shop sign is rotated. I also have the coordinates of the True points. Can you suggest an algorithm to find the corner True values?

Enes Berk Karahançer asked Oct 20 '25 18:10


2 Answers

A perspective transform can be used for this problem:

  1. Find the corner points of the shop signs from the detected masks. (src points)
  2. Choose the corner points of the desired rectangular box. (dst points)
  3. Generate a new image with cv2.getPerspectiveTransform and cv2.warpPerspective.

For corner point detection, we can use cv2.findContours and cv2.approxPolyDP.

cv2.findContours finds contours in a binary image.

contours, _ = cv2.findContours(r['masks'][:,:,0].astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Then approximate the rectangle from the contours using cv2.approxPolyDP.

The crucial parameter in cv2.approxPolyDP is epsilon, which controls the approximation accuracy. The function below searches for an epsilon that yields exactly four points:

def Contour2Quadrangle(contour):
    def getApprox(contour, alpha):
        epsilon = alpha * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        return approx

    # find appropriate epsilon
    def getQuadrangle(contour):
        alpha = 0.1
        beta = 2 # larger than 1
        approx = getApprox(contour, alpha)
        if len(approx) < 4:
            while len(approx) < 4:
                alpha = alpha / beta
                approx = getApprox(contour, alpha)  
            alpha_lower = alpha
            alpha_upper = alpha * beta
        elif len(approx) > 4:
            while len(approx) > 4:
                alpha = alpha * beta
                approx = getApprox(contour, alpha)  
            alpha_lower = alpha / beta
            alpha_upper = alpha
        if len(approx) == 4:
            return approx
        alpha_middle = (alpha_lower * alpha_upper ) ** 0.5
        approx_middle = getApprox(contour, alpha_middle)
        while len(approx_middle) != 4:
            if len(approx_middle) < 4:
                alpha_upper = alpha_middle
                approx_upper = approx_middle
            if len(approx_middle) > 4:
                alpha_lower = alpha_middle
                approx_lower = approx_middle
            alpha_middle = ( alpha_lower * alpha_upper ) ** 0.5
            approx_middle = getApprox(contour, alpha_middle)
        return approx_middle

    def getQuadrangleWithRegularOrder(contour):
        approx = getQuadrangle(contour)
        hashable_approx = [tuple(a[0]) for a in approx]
        sorted_by_axis0 = sorted(hashable_approx, key=lambda x: x[0])
        sorted_by_axis1 = sorted(hashable_approx, key=lambda x: x[1])
        topleft_set = set(sorted_by_axis0[:2]) & set(sorted_by_axis1[:2])
        assert len(topleft_set) == 1
        topleft = topleft_set.pop()
        topleft_idx = hashable_approx.index(topleft)
        approx_with_regular_order = [ approx[(topleft_idx + i) % 4] for i in range(4) ]
        return approx_with_regular_order

    return getQuadrangleWithRegularOrder(contour)
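Note that getQuadrangleWithRegularOrder asserts that exactly one point is among both the two leftmost and the two topmost points, which may not hold for every quadrilateral. As a hedged alternative (not the method used in this answer), a common heuristic orders the corners by the sum and difference of their coordinates. The helper name order_corners is hypothetical, and the heuristic itself can give ambiguous results for shapes rotated close to 45 degrees.

```python
import numpy as np

def order_corners(pts):
    """Return four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Heuristic: the top-left corner has the smallest x + y, the bottom-right the
    largest x + y, the top-right the smallest y - x, the bottom-left the largest y - x.
    """
    # Accepts shape (4, 2) or the (4, 1, 2) layout returned by cv2.approxPolyDP
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y
    d = pts[:, 1] - pts[:, 0]    # y - x
    return np.array(
        [pts[np.argmin(s)], pts[np.argmin(d)], pts[np.argmax(s)], pts[np.argmax(d)]],
        dtype=np.float32,
    )

# Slightly tilted rectangle, corners given in arbitrary order
ordered = order_corners([[130, 60], [20, 30], [30, 80], [120, 10]])
print(ordered)
```

The returned order is clockwise starting at the top-left, so the destination points passed to cv2.getPerspectiveTransform would have to be listed in that same order.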

Lastly, we generate a new image with our desired destination coordinates.

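import matplotlib.pyplot as plt

# Output size; 400x200 is inferred from the destination points, adjust as needed
rect_img_w, rect_img_h = 400, 200

contour = max(contours, key=cv2.contourArea)
corner_points = Contour2Quadrangle(contour)
src = np.float32([p[0] for p in corner_points])
# dst must list its corners in the same order as src; this assumes src runs
# top-left, bottom-left, bottom-right, top-right
dst = np.float32([[0, 0], [0, rect_img_h], [rect_img_w, rect_img_h], [rect_img_w, 0]])

M = cv2.getPerspectiveTransform(src, dst)
# img is the original image that was passed to model.detect
transformed = cv2.warpPerspective(img, M, (rect_img_w, rect_img_h))
plt.imshow(transformed) # check the results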

Muhammad Haseeb Khan answered Oct 23 '25 06:10


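If you know the rotation angle, just rotate the bbox corners, e.g. by applying a rotation matrix from cv2.getRotationMatrix2D to the corner points with cv2.transform (cv2.warpAffine warps whole images rather than point lists). If you don't, you can find the extrema more or less easily like this: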

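H, W = array.shape
# First True column in each row; W+1 is a sentinel for rows with no True pixels
left_edges = np.where(array.any(axis=1), array.argmax(axis=1), W+1)
# Horizontal flip via slicing; cv2.flip may not accept boolean arrays
flip_lr = array[:, ::-1]
right_edges = W - np.where(flip_lr.any(axis=1), flip_lr.argmax(axis=1), W+1)
# First True row in each column; H+1 is a sentinel for empty columns
top_edges = np.where(array.any(axis=0), array.argmax(axis=0), H+1)
flip_ud = array[::-1, :]  # vertical flip
bottom_edges = H - np.where(flip_ud.any(axis=0), flip_ud.argmax(axis=0), H+1)
leftmost = left_edges.min()
rightmost = right_edges.max()    # one past the last True column
topmost = top_edges.min()
bottommost = bottom_edges.max()  # one past the last True row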

Your bbox has corners (leftmost, topmost) and (rightmost, bottommost). BTW, if you find yourself looping over pixels, you should know there's almost always a NumPy vectorized operation that will do it a lot faster.
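As a sketch of that last point, the nested loop from the question can be replaced with np.argwhere. The mask below is synthetic, made up to mirror the example box [40, 52, 79, 249] under the assumption that it is in (y1, x1, y2, x2) order; the extremes computed here are inclusive pixel indices.

```python
import numpy as np

# Synthetic stand-in for masked[:, :, 0] (hypothetical data)
mask = np.zeros((512, 512), dtype=bool)
mask[40:80, 52:250] = True

# np.argwhere returns (row, col) pairs in row-major order, the same order
# the nested loop visits them; reversing the columns gives (x, y) pairs.
true_points = np.argwhere(mask)[:, ::-1]

xs, ys = true_points[:, 0], true_points[:, 1]
leftmost, rightmost = xs.min(), xs.max()
topmost, bottommost = ys.min(), ys.max()
print(true_points.shape, (leftmost, topmost), (rightmost, bottommost))
```

The resulting array can be passed on to other NumPy or OpenCV routines without converting from a Python list first.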

jeremy_rutman answered Oct 23 '25 08:10


