
How to detect image translation with only numpy and PIL

Given two images, I need to detect if there is a translation offset between the two. I am only able to use numpy and PIL.

This post shows how to apply an (x, y) translation with PIL, but I haven't found anything similar for detecting the translation.

From what I've read, cross-correlation seems to be part of the solution, and there is the numpy.correlate function. However, I don't know how to use the output of this function to detect horizontal and vertical translation coordinates.

first image

second image

Jelly Wu asked Jan 23 '26 06:01



1 Answer

Since these are (almost) 2D arrays, you want the scipy.signal.correlate2d() function.

First, read your images and cast as arrays:

import numpy as np
from PIL import Image
import requests
import io

image1 = "https://i.sstatic.net/lf2lc.png"
image2 = "https://i.sstatic.net/MMSdM.png"

img1 = np.asarray(Image.open(io.BytesIO(requests.get(image1).content)))
img2 = np.asarray(Image.open(io.BytesIO(requests.get(image2).content)))

# img2 is greyscale but stored with colour channels; average over the channel axis to get a 2D array.
img2 = np.mean(img2, axis=-1)
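
If the two images may arrive in different modes (one single-channel, one with colour channels), a small helper can normalise both in one step. The following is a minimal sketch, assuming Pillow's Image.convert("L") is an acceptable greyscale conversion (it uses a weighted luma formula rather than the plain channel mean used above). The name load_gray is hypothetical, and the demo image is synthetic rather than one of the images from the question.

```python
import io

import numpy as np
from PIL import Image


def load_gray(source):
    """Open an image from a path or file-like object and return a 2D float array."""
    with Image.open(source) as im:
        # "L" is Pillow's 8-bit single-channel (greyscale) mode.
        return np.asarray(im.convert("L"), dtype=float)


# Demo with a synthetic 4x3 RGB image written to an in-memory PNG.
buf = io.BytesIO()
Image.new("RGB", (4, 3), (10, 20, 30)).save(buf, format="PNG")
buf.seek(0)
demo = load_gray(buf)
print(demo.shape)  # (height, width)
```

With the images from the question, the same helper could be called on io.BytesIO(requests.get(url).content) in place of the two np.asarray lines.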

Now that we have both images as 2D arrays, we can adapt the example in the scipy.signal.correlate2d() documentation:

from scipy import signal

corr = signal.correlate2d(img1, img2, mode='same')

If you need to avoid scipy (the question allows only numpy and PIL), this FFT-based version should be equivalent, provided img1 and img2 have the same shape:

pad = np.max(img1.shape) // 2
fft1 = np.fft.fft2(np.pad(img1, pad))
fft2 = np.fft.fft2(np.pad(img2, pad))
prod = fft1 * fft2.conj()
result_full = np.fft.fftshift(np.fft.ifft2(prod))
corr = result_full.real[1+pad:-pad+1, 1+pad:-pad+1]

Now we can compute the position of the maximum correlation:

y, x = np.unravel_index(np.argmax(corr), corr.shape)
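
The peak position can also be turned directly into a signed (dy, dx) offset using numpy alone. The sketch below is one way to do that, not the method used in the answer above: it assumes both arrays have the same shape, treats the shift as circular (content that leaves one edge re-enters at the opposite edge), and subtracts the mean so that overall brightness does not dominate the peak. The function name estimate_shift and the random test image are hypothetical.

```python
import numpy as np


def estimate_shift(ref, moved):
    """Return (dy, dx) such that np.roll(ref, (dy, dx), axis=(0, 1)) best matches moved."""
    ref = np.asarray(ref, dtype=float)
    moved = np.asarray(moved, dtype=float)
    # Remove the mean so a uniformly bright image does not swamp the peak.
    ref = ref - ref.mean()
    moved = moved - moved.mean()
    # Circular cross-correlation via the FFT: ifft(F(moved) * conj(F(ref))).
    corr = np.fft.ifft2(np.fft.fft2(moved) * np.fft.fft2(ref).conj()).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))


# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
ref = rng.random((64, 48))
moved = np.roll(ref, (5, -7), axis=(0, 1))
print(estimate_shift(ref, moved))  # prints (5, -7)
```

For real photographs the shift is not circular, so zero-padding both images first (as in the numpy version above) is likely to give a cleaner peak.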

Now we can visualize the result, again adapting the documentation example:

import matplotlib.pyplot as plt

y2, x2 = np.array(img2.shape) // 2

fig, (ax_img1, ax_img2, ax_corr) = plt.subplots(1, 3, figsize=(15, 5))
im = ax_img1.imshow(img1, cmap='gray')
ax_img1.set_title('img1')
ax_img2.imshow(img2, cmap='gray')
ax_img2.set_title('img2')
im = ax_corr.imshow(corr, cmap='viridis')
ax_corr.set_title('Cross-correlation')
ax_img1.plot(x, y, 'ro')
ax_img2.plot(x2, y2, 'go')
ax_corr.plot(x, y, 'ro')
plt.show()

The green point is the centre of img2. The red point is the position in img1 at which placing the green point gives the maximum correlation, so the difference between the two, (x - x2, y - y2), is the estimated translation.

img1, img2 and correlation

Matt Hall answered Jan 24 '26 21:01
