I will have two images.
They will be either the same or almost the same.
But sometimes either of the images may have been moved by a few pixels on either axis.
What would be the best way to detect if there is such a move going on?
Or, better still, what would be the best way to adjust the images to compensate for this unwanted movement?
If the images are really nearly identical, and are simply translated (i.e. not skewed, rotated, scaled, etc), you could try using cross-correlation.
When you cross-correlate an image with itself (this is the auto-correlation), the maximum value will be at the center of the resulting matrix. If you shift the image vertically or horizontally and then cross-correlate with the original image the position of the maximum value will shift accordingly. By measuring the shift in the position of the maximum value, relative to the expected position, you can determine how far an image has been translated vertically and horizontally.
Here's a toy example in Python. Start by importing the required modules, generating a test image, and examining the auto-correlation:
import numpy as np
from scipy.signal import correlate2d 
# generate a test image
num_rows, num_cols = 40, 60
image = np.random.random((num_rows, num_cols))
# get the auto-correlation
correlated = correlate2d(image, image, mode='full')
# get the coordinates of the maximum value
max_coords = np.unravel_index(correlated.argmax(), correlated.shape)
This produces coordinates max_coords = (39, 59). Now to test the approach, shift the image to the right one column, add some random values on the left, and find the max value in the cross-correlation again:
image_translated = np.concatenate(
    (np.random.random((image.shape[0], 1)), image[:, :-1]), 
    axis=1)
correlated = correlate2d(image_translated, image, mode='full')
new_max_coords = np.unravel_index(correlated.argmax(), correlated.shape)
This gives new_max_coords = (39, 60), correctly indicating the image is offset horizontally by 1 (because np.array(new_max_coords) - np.array(max_coords) is [0, 1]). Using this information you can shift images to compensate for translation.
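To make that last step concrete, here is a minimal sketch that wraps the steps above into two helpers. The names estimate_shift and undo_shift are made up for illustration, and np.roll is just one way to shift back: it wraps pixels around the edges, so the border rows/columns of the result are not meaningful and you may want to crop them.

```python
import numpy as np
from scipy.signal import correlate2d

def estimate_shift(moved, reference):
    """Return (row_shift, col_shift) of `moved` relative to `reference`.

    Hypothetical helper: finds the zero-shift peak empirically from the
    auto-correlation of `reference`, exactly as in the example above.
    Assumes both images have the same shape.
    """
    auto = correlate2d(reference, reference, mode='full')
    zero = np.unravel_index(auto.argmax(), auto.shape)
    cross = correlate2d(moved, reference, mode='full')
    peak = np.unravel_index(cross.argmax(), cross.shape)
    return int(peak[0] - zero[0]), int(peak[1] - zero[1])

def undo_shift(moved, shift):
    """Shift `moved` back by `shift`; np.roll wraps around the edges."""
    return np.roll(moved, (-shift[0], -shift[1]), axis=(0, 1))

# Same setup as above: shift right by one column, pad with random values.
rng = np.random.default_rng(0)
image = rng.random((40, 60))
image_translated = np.concatenate((rng.random((40, 1)), image[:, :-1]), axis=1)

shift = estimate_shift(image_translated, image)
restored = undo_shift(image_translated, shift)
```

After this, restored matches image everywhere except the last column, which holds the wrapped-around padding.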
Note that, should you decide to go this way, you may have a few kinks to work out. Off-by-one errors abound when you try to determine, from the image dimensions alone, where the max coordinate 'should' be after correlation (i.e. to avoid computing the auto-correlation and finding these coordinates empirically), especially if the images have an even number of rows/columns. In the example above, the center is just [num_rows-1, num_cols-1], but I'm not sure if that's a safe assumption more generally.
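One possible way to sidestep that bookkeeping is scipy.signal.correlation_lags (available in SciPy 1.6 and later, so this assumes a reasonably recent install). It returns the lag that each index of a 1-D correlation corresponds to, and applying it once per axis gives the shift directly. The function name shift_from_lags below is made up for this sketch.

```python
import numpy as np
from scipy.signal import correlate2d, correlation_lags

def shift_from_lags(moved, reference):
    """Return (row_shift, col_shift) without computing an auto-correlation.

    Hypothetical helper: looks up the lag of the cross-correlation peak
    along each axis instead of hard-coding where the center should be.
    """
    correlated = correlate2d(moved, reference, mode='full')
    peak_row, peak_col = np.unravel_index(correlated.argmax(), correlated.shape)
    # Lags for each axis; arguments are the lengths of in1 and in2 on that axis.
    row_lags = correlation_lags(moved.shape[0], reference.shape[0], mode='full')
    col_lags = correlation_lags(moved.shape[1], reference.shape[1], mode='full')
    return int(row_lags[peak_row]), int(col_lags[peak_col])

# Try it on an image with odd dimensions, moved down 2 rows and right 3 columns.
rng = np.random.default_rng(1)
reference = rng.random((41, 61))
moved = np.roll(reference, (2, 3), axis=(0, 1))
print(shift_from_lags(moved, reference))
```

Because the lags come from the array lengths themselves, the same code handles even and odd image sizes without special cases.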
But for many cases -- especially those with images that are almost exactly the same and only translated -- this approach should work quite well.