I'm processing a photo of the Earth from the Moon, as seen here.

What I want to do is get the contour points that correspond to the edge of the Earth that is not in shadow. (e.g., the areas that should form an ellipsoid). I want the physical points on the edge of globe itself, and I'm trying to avoid using Gaussian blurring to avoid data destruction.
I have this code that produces contours.
import numpy as np
import cv2 as cv
# Read in Image, convert to grayscale
image_path = r"moon_check.png"
im = cv.imread(image_path)
imgray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
# Threshold, then find contours
ret, thresh = cv.threshold(imgray, 0, 1, cv.THRESH_BINARY + cv.THRESH_OTSU)
contours, hierarchy = cv.findContours(thresh, cv.RETR_EXTERNAL , cv.CHAIN_APPROX_NONE)
contours = sorted(contours, key=cv.contourArea, reverse=True)
test = im
cnt = contours[0]
The resulting contour is pretty good:

... but I want to reduce the Contour elements to a non-contiguous one that excludes the shadows below the terminator, leaving me with only the curved elements near the top.
I've thus far tried the following with poor results:
Manual curvature sampling does work to a certain degree, but I've not found a good 2nd Order derivative check to exclude the regions at the bottom beyond some really manual tuning.
OpenCV minEnclosingCircle doesn't fit very well, possibly due to the oblate spheroid nature of the Earth. Plus, going from the fit to keeping the points closest to that fit is fairly inelegant. I guess I could apply it iteratively and filter by min-distance to the enclosing circle, but that doesn't seem particularly efficient.
I tried the OpenCV Hough's Circle, but it didn't work well using the grayscale image-- probably due to my avoidance of blurring the image etc.
The shadow contours make a classic circle fit method like Kasa's or Coope's not work well.
Convex matching should work for most of it, but many elements on the bottom are also convex.
Any suggestions on methods to explore? Should I be changing how I approach the initial contour creation in the first place?
(but good to know about if you don't know)
I would suggest RANSAC, as I almost always do when you have a regression whose problem is not accuracy of the data, but numerous outliers. Even more when it is a linear regression (theoretically RANSAC works with any regression. But in practice, there are zillions of libraries with RANSAC applied to linear regression, and very few with more general models).
But in your case it is even simpler. You don't really need some random choosing of the points (well, that almost what I am about to do. But not completely random), since you know that the correct points are contiguous. So, you just have to choose a subsequence of your sequence of points that yield to the smallest error.
(Kåsa and Coope's method)
First of all, finding a circle from points is a linear problem.
You know that for each sample point (x,y), we should have
(x-x₀)² + (y-y₀)² = R²
That is
x² + x₀² - 2x.x₀ + y² + y₀² - 2y.y₀ - R² = 0
that is
2x₀·x + 2y₀·y + (R²-x₀²-y₀²)·1 = x²+y²
where x₀, y₀ and R are the unknown I am looking for, that is center and radius of the circle.
So all you have to find are 2 coefficient (α=2x₀ and β=2y₀), and an intercept (γ=R²-x₀²-y₀²), such as αx+βy+γ = x²+y². Which is just a linear regression problem, easily solvable by a Moore-Penrose inverse.
So, if all your points were ok this code would find the circle
X=np.array([cnt[:,0,0],cnt[:,0,1],np.ones((len(cnt),))]).T # 3 columns of x, y and 1
Y=cnt[:,0,0]**2+cnt[:,0,1]**2 # column of x²+y²
# So we are looking for coefficient such as X@cf = Y. Which is done by Moore-Penrose
cf = np.linalg.inv(X.T@X)@X.T@Y # Those are our α, β, γ
# Now we just have to deduce x0, y0, R from that
x0 = cf[0]/2 # since α=2x₀
y0 = cf[1]/2 # since β=2y₀
R = np.sqrt(cf[2]+x0**2+y0**2) # Since γ=R²-x₀²-y₀²
But of course, it doesn't work, since most of the example points are not on the circle (from your image, one may believe that you have some 60% correct points, but in reality, the fractal nature of the shadow is such as it is more something like 15% correct points. 85% are the shadow border)
And a linear regression find a compromise between BS and correctness.
RANSAC algorithm try to find by trial an error a subset of points such as the number of points that are close to the model is big enough. Then increase the subset with all points close enough. Redo the regression.
What I do is almost that. Except that I don't need the increase part, nor really the random part. I just bet that there is a subset of W=500 consecutive points that fit well an arc of circle.
W=500 # Number of consecutive points that may be on the circle
results=[]
for i in range(0, len(cnt)-W, 50):
    # Subset
    XX=X[i:i+W]
    YY=Y[i:i+W]
    # Same as before, on subset
    cf=np.linalg.inv(XX.T@XX)@XX.T@YY
    x0=cf[0]/2
    y0=cf[1]/2
    R=np.sqrt(cf[2]+x0**2+y0**2)
    # Error computation
    rmse = ((np.sqrt(((XX[:,:2]-[x0,y0])**2).sum(axis=1))-R)**2).mean() # RMSE
    results.append([x0,y0,R,rmse])
results=np.array(results)
bestI = np.argmin(results[:,3])
x0,y0,R=results[bestI, :3]

See how we can't barely see the green point, except in the shadow border (because the circle is almost perfect — and my screenshot lowres :D)
We could improve that, of course, by doing what would be the next step of that "not-random ransac" (we have have done now, is finding a subset of points that match well an arc circle. And find the circle parameters for that subset, that is a model. Now that we have done so, we could go back to the whole set of points, select all those that are close enough to our model circle. And redo a regression again, to fine tune accuracy).
Which happens to be what we need to do to answer the orginal question
All that was to answer to your different trials of HoughCircle, MinCircle, ... But your title question is to remove outlier.
For that, I can simply compute the distance from the center to all points, and drop them if they are too far. Say 3 pixels from it
dist = np.sqrt((x0-X[:,0])**2 + (y0-X[:,1])**2)
err = (dist-R)**2
XX=X[err<=9]
That XX are the outline points
Which ends my main answer.
We can now, if we want, do a bit what RANSAC does: now that we have filtered the points, we can do a new regression from them, rather than from a subset of them
YY=Y[err<=9]
cf=np.linalg.inv(XX.T@XX)@XX.T@YY
x0=cf[0]/2
y0=cf[1]/2
R=np.sqrt(cf[2]+x0**2+y0**2)
And then, from there, if we wish, refilter the points by their error with this new model (which differs by less that 1 pixel from the previous, yet, it is slightly better). Then do a regression again. Then filter again. Etc.
In your case, since you know the good/bad points are contiguous, you could also just try to find the index interval of the bad points
np.argmax(err>100), len(err)-np.argmax(err[::-1]>100)
Filtered with a lenient threshold (10²)
is 479, 2631. So question could be turned into "which subset np.vstack([X[:a], X[b:]]), with a≤479 and b≥2631, gives the best regression.
You said
minEnclosingCircle doesn't fit very well, possibly due to the oblate spheroid nature of the Earth
There is no oblate spheroid nature of the Earth. Well, there is, of course. That is physics. Rotating planets are not perfect sphere. But that is way less important that people tend to believe, because of exaggerated pictures (a little bit like the misconception about impact of mountains and abyss. In both case we are talking something in the order of 20 kms, compared to a diameter of 12700 kms. That is barely 2 pixels on your image.
For most coder, the earth is a sphere. People who need to know that it is an oblate spheroid, not a sphere, are usually not coders :D (and if they are, they have way better image that you do :D)
I tried the OpenCV Hough's Circle, but it didn't work well using the grayscale image-- probably due to my avoidance of blurring the image etc.
No. It never does. (I am kidding of course. But every time I need to detect some circle in an image — and that has happened a few times, always with some specificity that make impossible to reuse directly the previous time method — I try it. And every time, I need to find something else). If wouldn't blame bluring on here. By its nature, Hough need some binning.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With