
How do I calculate a line from a series of points?

Probably an easy question, but I have not found an easy solution so far. I'm working on a simple piece of image recognition software for a very specific use case.

I have a set of points that are supposed to lie on a straight line. However, some of the points are misplaced and lie away from the line. Points near the ends of the line in particular tend to be inaccurate.

Example:

   X            // this guy is off
         X      // this one even more
 X              // looks fine
 X
  X
      X         // a mistake in the middle
  X
     X          // another mistake, not as bad as the previous
   X
    X
   X
    X
         X      // we're off the line again

The general direction of the line is known; in this case it is vertical. The actual line in the example is nearly vertical, with a slight slope.

I'm only interested in the infinite line (that is, its slope and offset); the position of the endpoints is not important.

As additional information (not sure if it is important), two points can never lie next to each other horizontally. Example:

   X
   X
    X
   X X   // cannot happen
    X
     X

Performance is not important. I'm working in C#, but I'm fine with any language or just a generic idea, too.

mafu asked Oct 27 '25 06:10


2 Answers

Linear regression (as mentioned by others) is good if you know you do not have outliers.

If you do have outliers, then one of my favorite methods is the median-median line: http://education.uncc.edu/droyster/courses/spring00/maed3103/Median-Median_Line.htm

Basically, you sort the points by their X values and split them into three groups of equal size (smallest, middle, and largest X values). For each group, take the median X and the median Y as its summary point. The final slope is the slope of the line through the summary points of the first and last groups. The median of the middle group is then used together with the other two medians to calculate the final offset/intercept, typically by shifting that line one third of the way toward the middle summary point.

This is a simple algorithm that can be found on several graphing calculators.

Because only the three medians are used, outliers (whether far left, far right, far up, or far down) have little or no influence on the result, as long as they are a minority within each group.
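The steps above can be sketched in a few lines. This is a minimal Python sketch, not a reference implementation: the function name `median_median_line` is made up for illustration, and the way leftover points are distributed when the count is not divisible by three is one common convention that I am assuming here. Since the line in the question is nearly vertical, I would pass the points as `(y, x)` pairs, so the row is the independent variable and the fit has the form x = slope * y + intercept.

```python
from statistics import median


def median_median_line(points):
    """Fit v = slope * u + intercept to (u, v) pairs using the median-median method.

    For a nearly vertical line, pass (y, x) pairs so that u is the row.
    """
    pts = sorted(points)  # sort by u (ties broken by v)
    n = len(pts)
    if n < 3:
        raise ValueError("need at least three points")

    # Split into three groups. The two outer groups always have the same size;
    # a remainder of 1 goes to the middle group, a remainder of 2 to the outer
    # groups (an assumed convention, other splits are possible).
    base, rem = divmod(n, 3)
    outer = base + (1 if rem == 2 else 0)
    groups = [pts[:outer], pts[outer:n - outer], pts[n - outer:]]

    # Summary point of each group: median of u and median of v, taken separately.
    (u1, v1), (u2, v2), (u3, v3) = [
        (median(u for u, _ in g), median(v for _, v in g)) for g in groups
    ]
    if u3 == u1:
        raise ValueError("outer groups share the same median u; slope is undefined")

    # Slope from the outer summary points only.
    slope = (v3 - v1) / (u3 - u1)

    # Averaging the three intercepts is equivalent to shifting the outer line
    # one third of the way toward the middle summary point.
    intercept = ((v1 - slope * u1) + (v2 - slope * u2) + (v3 - slope * u3)) / 3
    return slope, intercept


# Example: points on x = 0.5 * y + 2, with two rows replaced by gross outliers.
sample = [(y, 0.5 * y + 2) for y in range(9)]
sample[5] = (5, 100.0)
sample[8] = (8, 100.0)
print(median_median_line(sample))
```

With only a few points per group, a single outlier can still move a group's median slightly, so the more points you have per group, the better this holds up.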

The image below shows the linear regression and median-median lines for a set of data with a couple of large outliers.

[Image: Linear Regression vs. Median-Median]

Jason Moore answered Oct 29 '25 08:10


I think you're looking for a least-squares fit via linear regression.
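For illustration, here is a minimal Python sketch of an ordinary least-squares line fit. The function name `least_squares_line` is hypothetical, and I am assuming that, because the line in the question is nearly vertical, the points are passed as `(y, x)` pairs so the fit has the form x = slope * y + intercept (fitting y against x would give a huge, unstable slope). Note that plain least squares is sensitive to outliers, so badly misplaced points will pull the result.

```python
def least_squares_line(points):
    """Ordinary least-squares fit of v = slope * u + intercept to (u, v) pairs.

    For a nearly vertical line, pass (y, x) pairs so that u is the row.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    mean_u = sum(u for u, _ in points) / n
    mean_v = sum(v for _, v in points) / n

    # Sum of squared deviations of u, and co-deviation of u and v.
    s_uu = sum((u - mean_u) ** 2 for u, _ in points)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in points)
    if s_uu == 0:
        raise ValueError("all u values are identical; slope is undefined")

    slope = s_uv / s_uu
    intercept = mean_v - slope * mean_u
    return slope, intercept


# Example: points that lie exactly on x = 0.5 * y + 2.
sample = [(y, 0.5 * y + 2) for y in range(9)]
print(least_squares_line(sample))
```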

Mike Pennington answered Oct 29 '25 10:10
