HP Forums

Full Version: HP 48G Linear Regression Best Fit Line
Is the HP 48G scatterplot function the right function to use for determining the best fit line for a set of points? These points are the x,y coordinates representing the property corners of a piece of land. YouTube videos explaining best fit line show the points in reference to the y-axis. Surveyors use points perpendicular to the best fit line. Is it possible to get residuals that are perpendicular to the best fit line? Surveyors would interpret the residuals as left or right of the line.
The least-squares fit is fairly easy. Since the objective function is quadratic, setting its derivatives to zero gives a linear system, which can be solved with ordinary linear algebra.

Minimizing the sum of absolute values of the vertical distances is non-linear and quite difficult. (Minimizing the maximum distance isn't easy either.)

One must decide why to use one type of fitting or the other. In many cases there is little or no error in the X values, so the regression of Y on X is reasonable. If the errors in the X and Y values are comparable (as in the surveying problem already mentioned), then minimizing the perpendicular distances is appropriate.

http://www.optimization-online.org/DB_FI...6/6065.pdf

This paper gives a method of solving the problem. One of my students chose this as a computer project and he did find a set of third-degree simultaneous equations and used Newton's method on them.
MNH, have a look at the topic "Orthogonal regression" near the bottom of this page: https://en.wikipedia.org/wiki/Deming_regression

Is that what you are looking for?

Can you give us a set of points to use as test data where you already know the result a surveyor would get?
(12-06-2021 01:24 AM)ttw Wrote: [ -> ]This paper gives a method of solving the problem.

The paper is above and beyond my current knowledge of mathematics. I only work in two-dimensional space.
(12-08-2021 10:41 AM)Rodger Rosenbaum Wrote: [ -> ]MNH, have a look at the topic "Orthogonal regression" near the bottom of this page: https://en.wikipedia.org/wiki/Deming_regression
Is that what you are looking for?

After a quick glance, yes.

Quote:Can you give us a set of points to use as test data where you already know the result a surveyor would get?

Yes. I'll post some data tomorrow evening.
Your first post mentioned Youtube videos. Can you give us a link to them?
(12-09-2021 05:21 AM)Rodger Rosenbaum Wrote: [ -> ]Can you give us a link to them?

Here's one of them.
(12-09-2021 01:34 AM)MNH Wrote: [ -> ]
(12-08-2021 10:41 AM)Rodger Rosenbaum Wrote: [ -> ]MNH, have a look at the topic "Orthogonal regression" near the bottom of this page: https://en.wikipedia.org/wiki/Deming_regression
Is that what you are looking for?

After a quick glance, yes.

Note that the above is an orthogonal least-squares (ORLS) regression rather than an orthogonal least-absolute-value (ORLAV) regression. There are algorithms for the latter (ORLAV), such as those described in the following:

1. T. M. Cavalier and B. J. Melloy, An Iterative Linear Programming Solution to the Euclidean Regression Model. Computers Ops Res. 18, 655-661 (1991).
2. A. Bargiela and J. K. Hartley, Orthogonal Linear Regression Algorithm Based on Augmented Matrix Formulation. Nottingham Trent University, Burton St, Nottingham NG1 4BU.

Here is a link to a copy of the second paper cited above (a bit math heavy, perhaps): https://citeseerx.ist.psu.edu/viewdoc/do...1&type=pdf
(12-09-2021 10:05 PM)ijabbott Wrote: [ -> ]Here is a link to a copy of the second paper cited above (a bit math heavy, perhaps): https://citeseerx.ist.psu.edu/viewdoc/do...1&type=pdf

Interesting how the outliers are rejected when using the ORLAV regression. Can the outliers be parameterized? I prefer rejecting the outliers myself. Surveying is often subjective.
MNH, don't forget to post a few data points that I can use to demonstrate how to perform orthogonal regression on the HP48G. It would be good if you also have the known regression for the data to verify our results.
(12-10-2021 05:51 AM)Rodger Rosenbaum Wrote: [ -> ]MNH, don't forget to post a few data points that I can use to demonstrate how to perform orthogonal regression on the HP48G.

Look for the data this evening. I worked late yesterday. Thank you for your continued interest!
Rodger Rosenbaum:

221,1530857.718,521617.749
222,1530857.053,521555.311
223,1530856.348,521492.757
225,1530851.655,521282.045
226,1530851.271,521161.978
227,1530851.231,521111.996
228,1530850.933,521062.028
229,1530850.418,520962.341
230,1530850.481,520912.131
232,1530858.981,521742.725
233,1530858.372,521680.206
234,1530850.268,520862.091
237,1530850.066,520812.156

The data is listed as Point #, Northing (y-coordinate), Easting (x-coordinate).
The beginning and ending points can be renumbered and described to differentiate them from their aliases. If a weight (option <0>) is assigned to the beginning and ending points, those points will be held and no best fit line will be calculated for them.
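As a cross-check on the calculator results discussed later in the thread, here is a minimal Python sketch (mine, not from the thread; plain Python, no libraries) that runs an ordinary least-squares fit on the posted points, taking Easting as X and Northing as Y as described above:

```python
# Ordinary least-squares fit of the posted survey points.
# Columns: Point #, Northing (y), Easting (x).
data = [
    (221, 1530857.718, 521617.749),
    (222, 1530857.053, 521555.311),
    (223, 1530856.348, 521492.757),
    (225, 1530851.655, 521282.045),
    (226, 1530851.271, 521161.978),
    (227, 1530851.231, 521111.996),
    (228, 1530850.933, 521062.028),
    (229, 1530850.418, 520962.341),
    (230, 1530850.481, 520912.131),
    (232, 1530858.981, 521742.725),
    (233, 1530858.372, 521680.206),
    (234, 1530850.268, 520862.091),
    (237, 1530850.066, 520812.156),
]
xs = [e for _, n, e in data]   # Easting
ys = [n for _, n, e in data]   # Northing

n = len(data)
mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)

slope = sxy / sxx              # regression of Northing on Easting
intercept = my - slope * mx
print(slope, intercept)        # ≈ 0.0104145 and ≈ 1525424.9
```

The slope and intercept agree with the ordinary (not orthogonal) LS fit reported later in the thread.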
Do you have a result for regression of these data by some other software that you could post?

I'm typing in the values as I watch the evening news. :-)
(12-11-2021 03:16 AM)Rodger Rosenbaum Wrote: [ -> ]Do you have a result for regression of these data by some other software that you could post?

You're asking for too much, buddy! :-) Sorry, but I only use one brand of office survey software. I would like to write a program for my Emu48 (Samsung Galaxy S10e), which I always have with me at work. I don't want to carry a separate calculator because it might get damaged. We have field computers (Windows 10) with other survey software, but I'm pretty sure it doesn't have a best fit line routine.
I plotted your data and I get this:

Do you want to fit the apparent two groupings separately, or the whole thing as one line?
(12-11-2021 04:47 AM)Rodger Rosenbaum Wrote: [ -> ]Do you want to fit the apparent two groupings separately, or the whole thing as one line?

The whole thing as one line. The separation into two groupings arises because they represent property corners along the right-of-way of two different blocks. The goal is to compare the calculated best fit line on one side of the street with points on the opposite side, using a perpendicular offset routine to those points. We then compare the right-of-way width of the located points (what's in the ground) with the recorded plat (map) right-of-way width.
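The perpendicular-offset comparison described above can be sketched in a few lines of Python (a hypothetical helper of mine, not part of any survey package): given a fitted line y = m·x + b, the signed perpendicular distance tells you which side of the line a point falls on and by how much.

```python
import math

def perpendicular_offset(x0, y0, m, b):
    """Signed perpendicular distance from (x0, y0) to the line y = m*x + b.
    Positive when the point lies above the line (larger y), negative when
    below -- i.e. left/right once a direction along the line is chosen."""
    return (y0 - (m * x0 + b)) / math.sqrt(1.0 + m * m)

# Example with made-up numbers: a point 2 units above the line y = 0.
print(perpendicular_offset(5.0, 2.0, 0.0, 0.0))  # → 2.0
```

With the best-fit line from one side of the street, calling this on each point from the opposite side would give the located right-of-way widths directly.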
Here's what I get for an orthogonal LS fit:

The slope and intercept for the orthogonal fit are:

slope = .010414546538
intercept = 1525424.85897

For an ordinary (not orthogonal) LS fit I get:

slope = .0104144648553
intercept = 1525424.90155

There is more than one way to do the orthogonal fit. I'll post some more later.
(12-08-2021 10:41 AM)Rodger Rosenbaum Wrote: [ -> ]MNH, have a look at the topic "Orthogonal regression" near the bottom of this page: https://en.wikipedia.org/wiki/Deming_regression

There is a flaw in the formula: it assumes the slope is always positive.
Instead, we should solve the quadratic and throw away the spurious root.

Using MNH's supplied example (solving for the orthogonal slope only).
For better accuracy, (X,Y) are shifted so that ΣX = ΣY = 0.
The shifted data have the same regression slope, with simpler calculations.

Calculations done in XCas (data already somewhat shifted, to reduce typing):

Y := [7.718, 7.053, 6.348, 1.655, 1.271, 1.231,
0.933, 0.418, 0.481, 8.981, 8.372, 0.268, 0.066]:;
X := [1617.749,1555.311,1492.757,1282.045,1161.978,1111.996,
1062.028,962.341,912.131,1742.725,1680.206,862.091,812.156]:;

X, Y := X .- mean(X), Y .- mean(Y) :;
N := len(X); // → 13

B1 := dot(X,Y) / dot(X,X) // slope (ind. var. X)
B2 := dot(X,Y) / dot(Y,Y) // slope (ind. var. Y)

XCas> B1*B2 // R² = 0.93256592979
XCas> [B1, 1/B2] // orthogonal fit line slope range

[0.0104144648675, 0.0111675373664]

Note: if R² = 1, orthogonal slope = B1

XCas> proot([1, 1/B1-1/B2, -1]) // ≡ proot([1, 1/B1-B1/R², -1])
[-96.0195430799, 0.0104145465384]

Product of roots = -1 implies the two roots have opposite signs.
The slope we want (the second root, 0.0104145465384) has the same sign as B1.

---

Proof: Let m = orthogonal line slope

If we rotate the (x,y) data by atan(-m), its linear regression slope must be 0.

X+Y*i = (x+y*i) * (1-m*i) = (x+y*m) + (y-x*m)*i

Let S(XY) = Σ((X-(ΣX/n)) * (Y-(ΣY/n))) = ΣXY - ΣX*ΣY/n
Linear regression slope = S(XY)/S(XX) = 0

S(XY) = 0
ΣXY = ΣX*ΣY / n
Σ((x+y*m)*(y-x*m)) = Σ(x+y*m) * Σ(y-x*m) / n
(1-m²)*Σxy + (Σy² - Σx²)*m = (1-m²)*Σx*Σy/n + m*((Σy)²-(Σx)²)/n
(1-m²) * S(xy) + m * (S(yy) - S(xx)) = 0
m² + (S(xx) - S(yy))/S(xy) * m - 1 = 0
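As a sketch (mine, not from the thread), the final quadratic above can be solved numerically in Python on the same shifted data, keeping the root whose sign matches S(xy):

```python
import math

# The derivation ends in m^2 + (Sxx - Syy)/Sxy * m - 1 = 0.
# Solve it directly and keep the root with the same sign as Sxy
# (equivalently, the same sign as the ordinary slope B1).
Y = [7.718, 7.053, 6.348, 1.655, 1.271, 1.231,
     0.933, 0.418, 0.481, 8.981, 8.372, 0.268, 0.066]
X = [1617.749, 1555.311, 1492.757, 1282.045, 1161.978, 1111.996,
     1062.028, 962.341, 912.131, 1742.725, 1680.206, 862.091, 812.156]

n = len(X)
mx, my = sum(X) / n, sum(Y) / n
sxx = sum((x - mx) ** 2 for x in X)
syy = sum((y - my) ** 2 for y in Y)
sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))

c = (sxx - syy) / sxy                 # coefficient of m
d = math.sqrt(c * c + 4.0)            # sqrt of discriminant of m^2 + c*m - 1
roots = ((-c + d) / 2.0, (-c - d) / 2.0)
m = next(r for r in roots if r * sxy > 0)   # root with the sign of Sxy
print(m)   # ≈ 0.0104145465
```

The two roots reproduce the proot result above, and the retained root is the orthogonal slope.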
MNH, of course, wants to do this on his HP48G (Emu48, I think).

I wouldn't use the technique described on that Wikipedia page; there's no need to involve complex variables.

It can be done with the built-in LR function. If the data are entered in the standard way for two-variable statistics, ΣDAT will contain the X and Y data. The various summary statistics variables are used to compute the linear regression (LR), and a small program can compute the orthogonal regression from those same variables.

I called the program ORTH and used it to solve MNH's problem.

Here it is:

<< LR SWAP DROP DUP CORR DUP SIGN 4 ROLLD SQ / SWAP INV -
2 / DUP SQ 1 + √ ROT * + DUP ΣX * NEG ΣY + NΣ / >>

To use this, enter the data in the usual way, as if preparing to use the
LR function. The result will be the slope and intercept of the line that minimizes the sum of squared deviations, with the deviations measured perpendicular to the line.
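For readers without an HP 48 handy, here is a Python transliteration of the formula the ORTH program appears to evaluate (my reading of the RPL, so treat it as a sketch): with ordinary LS slope b and correlation r, it computes t = (b/r² − 1/b)/2, then m = t + sign(r)·√(t² + 1), and the intercept as (ΣY − m·ΣX)/N.

```python
import math

def orth_from_summary(xs, ys):
    """Orthogonal-regression slope and intercept from the same summary
    statistics the HP 48G LR/CORR commands use (sketch, my reading of ORTH)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                      # ordinary LS slope (LR)
    r = sxy / math.sqrt(sxx * syy)     # correlation (CORR)
    t = (b / (r * r) - 1.0 / b) / 2.0
    m = t + math.copysign(math.sqrt(t * t + 1.0), r)
    return m, my - m * mx              # intercept = (ΣY - m*ΣX)/N

xs = [521617.749, 521555.311, 521492.757, 521282.045, 521161.978,
      521111.996, 521062.028, 520962.341, 520912.131, 521742.725,
      521680.206, 520862.091, 520812.156]
ys = [1530857.718, 1530857.053, 1530856.348, 1530851.655, 1530851.271,
      1530851.231, 1530850.933, 1530850.418, 1530850.481, 1530858.981,
      1530858.372, 1530850.268, 1530850.066]
m, b0 = orth_from_summary(xs, ys)
print(m, b0)   # ≈ 0.0104145465, 1525424.859
```

Note b/r² = 1/B2 and 1/b = 1/B1 in the earlier XCas notation, so this is the same quadratic solved with the sign(r) branch chosen automatically.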
(12-11-2021 05:38 PM)Rodger Rosenbaum Wrote: [ -> ]It can be done with the built-in LR function.

Can the LR function be executed within a program? Thank you for taking your time to help me out! I'll get down to some programming tomorrow.
Reference URL's
• HP Forums: https://www.hpmuseum.org/forum/index.php