Accuracy of Quadratic Regression

08202023, 05:07 PM
Post: #1




Accuracy of Quadratic Regression
Dear HP Prime User Thread
There seems to be a subtlety with the quadratic regression that I cannot figure out: the HP Prime results for a sample problem are subtly different from those from MS Excel, an on line tool (https://ezcalc.me/quadraticregressioncalculator/) and TI84 x y 2003 813 2004 941 2005 962 2006 1053 2007 1132 2008 1194 2009 1205 2010 1244 2011 1254 2012 1262 > Excel, and other tools calculate the quadratic fit as: y = 5.46969697 X^2 + 22010.28788 X 22141315.33 > HP calculates it is: y = 5.47571307 X^2 + 22034.44253 X 22165560.54 It is close ... but annoyingly different. I played with HOME/CAS settings and set home number format to standard and floating 11 and CAS format to standard 12 to no avail. I am not sure if this is related to floating point accuracy (although I thought that HP Prime has better accuracy than TI84), or I may be missing a special setting. I am a long time HP28S and HP4G user and HP fan in general and recently purchased my daughter an HP Prime. I would appreciate help/insight into this scenario Thx PS: The linear fit matches the other calculators exactly 

08212023, 10:03 AM
Post: #2




RE: Accuracy of Quadratic Regression
The correct coefficients are [361/66,1452679/66,66423946/3], you can get them from CAS with the following commands:
Code: m:=[[2003,813],[2004,941],[2005,962],[2006,1053],[2007,1132],[2008,1194],[2009,1205],[2010,1244],[2011,1254],[2012,1262]]; 

08212023, 05:52 PM
(This post was last modified: 08222023 11:15 AM by Albert Chan.)
Post: #3




RE: Accuracy of Quadratic Regression
(08202023 05:07 PM)bernhardSA Wrote: I am not sure if this is related to floating point accuracy Yes, it is due to floating point precision. (HOME have only 12 decimals) To solve for quadratic best fit, we need upto Σx²y (topright, m=2) Update: Worse, for quadratic fit, we need Σx^{4} (topleft, m=2) Generalized mean, M4 ≥ M1 OP example, Σx^{4} ≥ ((2003+2012)/2)^4 * 10 ≈ 1.624E14 // 15 digits! To get best fit coefficients, it may be better to reduce both x and y from its mean. We can simply knock down year 2000, (2003 .. 2012) > (3 .. 12) APPS quadratic fits get Y = 5.4696969697*X^2+131.5*X+472.545454545 CAS> symb2poly(5.4696969697*X^2+131.5*X+472.545454545  X=x2000 [−5.4696969697, 22010.2878788, −22141315.3333] This matched Excel results, y = 5.46969697 x^2 + 22010.28788 x 22141315.33 

08212023, 08:10 PM
(This post was last modified: 08212023 08:40 PM by Steve Simpkin.)
Post: #4




RE: Accuracy of Quadratic Regression
Interesting that the TI89 calculates the quadratic fit very close to the Excel and other tools values. Internally, the TI89 calculates and retains all decimal results with up to 14 significant digits (although a maximum of 12 are displayed).
Update: Oddly the HP 50g does not calculate Quadratic regression as a builtin regression type. The program PREGR appears to be able to do this (and more). https://www.hpcalc.org/details/3151 

08212023, 09:21 PM
(This post was last modified: 08222023 03:33 AM by Steve Simpkin.)
Post: #5




RE: Accuracy of Quadratic Regression
For reference, the least expensive calculator that I have that can perform Quadratic Regression is the Casio fx991EX (about $22 USD). Its answer matched the Excel results exactly. For Statistics, this model will remember the data until you switch out of Statistics mode.
See photo of results at: https://drive.google.com/file/d/1JUvxsNx...sp=sharing Update: Here are the results of the Quadratic Regression calculation on the TI36X Pro (about $20 USD). I used the online TI36X Pro Emulator for this example. Its answer also matched the Excel results exactly. See photo of results at: https://drive.google.com/file/d/14BazIaV...sp=sharing 

08212023, 09:33 PM
Post: #6




RE: Accuracy of Quadratic Regression
Dear Thread Responders
I appreciate the quick and insightful answers. This has been very useful. The first set of "less accurate" results were via the Statistics2Var app. I have been able to get the same high accuracy results with CAS command line programming, and the modified xvector in the statistics app as suggested. It is a little surprising that HP chose to segregate the lower precision for HOME and higher precision for CAS. Why not simply do all the math at highest precision and display in most accurate floating point format or super snazzy symbolic/fraction/exact format as required by user. A high school student may not care (or appreciate HOME vs CAS subtlety)  they just need to get the same answer as the teacher's solution key, and using the statistics app may seem faster more attractive than explicitly typing in matrices and commands ... it stands to reason that they might expect the same answer regardless of the path. As a longstanding old school HP user and fan, and one who was trying to convert his high school freshman child to HP and RPN, this has been an unexpected speedbump. Again thx for the quick help (I will keep the HP and my high schooler will now get a TI) 

08222023, 06:35 AM
Post: #7




RE: Accuracy of Quadratic Regression
(08212023 09:33 PM)bernhardSA Wrote: As a longstanding old school HP user and fan, and one who was trying to convert his high school freshman child to HP and RPN, this has been an unexpected speedbump.I believe this is a disappointing conclusion. Getting exactly the same answer as the teacher should not be that important. This difference could be a starting point to have a deeper understanding of the math behind : 1/ Floating point is not exact, and depending on the representation you will get different results for the same computation. For bad conditionned problems, this might be very different. Here if you define y1:=5.46969697 x^2 + 22010.28788 x 22141315.33 and y2:= 5.47571307 x^2 + 22034.44253 x 22165560.54 and plot(y2y1,x=2003..2010), you see that the absolute difference is less than about 0.05 for data about 1e3, that's a relative error of less than 0.01%, something that is negligible. 2/ Accuracy can sometimes be improved, here by translating the data. 3/ It is a good idea to check how the answer changes with slightly modified data, this will spot badly conditionned problems. Quadratic regression is more sensitive than a linear regression (and higher degree regressions would be much more sensitive). It should not be used to extrapolate far from the initial data x interval. 

08222023, 07:27 AM
Post: #8




RE: Accuracy of Quadratic Regression
(08212023 09:33 PM)bernhardSA Wrote: A high school student may not care [...] A high school student should be aware of what they are doing. Calculation devices have limits, and we run into one here. My regular MLR/POLR regression programs for the 42S fail utterly here, not even getting a single digit correct; the version with orthogonal factorization gets 56 digits correct (and is thus already better than the Prime's result in HOME). But the ultimate lesson is what Albert shows us here (and in a zillion other posts): have a look at the input data, and remove possible causes of numerical instability or inaccuracy. Cheers, Werner 41CV†,42S,48GX,49G,DM42,DM41X,17BII,15CE,DM15L,12C,16CE 

08222023, 11:00 AM
(This post was last modified: 08222023 11:23 AM by Joe Horn.)
Post: #9




RE: Accuracy of Quadratic Regression
(08212023 05:52 PM)Albert Chan Wrote: To get best fit coefficients, it may be better to reduce both x and y from its mean. (08222023 06:35 AM)parisse Wrote: 2/ Accuracy can sometimes be improved, here by translating the data. Most of the manuals for the older HP calculators explained this idea of "normalizing" data to avoid roundoff errors. For example, here's what the HP32S manual says: HP32S Owner's Handbook Wrote: A cursory search did not find a similar explanation in the HP 50g or Prime owner's manuals. <0ɸ0> Joe 

08222023, 12:22 PM
(This post was last modified: 08222023 12:40 PM by Steve Simpkin.)
Post: #10




RE: Accuracy of Quadratic Regression
(08212023 09:21 PM)Steve Simpkin Wrote: For reference, the least expensive calculator that I have that can perform Quadratic Regression is the Casio fx991EX (about $22 USD). Its answer matched the Excel results exactly. For Statistics, this model will remember the data until you switch out of Statistics mode. Not only did the Casio fx991EX come up with an accurate answer, its QR code feature displayed all of the data in a table, the quadratic coefficients and a graph of the data on the following web page: Example of Quadratic Regression problem The web page allows you to edit or add additional data to the data in the table, zoom or scroll on the graph, add notes and change many aspects of the results. Pretty neat for a $22 calculator. 

08222023, 02:15 PM
Post: #11




RE: Accuracy of Quadratic Regression
I might have done something wrong, but the CFIT program in the curve fitting pack on my HP71 could not even converge using the POLY program!


08222023, 10:56 PM
Post: #12




RE: Accuracy of Quadratic Regression
If we use data relative to its mean, and X is uniformly spaced, by symmetry, Σ(x^(2k+1)) = 0
This saved a lot of calculations! X = [2003,2004,2005,2006,2007,2008,2009,2010,2011,2012] Y = [813,941,962,1053,1132,1194,1205,1244,1254,1262] x = X  X̅ = X  2007.5 y = Y  Y̅ = Y  1106.0 a*Σx³ + b*Σx² + c*Σx = Σxy // uniform X spacing, a, c terms goes away b = Σxy / Σx² = 4080 / 82.5 = 544/11 a*Σx² + b*Σx + c*Σ1 = Σy = 0 // uniform X spacing, b terms goes away a*Σx^{4} + b*Σx³ + c*Σx² = Σx²y a = (Σy Σx²  Σ1 Σx²y) / (Σx² Σx²  Σ1 Σx^{4}) = (0  10*2888) / (82.5^2  10*1208.625) = 361/66 c = (Σy  b*Σx  a*Σx²) / Σ1 = a*Σx² / Σ1 = 361/66 * 82.5 / 10 = 361/8 (Y1106.0) = (361/66)*(X2007.5)^2 + (544/11)*(X2007.5) + 361/8 > Y ≈ 5.4696969697*X^2 + 22010.2878788*X  22141315.3333 

08232023, 11:45 AM
Post: #13




RE: Accuracy of Quadratic Regression
(08222023 10:56 PM)Albert Chan Wrote: c = (Σy  b*Σx  a*Σx²) / Σ1 This is how Casio FX3650P get quadratic regression constant term. With c out of the way, we simply solve for 2 equations with 2 unknown. \(\begin{bmatrix} Σ1\;ΣX^2  ΣX\;ΣX & Σ1\;ΣX^3  ΣX\;ΣX^2\\ Σ1\;ΣX^3  ΣX\;ΣX^2 & Σ1\;ΣX^4  ΣX^2\;ΣX^2 \end{bmatrix} \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} Σ1\;ΣXY  ΣX\;ΣY\\ Σ1\;ΣX^2Y  ΣX^2\;ΣY \end{bmatrix} \) Divide both side by n = Σ1, we have Casio's notation \(\begin{bmatrix} SXX & SXX^2\\ SXX^2 & SX^2X^2 \end{bmatrix} \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} SXY\\ SX^2Y \end{bmatrix} \) 

08252023, 12:06 AM
Post: #14




RE: Accuracy of Quadratic Regression
(08222023 07:27 AM)Werner Wrote: My regular MLR/POLR regression programs for the 42S fail utterly here, not even getting a single digit correct; the version with orthogonal factorization gets 56 digits correct (and is thus already better than the Prime's result in HOME). X = 2003 .. 2012 = 2007.5 ± 4.5 With huge center, tiny gap, all the equations are similar, if scaled by \(\bar{X}\) \(\begin{bmatrix} 1.00001228271 & 1.00000614135 & 1.00000204712\\ 1.00000614135 & 1.00000204712 & 1.0\\ 1.00000204712 & 1.0 & 1.0 \end{bmatrix} \begin{bmatrix} a\;\bar{X}^2 \\ b\;\bar{X} \\ c \end{bmatrix} = \begin{bmatrix} 1106.40866817 \\ 1106.20323786 \\ 1106.0 \end{bmatrix} \) This make the matrix illconditioned. Tiniest of rounding errors may cause trouble. Starting from original matrix m, we pretreat it a bit before solving it. \(\left[\begin{array}{cccc}162415528660693&80903876075&40300645&44588891682\\80903876075&40300645&20075&22207030\\40300645&20075&10&11060\end{array}\right]\) Cas> m := float(m); // work with float ... no cheating Cas> m(1) = 2007 * m(2); Cas> m(2) = 2007 * m(3); \(\left(\begin{array}{cccc} 41449378168 & 20481560.0 & 10120.0 & 19382472.0 \\ 20481560.0 & 10120.0 & 5.0 & 9610.0 \\ 40300645.0 & 20075.0 & 10.0 & 11060.0 \end{array}\right) \) Cas> m(1) = 2023 * m(2); // once more, for good measure \(\left(\begin{array}{cccc} 15182288.0 & 8800.0 & 5.0 & 58558.0 \\ 20481560.0 & 10120.0 & 5.0 & 9610.0 \\ 40300645.0 & 20075.0 & 10.0 & 11060.0 \end{array}\right) \) Equations are now very different, easy to solve. Cas> rref(m) \(\left(\begin{array}{cccc} 1.0 & 0.0 & 0.0 & 5.46969696961 \\ 0.0 & 1.0 & 0.0 & 22010.2878784 \\ 0.0 & 0.0 & 1.0 & 22141315.333 \end{array}\right)\) 

09052023, 11:28 AM
Post: #15




RE: Accuracy of Quadratic Regression
HP did better job with the HP38G in 1995, 18 years before the Prime was released.
The HP38G (and HP40GS) results : 5.46969698931 x^2 + 22010.2879576 x  2214135.4125 Not as good as the "Excel results" but a lot better than the Prime. The HP38G like the Prime uses BCD arithmetic with a 12 digit mantissa and 3 digit exponent. I suspect there is an issue with the recoding of the BCD math routines from HP38/9/40 System RPL to platformindependent C for the HP39Gii which was then used in the Prime. The HP39Gii gives the same result as the Prime. 

09052023, 11:28 PM
(This post was last modified: 09052023 11:28 PM by jte.)
Post: #16




RE: Accuracy of Quadratic Regression
(08212023 05:52 PM)Albert Chan Wrote: ⋮ For conceptual simplicity / implementation ease, I used higher precision arithmetics for the curve fitting in the Function app’s ”Sketch” feature. (129bit signmagnitude for quadratic fits and 161bit signmagnitude for cubic fits.) My intention to use higher precision there was not just based on wanting to rule out any roundoff during the production of the entries of the design matrix, but to also bound the roundoff involved when solving for the coefficients (of the quadratic / cubic fit). 

09062023, 06:18 PM
Post: #17




RE: Accuracy of Quadratic Regression
Why do people keep on saying the low accuracy is due to the Home floating point precision being 12 digits when it is clear that this is not the case? Please explain how HP managed to do it better with 12 digit precision on the HP38G from 28 years ago.


« Next Oldest  Next Newest »

User(s) browsing this thread: 1 Guest(s)