06-09-2015, 03:09 PM

I need a sanity check here. I'm running a logarithmic regression on some data, and I'm getting very slightly different correlation coefficients (r, not r^2) from three different calculators.

HP 48SX: 0.968372745387

TI-36X Pro: 0.96835943144

TI-89 Stats flash app: 0.968372745387

TI-89 custom function: 0.968372683432

Notice the 48SX and TI-89 stats app match to all 12 digits, so I'm inclined to believe those are the most accurate. The 36X Pro may have lower internal precision, or it may be using a different method that is faster but less accurate.

The custom function I made for the TI-89 (since the built-in stat commands don't calculate correlation for logarithmic, exponential, or power regression for some reason) is also a little bit off. I used the formula shown about halfway down this page:

http://brownmath.com/ti83/regres89.htm

sum((x[i]-meanx)*(y[i]-meany),i,1,n)/((n-1)*sx*sy)

Where sx and sy are sample standard deviations of the x and y lists respectively. Also, the x list has been transformed with LN prior to any calculations.
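
For anyone who wants to check the formula away from a calculator, here is a minimal Python sketch of the same sum-of-products expression. The function name `log_corr` is just an illustrative choice, and Python's double-precision floats will not round the same way as any of the calculators, so treat it as a cross-check rather than a reference implementation.

```python
import math
import statistics

def log_corr(xs, ys):
    """Correlation r for a logarithmic fit y = a + b*ln(x),
    using sum((x-meanx)*(y-meany)) / ((n-1)*sx*sy)."""
    lx = [math.log(x) for x in xs]   # LN-transform the x list first
    n = len(lx)
    meanx = statistics.fmean(lx)
    meany = statistics.fmean(ys)
    sx = statistics.stdev(lx)        # sample standard deviation (n - 1 divisor)
    sy = statistics.stdev(ys)
    total = sum((a - meanx) * (b - meany) for a, b in zip(lx, ys))
    return total / ((n - 1) * sx * sy)
```

Calling `log_corr(x_list, y_list)` with the untransformed x values should return r for the logarithmic model, since the LN step happens inside the function.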

I have a feeling taking the sum of products is making it lose precision somewhere. And if that's the case, is there a better approach? I tried the z-score method given on that same page, basically moving the standard deviations into the products within the sum, but I end up with a repeating decimal that looks a bit fishy.

This is the data I'm looking at. Note that a logarithmic fit is NOT correct for this particular data, I'm just testing the correlation calculation.

1999, 8456

2000, 14959

2001, 13516

2002, 11298

2003, 11109

2004, 15256

2005, 29316

2006, 46038

2007, 51726

2008, 56686

2009, 58372

2010, 68426

2011, 70760

2012, 77238

2013, 100836

2014, 95461
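
One way to probe the precision question is to run the data above through several algebraically equivalent formulas and see where the digits start to disagree. The sketch below is only that, a sketch in Python doubles: it compares the centered sum of products, the z-score form, and the "raw sums" shortcut (n*Σxy - Σx*Σy, and so on). The function names are made up for illustration, and nothing here is a claim about what any of the calculators do internally. The raw-sums form is included because ln(1999) through ln(2014) are nearly equal, which makes that form subtract two almost identical large numbers.

```python
import math

YEARS = list(range(1999, 2015))
VALUES = [8456, 14959, 13516, 11298, 11109, 15256, 29316, 46038,
          51726, 56686, 58372, 68426, 70760, 77238, 100836, 95461]

def r_centered(x, y):
    """Subtract the means first, then sum products (math.fsum limits rounding)."""
    n = len(x)
    mx, my = math.fsum(x) / n, math.fsum(y) / n
    sxy = math.fsum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = math.fsum((a - mx) ** 2 for a in x)
    syy = math.fsum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_zscore(x, y):
    """Sum of products of z-scores, divided by n - 1."""
    n = len(x)
    mx, my = math.fsum(x) / n, math.fsum(y) / n
    sx = math.sqrt(math.fsum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(math.fsum((b - my) ** 2 for b in y) / (n - 1))
    return math.fsum(((a - mx) / sx) * ((b - my) / sy)
                     for a, b in zip(x, y)) / (n - 1)

def r_raw_sums(x, y):
    """Single-pass shortcut built from running sums; prone to cancellation."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))

lnx = [math.log(v) for v in YEARS]   # logarithmic model: transform x only
for f in (r_centered, r_zscore, r_raw_sums):
    print(f"{f.__name__:>12}: {f(lnx, VALUES):.12f}")
```

If the three printed values agree to more digits than the calculators do, that would suggest the differences come from working precision or the choice of formula on each device rather than from the sum of products as such.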
