Logistic Fit

01-17-2014, 11:43 PM
Post: #1




Logistic Fit
I have been stymied trying to get the Prime to perform a Logistic fit in the Statistics 2Var app. Whenever I choose "Logistic" in the Symb view and then press Plot, I get "Error: Invalid Input". Other function choices do fit the data (albeit badly). Has anyone else seen this behavior?


01-18-2014, 02:21 PM
Post: #2




RE: Logistic Fit
Please post the data you are trying to fit. That fit will misbehave very badly if you don't have data that is quite well behaved.
TW Although I work for the HP calculator group, the views and opinions I post here are my own. 

01-18-2014, 04:14 PM
Post: #3




RE: Logistic Fit
Tim,
The data was generated by adding random noise to a sigmoid function: c1:= -10:0.1:10; c2:= sigmoid(c1)+random()*0.2; where: sigmoid(x):= 1/(1+exp(-x))
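For readers who want to reproduce this kind of test data outside the calculator, here is a minimal Python sketch of the recipe above. It assumes the x range runs from -10 to 10 in steps of 0.1, that the intended sigmoid is the standard 1/(1+e^(-x)), and that random() is uniform on [0, 1). The names make_data, noise and seed are hypothetical and only serve the illustration.

```python
import math
import random

def sigmoid(x):
    """Standard logistic curve 1/(1+e^(-x)); assumed to be the intended function."""
    return 1.0 / (1.0 + math.exp(-x))

def make_data(lo=-10.0, hi=10.0, step=0.1, noise=0.2, seed=0):
    """Return (xs, ys) with ys = sigmoid(x) + uniform noise in [0, noise)."""
    rng = random.Random(seed)              # seeded so the run is repeatable
    n = int(round((hi - lo) / step)) + 1   # 201 points for the defaults
    xs = [lo + i * step for i in range(n)]
    ys = [sigmoid(x) + rng.random() * noise for x in xs]
    return xs, ys

xs, ys = make_data()
print(len(xs), min(ys), max(ys))
```

Note that the noise in this recipe is one-sided, so some y values can exceed 1; any approach that takes ln(P/(1-P)) of the raw data would need to rescale it first.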

01-19-2014, 09:48 PM
Post: #4




RE: Logistic Fit
Are you aware that your sigmoid function has a pole at x=0? So it doesn't seem to make much sense to fit from -10 to 10.
Even so, I tried to fit from 3 to 10 in 0.1 steps with a random noise added, and that also won't work with the logistic fit (with fixed L, A, B), so it seems to me that this problem could be investigated by HP. 

01-19-2014, 10:12 PM
Post: #5




RE: Logistic Fit  
01-20-2014, 05:44 AM
Post: #6




RE: Logistic Fit
Oops! I inadvertently entered 1/(1-e^(-x)), which is discontinuous at x=0! Of course, 1/(1+e^(-x)) is not, and behaves as shown. Thanks for pointing that out.
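A quick numeric check makes the difference between the two expressions concrete. This is only a sketch: it assumes the mistyped expression was 1/(1-e^(-x)) and the intended one 1/(1+e^(-x)); the function names are hypothetical.

```python
import math

def mistyped(x):
    # Assumed form of the mistyped expression: 1/(1 - e^(-x)).
    # The denominator is zero at x = 0, so this has a pole there
    # (in Python, mistyped(0.0) raises ZeroDivisionError).
    return 1.0 / (1.0 - math.exp(-x))

def sigmoid(x):
    # Assumed intended expression: 1/(1 + e^(-x)), smooth everywhere.
    return 1.0 / (1.0 + math.exp(-x))

for x in (-0.1, -0.001, 0.001, 0.1):
    print(f"x={x:+.3f}  mistyped={mistyped(x):+10.2f}  sigmoid={sigmoid(x):.4f}")
```

Approaching zero from either side, the mistyped version blows up with opposite signs, while the sigmoid stays close to 0.5.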


01-20-2014, 01:52 PM
Post: #7




RE: Logistic Fit
You will find, however, that it still doesn't fit.


01-20-2014, 03:01 PM
Post: #8




RE: Logistic Fit
I already found that out
Although, for certain subsets of the x range, no problem, e.g. 1 .. 5 with 0.1 step. 

01-20-2014, 06:44 PM
Post: #9




RE: Logistic Fit
Interesting...I hadn't noticed that.
Still, it seems that if it could fit any data reliably this would be the easiest set. Hopefully Tim can add it to the list of things to investigate for a future release. 

01-21-2014, 06:43 AM
Post: #10




RE: Logistic Fit
I suggest you prefix your thread title with "[BUG]"
Would be good to know if there will be a firmware update and what is in scope... 

01-21-2014, 04:46 PM
Post: #11




RE: Logistic Fit
Well, this logistic fit was pulled from the old HP math library, but frankly it was never any good because it is so sensitive to even minor changes in the numbers. When I was reimplementing it for the 39gII, I really wanted to replace it with a much better and more robust algorithm. However, after many, many fruitless days of searching (over many months), I was never able to find a good fit that behaved predictably. The biggest challenge with this type of fit is finding good initial estimates. A human can easily identify what a reasonable estimate is, and whether it should be an increasing or decreasing version, but finding an algorithm that matched what is desired proved ridiculously difficult.
If anyone has any recommendations or suggestions, I would totally welcome them.

01-22-2014, 04:02 AM
(This post was last modified: 01-22-2014 02:55 PM by Han.)
Post: #12




RE: Logistic Fit
Is the current implementation merely a linear regression of something similar to \( \mathrm{logit}(P) = \alpha + \beta x \) where \( \mathrm{logit}(P) = \ln( \frac{P}{1-P}) \)? I was naively thinking about taking the min and max values of \( P \) and normalizing the data to lie between 0+0.0000001 and 1-0.0000001 using a linear function (so that there are no issues with \( \mathrm{logit}(P) \)), doing a linear regression, and then applying the inverse of the normalizing function. I take it I'm forgetting something quite obvious...
Here's my naive approach in code (for data that is central around the origin). Code:
At the home screen: Code:
In the Statistics 2Var app, press [Num], select C0 (and then C1 and C2), and press "Make":
C0: Expression: L0(X), X starts from 1 to 201 step 1
C1: Expression: L1(X), X starts from 1 to 201 step 1
C2: Expression: use the formula given by logreg(L0,L1), X starts from -10 to 10 step .1
Hit [Plot] and ignore the error message. Change your plot settings accordingly. Here's a screenshot:
A smarter algorithm would check the \( R^2 \) value of the linear regression to see if outliers need to be filtered. Perhaps there may even be a preference for the points closer to the origin after normalization, since \( \ln (\frac{P}{1-P}) \) grows large for \( P \) values close to 0 and 1. Or perhaps do two linear regressions (one favoring points near the origin), compare the \( R^2 \) values, and choose the tighter fit.
Here's the linear regression of \( \ln (\frac{P}{1-P}) \) after \( P \) has been normalized in the example above.
Edit: this doesn't work for domains not centered about the origin.
Graph 3D  QPI  SolveSys
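The code from this post did not survive in the thread, so the following is not Han's program. It is a minimal Python sketch of the approach as described in the post: rescale the y values linearly into [eps, 1-eps], regress ln(P/(1-P)) on x by ordinary least squares, then undo the rescaling. The names logit_fit, eps, alpha, beta and predict are hypothetical, and the test data assumes a standard sigmoid with uniform noise on [0, 0.2).

```python
import math
import random

def logit_fit(xs, ys, eps=1e-7):
    """Fit a logistic curve by linear regression on the logit of rescaled y."""
    y_min, y_max = min(ys), max(ys)
    if y_max == y_min:
        raise ValueError("y values must not all be equal")
    scale = (1.0 - 2.0 * eps) / (y_max - y_min)
    # Linear map of [y_min, y_max] onto [eps, 1 - eps] so the logit is defined.
    ps = [eps + (y - y_min) * scale for y in ys]
    zs = [math.log(p / (1.0 - p)) for p in ps]
    # Ordinary least squares for z = alpha + beta * x.
    n = len(xs)
    mx, mz = sum(xs) / n, sum(zs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    beta = sxz / sxx
    alpha = mz - beta * mx

    def predict(x):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
        return y_min + (p - eps) / scale   # inverse of the normalizing map

    return alpha, beta, predict

# Noisy sigmoid data centered about the origin (assumed test case).
rng = random.Random(0)
xs = [-10 + 0.1 * i for i in range(201)]
ys = [1.0 / (1.0 + math.exp(-x)) + rng.random() * 0.2 for x in xs]
alpha, beta, predict = logit_fit(xs, ys)
print(alpha, beta, predict(0.0))
```

Because the minimum and maximum of the data are forced to eps and 1-eps, the fitted asymptotes are pinned to the observed extremes. That is the reason the method only suits data that spans both plateaus, as the edit note in the post points out.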

01-22-2014, 03:51 PM
Post: #13




RE: Logistic Fit
I think that is roughly what it does based on my rather fuzzy memory.

01-22-2014, 04:14 PM
Post: #14




RE: Logistic Fit
This looks good to me, and it works!
Another possibility would be to let the user specify starting values, along with the fit, and use something like the Levenberg-Marquardt method, in analogy to the Moda library (V1.52) on hpcalc.org for the HP49/50.

01-22-2014, 04:38 PM
(This post was last modified: 01-22-2014 04:40 PM by Han.)
Post: #15




RE: Logistic Fit
(01-22-2014 04:14 PM) Helge Gabert Wrote: This looks good to me, and it works!
I'm not sure if you were referring to my post (the one about a "naive" approach to logistic fitting), but if you were, do keep in mind that this technique is quite limited. For example, if the domain of the data is restricted to the interval \( [1,5] \) and its range lies in the interval \( [.5,2] \), then this technique fails. If we normalize the range \( [.5,2]\) to \( [P_{min}, P_{max} ] \), how does one determine whether \( P_{min} \) is closer to 0 or .1 or .5 or even .75 (and similarly, whether \( P_{max} \) should be 1 or much smaller)? The domain may be of some help. That is, if we're "far to the right" then \(P_{min}\) and \( P_{max} \) will presumably each be closer to 1. However, there are similar issues even when using the domain.
So when you say to have the user specify the starting values, would that be essentially the same as allowing them to select the \( P_{min} \) and \( P_{max} \) values? I think that could work for data that is not centered about the origin. I'm a little rusty in logistic modeling, but I vaguely remember maximum likelihood estimates having some connection here (?).

01-22-2014, 06:16 PM
Post: #16




RE: Logistic Fit
Yes, starting values should help to circumvent the "not central about the origin" issue. Coupling them with a technique like Levenberg-Marquardt (alternating between Newton and Steepest Descent) works for most data sets, although local minima might be encountered. I believe that is what is implemented in MODA, and also in the MRQ library for the 49/50.
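To illustrate the idea of user-supplied starting values combined with Levenberg-Marquardt, here is a small pure-Python sketch. It is not the MODA or MRQ code, and it is not what the Prime does. It assumes the three-parameter model L/(1+A*e^(-B*x)), uses a simple "multiply or divide the damping by 10" rule, and all function names (model, jac_row, lm_fit and so on) are hypothetical.

```python
import math
import random

def model(x, L, A, B):
    """Assumed three-parameter logistic curve L / (1 + A*e^(-B*x))."""
    return L / (1.0 + A * math.exp(-B * x))

def jac_row(x, L, A, B):
    """Partial derivatives of model() with respect to L, A and B."""
    e = math.exp(-B * x)
    d = 1.0 + A * e
    return (1.0 / d, -L * e / (d * d), L * A * x * e / (d * d))

def sse(xs, ys, p):
    """Sum of squared residuals for parameter vector p."""
    return sum((y - model(x, *p)) ** 2 for x, y in zip(xs, ys))

def solve(M, v):
    """Solve M*s = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    s = [0.0] * n
    for r in range(n - 1, -1, -1):
        s[r] = (aug[r][n] - sum(aug[r][k] * s[k] for k in range(r + 1, n))) / aug[r][r]
    return s

def lm_fit(xs, ys, p0, lam=1e-3, iters=200):
    """Levenberg-Marquardt from user-supplied starting values p0 = (L, A, B)."""
    p, err = list(p0), sse(xs, ys, p0)
    for _ in range(iters):
        JtJ = [[0.0] * 3 for _ in range(3)]
        Jtr = [0.0] * 3
        for x, y in zip(xs, ys):
            j, r = jac_row(x, *p), y - model(x, *p)
            for a in range(3):
                Jtr[a] += j[a] * r
                for b in range(3):
                    JtJ[a][b] += j[a] * j[b]
        for a in range(3):
            # Damping: small lam behaves like Gauss-Newton, large lam like
            # a short steepest-descent step.
            JtJ[a][a] += lam * (JtJ[a][a] + 1e-12)
        try:
            step = solve(JtJ, Jtr)
            trial = [pi + si for pi, si in zip(p, step)]
            trial_err = sse(xs, ys, trial)
        except (OverflowError, ZeroDivisionError):
            trial_err = float("inf")       # treat a wild step as rejected
        if trial_err < err:
            p, err, lam = trial, trial_err, max(lam * 0.1, 1e-9)
        else:
            lam *= 10.0
            if lam > 1e12:                 # no further progress possible
                break
    return p, err

rng = random.Random(1)
xs = [-10 + 0.1 * i for i in range(201)]
ys = [model(x, 2.0, 3.0, 0.5) + rng.uniform(-0.05, 0.05) for x in xs]
params, err = lm_fit(xs, ys, p0=(1.5, 1.0, 1.0))
print(params, err)
```

As noted above, a poor starting guess can still land in a local minimum, so the quality of p0 matters. The sketch only shows that the method itself is short enough to experiment with.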

