From a brilliant piece of detective work by Seth Morabito, plus
further experimental work by me today, I now know more about the
problem, all as set out below.
The malfunction is NOT caused by endless looping (but read on...).
KNOWN DEFECTIVE CALCULATORS
But first, here is the current list of serial numbers known thus far
to be prone to the aberrant behaviour described below:
- CNA 72100255
- CNA 72100299
- CNA 72101944
- CNA 72102148
- CNA 72102361
LATEST INFO ON THE PROBLEM
What happens is that something as-yet undiscovered about my program
code causes the calculator processor to seize control of the
calculator and to refuse absolutely to hand control back to the user
unless and until it encounters a program instruction that stops
execution. We don't know whether the problem is to do with some
special sequence of program steps that lets go completely, some set
of flag settings, something to do with the length of the program, or
what.
No keystroke combination or sequence has been found that will
interrupt the processor when this happens. It stops only when it
encounters an R/S or an INPUT instruction. I don't know about VIEW.
PSE does NOT make it let go.
The danger posed by the above is exactly what happened to me. If this
aberrant behaviour occurs whilst you are test-running a program for
debugging, and an error in the program code causes the processor to
enter an endless loop, then you are completely stuffed. Nothing but
nothing will interrupt it, short of a paper-clip reset in the hole at
the back, and this wipes all of the work, in all programs. Memory
cleared.
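For anyone who finds it easier to think in code, the behaviour described above can be pictured with a toy model in Python. This is only a sketch of the symptoms as reported here, not of how the firmware actually works: the function name run, the lockup flag, the interrupt_at parameter and the step names are all made up for illustration, and the step limit simply stands in for "you gave up and reached for the paper clip".

```python
def run(program, lockup, interrupt_at=None, max_steps=1000):
    """Toy model of the reported symptoms (hypothetical, not real firmware).

    program      -- list of steps: "R/S", "INPUT", "PSE", "STEP" or ("GTO", n)
    lockup       -- True if the calculator is in the aberrant condition
    interrupt_at -- step count at which the user presses a key, or None
    """
    pc = 0
    for step in range(max_steps):
        # In the normal condition a keypress interrupts the program.
        # In the lock-up condition keypresses are ignored.
        if not lockup and interrupt_at is not None and step >= interrupt_at:
            return "interrupted by user"
        op = program[pc]
        # Only R/S and INPUT have been seen to stop a locked-up run.
        if op in ("R/S", "INPUT"):
            return "stopped at " + op
        if isinstance(op, tuple) and op[0] == "GTO":
            pc = op[1]      # jump, possibly backwards (a loop)
        else:
            pc += 1         # PSE and ordinary steps just carry on
        if pc >= len(program):
            return "ran off end"
    return "still running (reset needed)"


endless = ["STEP", ("GTO", 0)]            # a bug: loops for ever
finishes = ["STEP", "PSE", "STEP", "R/S"]  # reaches a stop instruction

print(run(endless, lockup=False, interrupt_at=5))
print(run(endless, lockup=True, interrupt_at=5))
print(run(finishes, lockup=True, interrupt_at=1))
```

The point of the model is the second call: an endless loop combined with the lock-up condition leaves nothing that can end the run except a reset.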
PROGRAM CODE FOR TESTING
What my experiments today confirmed is that it isn't necessary to be stuck in a loop for the fault to occur. I have now emailed two detailed annotated commented M$Word-formatted program code listings to half a dozen forum subscribers whose email addresses I have and who have worked hard to help diagnose and solve this problem.
BEAM 8110 ver.03.doc
BEAM 8110 ver.050.doc.
Please, anyone who wants a copy, just post a request on this forum and someone will send it to you.
Version 03 is the buggy version with the endless loop. This causes
disastrous loss of all work, as explained above. Please stop using it (more on this below).
Version 05 is a debugged version that seems from an hour or two of
testing so far to work correctly every time, from start to finish. It
works. Subject to further testing, you can design the section size
and main flexural reinforcement for reinforced concrete beams with it
(if you know what you are doing!).
Here's the thing, though. When it runs, it takes about 14 seconds to complete its analysis. During this time the user loses all control of the calculator -- it cannot be stopped by any known means. But again, read on...!
Users will know that the screen normally displays "RUNNING" during
program execution. Well, when it enters this lock-up condition, this
message is absent from the display. That is a sure way of knowing
you've lost it and won't get it back unless and until it finds
something in the program to make it stop.
What Seth discovered, and I can confirm, is that, if the program is
restarted after entering data, not by pressing R/S straight away, but
by pressing the down-arrow key to execute single program steps a
couple of times, and then pressing R/S, then this bad misbehaviour
does not happen. I agree with Seth - this is really, really strange.
ONGOING TESTING STILL NEEDED
I recommend further testing should now continue, not with my program
version 03, but with version 05. Version 03 will make you lose all
the work if you let it enter the endless loop. Version 05 seems to be
free of endless loops (from tests so far), so this lets you try out
the problem more safely. Thus far (in maybe 50 tests), I have been
able to lose total control of the calculator without having to reset
and erase all work, because it runs until the program finishes, then
stops, displaying the results.
What to do:
Take data entry as far as the point where it asks for steel bar
diameters and then (with version 05):
- enter e.g. 25 R/S to see in reasonable safety whether your serial
  number loses all control, which you should get back after about 14
  seconds when the program finishes;
- enter 25 SINGLE-STEP SINGLE-STEP R/S (SINGLE-STEP is done by
  pressing the down-arrow). This demonstrates how the calculator
  stays under your control, allowing you to interrupt it at any time
  (i.e. correct behaviour).
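If you want to report your results in a consistent form, the two checks above boil down to two observations per run: was "RUNNING" shown, and could you interrupt the program? Here is a minimal Python sketch for turning those into a one-line report. The function names, the wording of the verdicts and the report layout are only my suggestion, and the serial number shown is a placeholder, not a real test result.

```python
def classify(running_shown, could_interrupt):
    """Turn the two observations from one test run into a verdict."""
    if not running_shown and not could_interrupt:
        return "LOCK-UP"            # the fault described in this post
    if running_shown and could_interrupt:
        return "normal"             # correct behaviour
    return "unclear, retest"        # mixed signs: worth repeating the run


def report_line(serial, keys, running_shown, could_interrupt):
    """Format one test run as a line suitable for posting."""
    verdict = classify(running_shown, could_interrupt)
    return f"{serial} | {keys} | {verdict}"


# Placeholder serial number and made-up observations, for layout only.
print(report_line("CNA XXXXXXXX", "25 R/S", False, False))
print(report_line("CNA XXXXXXXX", "25 SINGLE-STEP SINGLE-STEP R/S", True, True))
```

Posting both lines together for each calculator would make it easier to see which serial numbers are affected.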
Let me repeat, please DO NOT use version 03 of the program for any
more testing. Use version 05. This lets you replicate and test for the reported fault in reasonable safety. I say "reasonable" because I cannot guarantee total safety, naturally.
Seth has discovered a neat work-around but this is clearly not
satisfactory. First, it might not always work. Second, it is so easy to forget and enter R/S when testing a program without first
single-stepping through a couple of program steps. The result of
either of these is complete disaster if there happens to be an endless loop in unfinished program code.
With no possible means of saving programs except in volatile memory,
extremely robust firmware is fundamentally important. This defect
breaches this crucial principle.
Please, everyone, feel free to distribute these programs for testing, development and use by anyone who wants them.
I am getting some help, since yesterday (15-Oct-07) from :
"GT" wrote to me to ask how he/she can help and I am delighted to tell you all that he/she is very keen to investigate this and find a solution because "these experiences so far have been less than [HP's] intention". I have GT's kind permission to report the above and I will continue to liaise with GT by direct email. The fault is currently being investigated on GT's own HP35s (with a much higher serial number) and I am helping with this. The signs are that it might not have this fault.