|Re: Solving a linear system with physical meaning|
Message #12 Posted by Rodger Rosenbaum on 22 Jan 2008, 5:11 a.m.,
in response to message #11 by Nick
Rodger, this doesn't have to do with mathematically exact solutions that aren't possible physically, like for example particles with imaginary mass fulfilling special relativity and travelling at speeds greater than c.
It may not have to do with "particles with imaginary mass", but it most certainly has to do with mathematically exact solutions being unsatisfactory for reasons having to do with the physical aspects of the problem. That is what I intended when I started this thread, and I hinted at that in the subject line.
Let me give a concrete example that I also hinted at in the first post. Consider this freshman physics problem.
From a point on a platform, 10 feet above the ground, launch a steel ball upward with an initial velocity of 60 feet per second (perhaps at a slight angle so it doesn't fall back on your head!). How many seconds does it take until the ball reaches the ground? Taking the acceleration of gravity as 32.17 ft/s^2, the height at time t is 10 + 60*t - 16.085*t^2 feet; setting that to zero and rearranging, we get the following equation:
16.085*t^2 - 60*t - 10 = 0
Solving, we get two solutions, t=3.89 and t=-.1598
Do we really believe that the ball will hit the ground -.1598 seconds after we launch it, and again in 3.89 seconds?
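As a quick numerical check of the two roots, here is a minimal Python sketch. The function name landing_times and its parameter names are made up for illustration; the numbers are the ones from the problem statement above. It solves the quadratic and then keeps only the root that makes physical sense (t >= 0).

```python
import math

def landing_times(h0=10.0, v0=60.0, g=32.17):
    """Both roots of (g/2)*t^2 - v0*t - h0 = 0 (height above ground equal to zero)."""
    a, b, c = g / 2.0, -v0, -h0
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])

roots = landing_times()
# The mathematics gives two roots; only a non-negative time is meaningful here.
physical = [t for t in roots if t >= 0.0]

print("all roots:     ", roots)
print("physical roots:", physical)
```

The filter on t >= 0 is not part of the algebra; it is exactly the sort of physical consideration that has to be supplied from outside the equation.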
The problem I have given in this thread is similar in that there are numerous solutions that seem reasonable along with the one mathematically exact solution. We must look to the physical nature of the problem to help us select the appropriate solution.
Let me digress.
In the November 1970 issue of The American Mathematical Monthly, there is an article titled "Pitfalls in Computation, or Why a Math Book isn't Enough", by Professor G. E. Forsythe. He gives an example:
Here is a small linear system, A*x = B:
[[ .780  .563 ]     [[ x1 ]     [[ .217 ]
 [ .913  .659 ]]  *  [ x2 ]]  =  [ .254 ]]
Two solutions to the system are proposed, [ .999 -1.001 ]T and [ .341 -.087 ]T.
Which one is better? The usual check is to substitute them both into the original problem. Substituting [ .999 -1.001 ]T gives a residual ( A*x-B ) of [ -.001343 -.001572 ]T. Substituting [ .341 -.087 ]T gives a residual of [ -.000001 0.0 ]T. It seems clear that the second proposed solution is better than the first, since it makes the residual far smaller.
However, in fact the true solution is [ 1 -1 ]T as the reader can easily verify. Hence, the first proposed solution is far closer to the true solution than the second.
A persistent person may ask again: which solution is really better? Clearly, the answer must depend on one's criterion of goodness: a small residual, closeness to the (mathematically) true solution, or something else. Surely one will want different criteria for different problems.
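Forsythe's numbers are easy to reproduce. The following is a minimal Python sketch (the helper names residual and norm are mine, not Forsythe's) that computes, for each proposed solution, the norm of the residual A*x-B and the distance from the true solution [ 1 -1 ]T. It also prints the determinant of A, which I include as one plausible indicator of why the two criteria disagree so badly.

```python
import math

A = [[0.780, 0.563],
     [0.913, 0.659]]
B = [0.217, 0.254]
x_true = [1.0, -1.0]

def residual(x):
    """A*x - B, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) - b for row, b in zip(A, B)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

candidates = {"first": [0.999, -1.001], "second": [0.341, -0.087]}
report = {}
for name, x in candidates.items():
    res_norm = norm(residual(x))
    err_norm = norm([xi - ti for xi, ti in zip(x, x_true)])
    report[name] = (res_norm, err_norm)
    print(name, "residual norm:", res_norm, " distance from true:", err_norm)

# A tiny determinant means the two rows are nearly parallel.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print("det(A):", det)
```

The second candidate wins on residual and loses badly on distance from the true solution, so the ranking depends entirely on which criterion is chosen.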
The problem I gave has physical aspects that lead to certain choices for a solution. First of all, the meters are nearly identical. That leads to the notion that they should be weighted nearly equally. Also, I assumed that the noise (errors) is zero mean, uncorrelated between meters. This is a reasonable assumption, absent any evidence to the contrary. In a real situation, there might be reason to assume otherwise. If so, then it will have to be dealt with some other way, but for the purposes of this thread, that's the assumption.
Because the noise tends to be reduced by averaging processes, whereas the desired quantity (value of the current) isn't, we don't want to do what Jonathan Eisch suggested:
In other words: Why would you expect [.25 .25 .25 .25] when [1000 -999 0 0] is just as good?
The reason the weights [1000 -999 0 0] are not just as good is because those weights will greatly magnify the errors with respect to the desired current reading. Plus, of course, the readings from two of the meters are thrown away, losing any helpful information they may carry. The weights are mathematically equivalent if the current readings are identical for each meter, but that's not the case here.
And besides the purely physical consideration that we would like to include the readings from all 4 meters (otherwise why bother having 4 meters?), there is a mathematical reason:
In this problem, the sum of the weights in any of the solutions we've seen so far is nearly 1. It is well known that, given 4 numbers whose sum is 1, the root-mean-square value of the 4 is a minimum when they are all 4 equal. Since the additive noise in the readings is uncorrelated, a weighted sum of the 4 noise signals will be minimized when the weights are nearly equal.
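To put a number on the noise argument, here is a minimal Python sketch. It assumes, beyond what is stated above, that every meter has the same noise variance; under that assumption the standard deviation of a weighted sum of uncorrelated noise is the single-meter standard deviation times sqrt(sum of w_i^2). The function name noise_gain is just an illustrative label for that multiplier.

```python
import math

def noise_gain(weights):
    """Multiplier on the per-meter noise std-dev for a weighted sum of
    uncorrelated noise signals (assumes equal variance on every meter)."""
    return math.sqrt(sum(w * w for w in weights))

equal = [0.25, 0.25, 0.25, 0.25]
lopsided = [1000.0, -999.0, 0.0, 0.0]

gain_equal = noise_gain(equal)
gain_lopsided = noise_gain(lopsided)

# Both weight sets sum to 1, so both pass an error-free common reading through unchanged.
print("sum of weights:", sum(equal), sum(lopsided))
print("noise gain, equal weights:   ", gain_equal)
print("noise gain, [1000 -999 0 0]: ", gain_lopsided)
```

With equal weights the noise is halved; with [1000 -999 0 0] it is multiplied by more than a thousand, even though both sets of weights sum to 1.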
So, let's assume that we have determined that the weights should be nearly [ .25 .25 .25 .25 ]T, which are far from the weights the exact linear system solution gives.
It would appear that our criterion should be to judge the goodness of our solution by looking at the norm of the residuals. Calculate this on the HP50 by using the ABS function applied to (A*w-B).
Let's see if we can find the minimum residual norm solution where the weights are constrained to be equal. First, so that we have a basis for comparison, let's compute the residual norm with weights of [ .25 .25 .25 .25 ]T (using the A matrix I gave in the first post of this thread). I get .07818.
To calculate the residual norm where the weights are constrained to be equal, multiply A by [ 1 1 1 1 ]T and get [ 4.14 7.82 11.83 15.87 ]T. Put this on level 1 of the stack and put [ 1 2 3 4 ]T on level 2 of the stack; press LSQ and get [[ .252607010711 ]]. Using [ .252607 .252607 .252607 .252607 ]T as a solution vector, compute the residual norm. I get .0540137, a substantial improvement over .07818.
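For anyone who wants to check the equal-weight numbers away from the calculator, here is a minimal Python sketch. It uses only the row sums A*[ 1 1 1 1 ]T quoted above and takes the vector [ 1 2 3 4 ]T put on stack level 2 to be the right-hand side B; the variable names are mine. With a single unknown w, the least squares solution reduces to w = (a.b)/(a.a), which I am assuming is what LSQ returns for this one-column case.

```python
import math

a = [4.14, 7.82, 11.83, 15.87]   # A * [1 1 1 1]^T, as quoted above
b = [1.0, 2.0, 3.0, 4.0]         # vector placed on stack level 2 (taken to be B)

def residual_norm(w):
    """Norm of A*[w w w w]^T - B, which equals the norm of w*a - b."""
    return math.sqrt(sum((ai * w - bi) ** 2 for ai, bi in zip(a, b)))

# One-unknown least squares: minimize |w*a - b| over the scalar w.
w_ls = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)

norm_quarter = residual_norm(0.25)
norm_ls = residual_norm(w_ls)

print("w from least squares:", w_ls)
print("residual norm at .25:", norm_quarter)
print("residual norm at w:  ", norm_ls)
```

Because the weights are forced to be equal, the 4-column problem collapses to one column, so no matrix routine is needed for this particular check.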
It is possible to do even better if the individual weights are not constrained to be equal.
So far, nobody has given much in the way of numerical solutions. There have been discussions of various procedures, but no examples to show if they give a "good" result, with "good" meaning the sort of thing I've discussed.
Nonetheless, as already said, there is also absolutely nothing wrong with negative weights, should the non-restricted model be considered good enough.
There are several things wrong with negative weights, and I hope I've explained why.
Now, please use your calculators to find some weights with even lower residual norm, but close to [.25 .25 .25 .25 ]T, and tell us how you did it.
But be aware that there is a trade-off. If you search for a global minimum of the residual norm without constraints, the search takes you to the exact linear system solution, with a residual norm of zero, and those weights are far from [ .25 .25 .25 .25 ]T. It would seem that if you allow the individual weights to vary, you should be able to get a smaller residual norm, with the norm getting smaller as you approach the exact linear system solution.
What I'm looking for from all you who are interested is a procedure to find a set of weights with better residual norm than the one where the weights are constrained to be equal. And, try to do it with linear algebra techniques.