HP Forums

Nelder Mead Optimization of a function of 2 variables.
References: https://en.wikipedia.org/wiki/Nelder–Mead_method
Application:
Precondition: The function that is to be optimized is in the subroutine “FU”.’
Input of FU is value of variable X in the X register and of variable Y in the
Y-register.
Output of FU is f(X,Y) in the X-register.
Note: NMO searches a minimum of f(X,Y). If a maximum has to be
searched take the negative of f(X,Y) as output of FU.
Program uses registers 00 and 05 until 19.

Input: Termination criterion: (a small number, e.g. .0001) in register 00.
Original length of search steps (the more shallow the function is the larger, e.g. .2 to .5): Register Z
Starting point for variable Y: Register Y
Starting point of variable X: Register X

Output is on the stack:
Z-register: Function value f(X,Y) at minimum
Y-register: Y
X-register: X

Example: Use Function FU as listed below. This function is a banana-shaped function that is used in
literature as a difficult to optimize function
f(X,Y) = 100*(X²-Y)²+(1-X)²
The minimum is at X=Y=1 and f(1,1)=0.
We start with a first estimate of X=-2, Y=2 and a step width of 0.2.
The termination criterion is .0001.
Input is as follows:
.0001 STO 00
.2 ENTER
2 ENTER
-2
CATALOG – PGM – NMO
Output:
Z: 1.9662E-10  Estimate of function at minimum
Y: 1.0000  Estimate of Y at minimum
X: 1.0000  Estimate of X at minimum

00 { 274-Byte Prgm }
01▸LBL "NMO"
02 STO 05
03 STO 08
04 STO 11
05 R↓
06 STO 06
07 STO 09
08 STO 12
09 R↓
10 STO+ 08
11 2
12 ÷
13 STO+ 11
14 3
15 SQRT
16 ×
17 STO+ 12
18 RCL 06
19 RCL 05
20 XEQ "FU"
21 STO 07
22 XEQ 31
23 XEQ 11
24▸LBL 01
25 RCL 05
26 RCL 08
27 +
28 2
29 ÷
30 STO 14
31 RCL 06
32 RCL 09
33 +
34 2
35 ÷
36 STO 15
37 XEQ 21
38 RCL 05
39 RCL 08
40 -
41 X↑2
42 RCL 06
43 RCL 09
44 -
45 X↑2
46 +
47 SQRT
48 RCL 05
49 RCL 11
50 -
51 X↑2
52 RCL 06
53 RCL 12
54 -
55 X↑2
56 +
57 SQRT
58 +
59 2
60 ÷
61 RCL 00
62 X<Y?
63 GTO 01
64 RCL 07
65 RCL 06
66 RCL 05
67 RTN
68▸LBL 11
69 RCL 10
70 RCL 07
71 X≤Y?
72 GTO 12
73 STO 10
74 X<>Y
75 STO 07
76 RCL 05
77 RCL 08
78 STO 05
79 X<>Y
80 STO 08
81 RCL 06
82 RCL 09
83 STO 06
84 X<>Y
85 STO 09
86▸LBL 12
87 RCL 13
88 RCL 10
89 X≤Y?
90 GTO 13
91 STO 13
92 X<>Y
93 STO 10
94 RCL 08
95 RCL 11
96 STO 08
97 X<>Y
98 STO 11
99 RCL 09
100 RCL 12
101 STO 09
102 X<>Y
103 STO 12
104 GTO 11
105▸LBL 13
106 RTN
107▸LBL 21
108 1
109 STO 19
110 XEQ 24
111 RCL 13
112 X<Y?
113 GTO 23
114 RCL 18
115 RCL 10
116 X>Y?
117 GTO 22
118 R↓
119 RCL 07
120 X≤Y?
121 GTO 22
122 2
123 STO 19
124▸LBL 22
125 RCL 16
126 STO 11
127 RCL 17
128 STO 12
129 RCL 18
130 STO 13
131 XEQ 11
132 RTN
133▸LBL 23
134 0.5
135 STO 19
136 XEQ 24
137 RCL 13
138 X>Y?
139 GTO 22
140 2
141 STO÷ 08
142 STO÷ 09
143 STO÷ 11
144 STO÷ 12
145 RCL 05
146 2
147 ÷
148 STO+ 08
149 STO+ 11
150 RCL 06
151 2
152 ÷
153 STO+ 09
154 STO+ 12
155 XEQ 31
156 XEQ 11
157 RTN
158▸LBL 24
159 RCL 15
160 RCL 12
161 -
162 RCL 19
163 ×
164 RCL 15
165 +
166 STO 17
167 RCL 14
168 RCL 11
169 -
170 RCL 19
171 ×
172 RCL 14
173 +
174 STO 16
175 XEQ "FU"
176 STO 18
177 RTN
178▸LBL 31
179 RCL 09
180 RCL 08
181 XEQ "FU"
182 STO 10
183 RCL 12
184 RCL 11
185 XEQ "FU"
186 STO 13
187 RTN
188 END

00 { 24-Byte Prgm }
01▸LBL "FU"
02 STO 01
03 X↑2
04 X<>Y
05 STO 02
06 -
07 X↑2
08 100
09 ×
10 1
11 RCL 01
12 -
13 X↑2
14 +
15 RTN
16 END

Very nice, thanks!

Could you please give us a few comments about the code:
- expand, contract, shrink (etc) functions (ie LBLs in the code?)
- use of the registers (simplex coordinates, number of iterations)
There should also be a way for DM42 users to draw the simplex evolutions, or at least print the steps used.

Regards,
Thibault

Hi,

Thank you very much for your interest.

First some explanation concerning the code:

Use of registers:
00 --> Minimum termination criterion
19 --> Factor for reflection

______Simplex___*)_new point
x_____05_08_11_14_16
y_____06_09_12_15_17
f(x,y)__07_10_13____18

*) center for reflection: Mean of best and second best value

Values of simplex are sorted so that 07 is the minimum and 13 the maximum

The code:
Steps 2 to 17: Building the simplex
18 - 22: Computing function values
23: Sorting the simplex so that 05, 06, 07 is the minimum 11, 12, 13 the maximum
25 - 36: computing the center for reflection
37: Optimization explanation see later
38 - 60: computing of termination criterion
This is a change to the original Nelder Mead method, where the termination criterion is the differences in function values f(x,y). I thought it is better to take the differences of the x and y values instead.
61 - 63: Comparison of termination criterion with minimum termination criterion. Continue process if not yet reached.
64 - 67: final steps and end of program
68 - 106: Sorting the points of the simplex.
107 - 157: Optimization procedure
108 - 109: Sets reflection factor to standard
110: Subroutine 24 does the reflection. New x an y are stored in 16 and 17 and new function value is in 18
111 - 113: If the new value is worse than the worst of the simplex goto 23 line 133ff (explanation see later)
114 - 121: If new value is better than the worst but not better than the best the new value replaces the worst value and the simplex is sorted and the process continued (line 124-132)
122 - 123: If the new value is better than the old best value reflection factor is doubled and process continues with replacing the worst value by the new one (line 124-132)
133 ff is the routine when the new value is the worst.
134 -139: Reflection factor is halved and a new try with the new reflection factor is done. If the result is better than the worst value of the simplex the new value replaces the old worst value of the simplex (124 - 132)
140 - 157: If the new value is worse than the worst value of the simplex the size of the simplex is halved, the function values of the new simplex points is computed and the simplex is sorted. With the new simplex a new try is started.
158 - 177: The reflection of the worst value at the mean of the two best using the reflection factor in Reg 19 and computing the new function value. Results are stored in 16, 17 and 18.
178 - 187: Computing the function values of the second best and worst point of the simplex.

Concerning printing:
This is difficult because there are changes at many points of the code as shown. If you restrict to complete optimization steps you could add printing commands after step 37.

I hope this helps a bit.

Best regards

Raimund Wildner

Yes it helps, vielen dank! (Thanks a lot)

I added a counter line 37 before the call of the optimisation procedure, it’s quite funny to see that 2140 iterations are needed for a .0001 precision, but “only” 2191 for a precision of 10^-9

About you criterion calculation: so you use the distances between the original points of the simplex and not their image through FU.
It must be faster, but I’m really wondering about the consequences of this method, especially for non derivable functions.

Another question: you don’t care about the domain of FU? Inserting for example a square root in FU should create random comportment with complex values, don’t you think?

Regards,
Thibault

Hi,

concerning the termination criterion: I do not think that it is faster to use the differences in the X and Y than in the f(X,Y). Near the optimum most derivable functions are rather flat with a derivate well below 1, so that the differences in the function are smaller than the differences in the variables. So it should be slower. But I thought if you set a threshold you want to set the exactness of your input factors.

Concerning the domain of FU you are right. If the square route would be negative the procedure would stop. I put not enough care on the function which was just for demonstration purposes.

Concerning not derivable functions: These are a problem to all optimization functions. It can be that the simplex does not get along the valley and the size of the simplex is shrinked to virtually zero. But in this case both termination criteria would stop the procedure. A possible way to handle such cases is to start the procedure again at the solution reached so far.

Best

Raimund

Maybe the best way to proceed with the domain of FU is to let the function handle the out of bounds values, ie by returning 9E99, then the NMO routine keeps simple and will avoid those regions.

I should try this.

I also tried (x+1)^2+(y+1)^2, it worked perfectly!

Hi,

I once more thought about the square roots and I do not see that there can be a problem. In the function FU there is no square root. And concerning the computation of the termination criterion in line 47 and 57 these are square roots of squares of differences of real numbers which cannot be negative. So there is no need to work with penalty functions etc.

Best

Raimund

I hope you’re not bothered by my questions, your program is really interesting and I am investigating the N-M optimization, thanks to you.
I used it for the following formula for FU:

Code:

-3(x+3)^2+5x^3+1/2*(x-1)^4+1/2*y^2

See graphics:
[attachment=8396]

Gives nice results when starting from (-3,0):
With original length set to .2 the simplex gets stuck in the wrong valley, and the result (-3.811,0) is a local minimum.
If original length is set to .5 then the steps are bigger enough for the simplex to « find » the right valley in (1.312,0).

Then I entered a small pitfall in FU:

Code:

-3(x+3)^2+5x^3+1/2*(x-1)^4+x*sqrt(x+5)+1/2*y^2

See graphics:
[attachment=8397]

As you guess I had to insert a simple rule to avoid values of x below -5, giving 50 as a result (dummy management of out of function’s domain values).
With length of .2 or .5 starting in (-3,0) I get stuck in the wrong valley (-3.778,0).
Using length of 1 and starting from (-2,0) I find the right minimum. I will investigate more, but I think it is the good way to manage the domain in a tricky way.
Using length of 2 and starting from -3 I suppose the simplex gets stuck on the Z=50 wall, missing the exit...

Once again thanks for this interesting program.
Regards,
Thibault

Me again!

I changed the behavior of my FU, creating a gradient for values of x<(-5)
It returns 10*(-x).
Starting from (-30,0) with length of .5, 1 or 2 the simplex stays in the wrong valley (-3.778,0).
With length of 50 (zum beispiel), the right valley is found (1.227,0).
Your algorithm is really well built.

Regards.

Hi Thibault,

first: I like the discussion and I am not bothered at all.

Your experience with the local minimum shows a problem that comes with all numerical optimization procedures. That is the reason why it is recommended to start the procedure from several starting points.

If you are interested in numerical optimization procedures I can recommend the book by Ulrich Hoffmann and Hanns Hofmann "Einführung in die Optimierung mit Anwendungsbeispielen aus dem Chemie-Ingenieur-Wesen", Weinheim / Bergstr. 1971. It is in German but I got the impression that you understand German.

Best

Raimund

Sure I could order some food in Germany, or maybe talk with someone in the street, but no and I regret I could not read such a book! And I did not find any translation. Anyway thanks for the advice, I could read some other books to improve my knowledge about optimization, it has not been in my current matters since 20 years (and I was working with experts, I was not one of them).

Anyway, playing with N-M-O is funny because 1) you have to find an interesting function z=f(x,y) that shows particularities, and 2) you have to search for initial values and to try several times, hoping, like always in numerical optimization, that you will not miss the minimum.

Regards.

rawi

pinkman

rawi

pinkman

rawi

pinkman

rawi

pinkman

pinkman

rawi

pinkman