Crawling amoebas

I just finished implementing the Nelder-Mead algorithm for finding the minima of functions. This is a fairly robust algorithm that operates in n dimensions. It requires no information on derivatives, and can efficiently crawl through valleys and the eyes of needles. How? Simulate an idealized amoeba!

Say you have a function f : \mathbb{R}^n \rightarrow \mathbb{R} which you want to minimize. Construct an n-dimensional simplex (the simplex is the higher dimensional analog of the triangle or tetrahedron). Now let the simplex, consisting of points S = \{ x_i, i = 1, \dots, n+1 \} crawl around \mathbb{R}^n.

For each iteration, let x_w be the worst of the \{ x_i \} (in the sense that it has the largest value of f), x_s be the second worst, and x_b be the best. Also, let S' = S - \{ x_w \}.

We will also need the vector which describes reflecting one point through the hyperplane formed by the rest of the points of the simplex. This is given by r = \frac{1}{n-1} \sum_{x_i \in S'} (x_i - x_w), the vector to the center of mass of that hyperplane.

We make the simplex crawl according to the following algorithm:

  1. If f(x_w + 2r) < f(x_b), try going twice as far beyond the simplex: x_w + 3r, then replace x_w with whichever gives of a lower value of f: x_w + 2r or x_w + 3r.
  2. Otherwise, if f(x_w + 2r) < f(x_s), then we just replace x_w with x_w + 2r (the reflection isn’t promising enough to try the extension).
  3. At this point life is looking bad. Try just pulling x_w in towards the rest of the simplex: if f(x_w + \frac{1}{2}r) < f(x_w), replace x_w with x_w + \frac{1}{2}r.
  4. If even that isn’t an improvement, we try to go to a smaller length scale (it may be that we’re hovering over a narrow valley, or even across several valleys, and are being confused by this). To do this, we contract the simplex around x_b. Take S' - \{ x_b \}, add to it whichever is better of x_w and x_w + 2r (remember that we only tested up to the point where x_w + 2r was better than x_s). Then we start again with the new simplex S” = \{ x_b + \frac{1}{2} ( x_i - x_b ) \} where the \{ x_i \} are drawn from (S' - \{ x_b \} ) \cup the better of x_w + 2r and x_w.

Wikipedia has a very nice animated gif of a simplex crawling according to these rules.

I have one serious problem with the simplex algorithm: I have no idea to prove when or if it will work. There are modifications of it to try to improve its reliability, particularly in high dimensions, but I know of no theoretical examination of these properties.

Leave a comment